Looking for breakthrough ideas for innovation challenges? Try Patsnap Eureka!

Key value separation storage method and system for delayed garbage collection based on LSM tree

A separate storage and LSM tree technology, applied in database distribution/replication, special data processing applications, instruments, etc., can solve the problems of low frequency of LSM tree merging, statistical information lag, consumption performance, etc., to reduce query delay and optimize the overall Efficiency, reducing the effect of small resources

Pending Publication Date: 2021-11-09
上海沄熹科技有限公司
View PDF0 Cites 4 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

BadgerDB is suitable for scenarios with large values, and performs poorly for scenarios with small values
After adopting the key-value separation strategy, the sequential search becomes a random read of the disk, which limits the read performance of BadgerDB to a certain extent.
At the same time, after the key value is separated, the deleted value needs to be cleaned up separately. During the cleaning process, the LSM tree needs to be queried according to the key to determine whether to retain the value. Therefore, BadgerDB's garbage collection (Garbage Collection, GC for short) consumes a lot of performance
The GC process of BadgerDB depends to a certain extent on the statistics of the deleted data in the valueLog file during the LSM tree merging process. In the scenario where the key value is large (64KB or more), the LSM tree merging frequency is low (if the merging frequency is high, it will cause higher write amplification) causes the statistical information of deleted data in the valueLog file to lag behind, which in turn affects the efficiency of BadgerDBGC;

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
View more

Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Key value separation storage method and system for delayed garbage collection based on LSM tree
  • Key value separation storage method and system for delayed garbage collection based on LSM tree
  • Key value separation storage method and system for delayed garbage collection based on LSM tree

Examples

Experimental program
Comparison scheme
Effect test

Embodiment 1

[0037] A key-value separation storage method based on delayed garbage collection of an LSM tree, the specific steps of the method are as follows:

[0038] The upper layer application of S1 encapsulates key-value data into key-value entries, and writes key-value entries into key-value files according to logical blocks for persistent storage;

[0039] S2 queries the jump table in the memory and the sorted string table at level 0 in the index module;

[0040] The upper layer application of S3 calls the garbage collection interface of the key-value storage system for separate storage;

[0041] The writing process of the key-value storage system in the method embodiment of the present invention is as follows: figure 1 As shown, according to the S1 upper-layer application calling the write operation interface of the key-value storage system, the written key-value data is encapsulated into a key-value entry, which contains the index key, value, and metadata that may be queried. The...

Embodiment 2

[0063] A key-value separation storage system based on LSM tree-based delayed garbage collection, the system specifically includes an encapsulation storage module, a query sorting module and a separation storage module:

[0064] Encapsulation storage module: the upper layer application encapsulates key-value data into key-value entries, and writes key-value entries into key-value files according to logical blocks for persistent storage;

[0065] Query sorting module: query the jump table in memory and the sorting string table at layer 0 in the index module;

[0066] Separate storage module: the upper application calls the garbage collection interface of the key-value storage system for separate storage;

[0067] Further, the query sorting module specifically includes a first query module and a second query module:

[0068] The first query module: directly returns the required metadata information after querying the corresponding index entry;

[0069] The second query module: ...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

PUM

No PUM Login to View More

Abstract

The invention discloses a key value separation storage method and system for delayed garbage collection based on an LSM tree, and belongs to the field of computer storage. The method comprises the following specific steps: S1, packaging key value data into key value entries by an upper-layer application, and writing the key value entries into a key value file according to logic blocks for persistent storage; S2, querying a skip list in a memory and a 0-layer sorting character string list in an index module; and S3, enabling the upper-layer application to call a garbage collection interface of the key value storage system for separated storage. According to the method and system, the LSM tree is optimized by adopting a key value separation idea, the query delay is reduced by virtue of an index module, the garbage collection efficiency is improved by adopting a statistical preposition means, and unnecessary resources in a service peak period are reduced by utilizing a delayed garbage collection mechanism; the integrity of the key value storage system in the scene of frequent random reading in the processes of batch writing, rapid deletion and transient storage is improved, and the overall efficiency of the consensus module of the distributed storage system is optimized.

Description

technical field [0001] The invention discloses a key-value separation storage method and system based on delayed garbage collection of an LSM tree, and relates to the technical field of computer storage. Background technique [0002] As a kind of IT basic software for storing data, the traditional stand-alone database is increasingly limited by poor scalability, smooth performance curve and high investment in the era of massive data, and distributed databases emerge as the times require. Distributed databases break the performance bottleneck of traditional stand-alone databases, can be deployed on some relatively cheap machines, and have linear scalability in theory, completely eliminating users' concerns that databases cannot support rapid business development. Distributed databases use computer networks to connect multiple computing and storage units scattered in physical space into a logically integrated database system, which has dispersion in physical space, integrity i...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

Application Information

Patent Timeline
no application Login to View More
IPC IPC(8): G06F16/22G06F16/215G06F16/27G06F16/2453
CPCG06F16/2246G06F16/215G06F16/27G06F16/2453G06F16/2282
Inventor 李涛管延信王瀚墨
Owner 上海沄熹科技有限公司
Who we serve
  • R&D Engineer
  • R&D Manager
  • IP Professional
Why Patsnap Eureka
  • Industry Leading Data Capabilities
  • Powerful AI technology
  • Patent DNA Extraction
Social media
Patsnap Eureka Blog
Learn More
PatSnap group products