Looking for breakthrough ideas for innovation challenges? Try Patsnap Eureka!

LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system

An optimization method and indexing technology, applied in the field of data processing, can solve problems such as excessive operations, reduced database throughput, and high consumption, and achieve the effects of reducing CPU consumption, avoiding memory overhead, and avoiding performance degradation

Active Publication Date: 2015-07-29
BAIDU ONLINE NETWORK TECH (BEIJIBG) CO LTD
View PDF5 Cites 32 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

[0004] The inventor found through research that for the scenario where the same Key data item is frequently updated in the LSM-tree index structure, the existing processing mechanism will cause excessive CPU (Central Processing Unit, central processing unit) consumption and reduce the external throughput of the database. , I / O (input / output, input and output) too many operations and technical problems such as the introduction of memory cache

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
View more

Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system
  • LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system
  • LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system

Examples

Experimental program
Comparison scheme
Effect test

no. 1 example

[0034] figure 2 It is a flow chart of an LSM-tree index optimization method provided in the first embodiment of the present invention. The method of this embodiment can be executed by an LSM-tree index optimization device, which can be implemented by means of hardware and / or software Realized, and generally can be integrated in a server for storing massive LSM-tree index structure data, wherein, the server can be a single server, or a cluster server composed of multiple servers. Since the embodiment of the present invention has an obvious optimization effect for the scenario where the same Key data item is frequently updated, the server may preferably be a search engine server.

[0035] The method of this embodiment specifically includes:

[0036] 210. Store the data written in the memory according to the LSM-tree memory index structure.

[0037]In the traditional LSM-tree index structure, the data in the memory is usually stored in the form of a balanced tree or skip table...

no. 2 example

[0056] image 3 It is a flowchart of an LSM-tree index optimization method according to the second embodiment of the present invention. This embodiment is optimized on the basis of the above embodiments. In this embodiment, the data written in the memory is stored according to the LSM-tree memory index structure. The specific optimization is: when there is data written into the memory, the The data is stored in the bottom layer of the LSM-tree memory index structure, and the bottom layer is used as the current operation layer; according to the position of the current operation layer in the LSM-tree memory index structure, the acquisition and the current The merge threshold condition corresponding to the operation layer; when the data stored in the current operation layer meets the merge threshold condition, according to the second merge algorithm, the data stored in the current operation layer is merged; the merge result is stored in In the layer above the current operation l...

no. 3 example

[0074] Figure 4 It is a flow chart of an LSM-tree index optimization method according to the third embodiment of the present invention. This embodiment is optimized based on the above embodiments. In this embodiment, the LSM-tree memory index structure is specifically optimized to include three layers, which are respectively the zeroth layer, the first layer and the second layer from bottom to top. ;

[0075] Specifically optimizing the merging threshold condition of the zeroth layer is that the size of the data stored in the zeroth layer is greater than the first data threshold;

[0076] Specifically optimizing the merging threshold condition of the first layer is that the number of data blocks stored in the first layer is greater than the data block threshold.

[0077] Correspondingly, the method in this embodiment specifically includes:

[0078] 410. When data is written into the memory, store the data in the zeroth layer of the LSM-tree memory index structure.

[0079...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

PUM

No PUM Login to View More

Abstract

The embodiment of the invention discloses an LSM-tree (The Log-Structured Merge-Tree) index optimization method and an LSM-tree index optimization system. The method includes the following steps: data written into a memory are stored according to an LSM-tree memory index structure; when the memory data which are stored on the basis of the memory index structure meet a writing threshold condition, according to a first merging algorithm, the memory data are merged; the merged memory data are written into a disk file according to an LSM-tree disk index structure. The technical scheme of the invention solves the technical problems of excessive CPU (central processing unit) consumption, decrease in the outward throughput of a database, excessive I / O (input / output) operation, memory cache usage and the like exiting in the prior art which are caused in the scenario that same Key data items are updated frequently, consequently, CPU consumption is reduced remarkably, the outward throughput of the database is increased, and moreover, the problems of extra memory overhead, performance degradation and the like caused by the usage of a memory cache are prevented.

Description

technical field [0001] The embodiments of the present invention relate to data processing technology, and in particular to a method and device for optimizing an LSM-tree index. Background technique [0002] LSM-tree (The Log-Structured Merge-Tree, log-structured merge tree) is the underlying index structure of many modern databases. The basic idea of ​​this index structure is to store the data change operation in the memory first, and when the data in the memory After the scale reaches a threshold, the data in the memory is written to the disk in batches and in an orderly manner, and the newly placed data and the old data are rolled and merged through a certain layered sorting algorithm. Wherein, in the LSM-tree index structure, data is stored in the form of key-value (Key-Value) pairs. Each key (Key) stores a corresponding value (Value), and the corresponding data value (also called a data item) can be found through the key name, and then certain operations can be performe...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

Application Information

Patent Timeline
no application Login to View More
IPC IPC(8): G06F17/30
Inventor 孙君意覃安颜世光李康徐佩林
Owner BAIDU ONLINE NETWORK TECH (BEIJIBG) CO LTD
Who we serve
  • R&D Engineer
  • R&D Manager
  • IP Professional
Why Patsnap Eureka
  • Industry Leading Data Capabilities
  • Powerful AI technology
  • Patent DNA Extraction
Social media
Patsnap Eureka Blog
Learn More
PatSnap group products