A write amplification optimization method based on flow control for tree-like storage structure

A flow-control and tree-like storage technology, applied in hardware monitoring, input/output to record carriers, and related fields. It addresses large write amplification caused by excessive component data volume, reducing write amplification, improving overall performance, and increasing write throughput.

Active Publication Date: 2018-05-22
INST OF INFORMATION ENG CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

In the existing LSM-Tree structure, the data volume of a component often grows too large. When such a component then takes part in a merge as the lower-level component, it causes large write amplification.



Examples


Example 1

[0033] Example 1 LSM-Tree write amplification optimization method based on flow control

[0034] The present invention is implemented and tested based on RocksDB. The specific implementation method is as follows:

[0035] 1) Feedback stage

[0036] In the feedback stage, before each RocksDB merge operation, each component counts its current data volume and reports it to the listener. Each component has metadata that records its basic information, such as the number of files, data volume, and key range, so the current data volume can be read directly from the metadata.

[0037] 2) Calculation stage

[0038] The calculation stage decides whether flow control is required and, if so, calculates its strength. Once the data volume of each component has been obtained in the feedback stage, each volume is compared against that component's threshold. If the data volume of all components does not exceed the corresponding threshold, no flow c...
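
The threshold check in the calculation stage can be sketched as follows. This is a minimal illustration and not the patent's or RocksDB's code: the function name `components_over_threshold`, the component numbering, and the byte values are all hypothetical, and the data volumes are assumed to have been collected already in the feedback stage.

```python
from typing import Dict, List


def components_over_threshold(volumes: Dict[int, int],
                              thresholds: Dict[int, int]) -> List[int]:
    """Return ids of components whose data volume exceeds their threshold.

    Both dictionaries map a (hypothetical) component id to a size in bytes.
    An empty result means that no flow control is required.
    """
    return sorted(cid for cid, size in volumes.items()
                  if size > thresholds[cid])


# Illustrative numbers only (bytes); they are not taken from the source.
volumes = {1: 200, 2: 3500, 3: 9000}
thresholds = {1: 256, 2: 2560, 3: 25600}
exceeded = components_over_threshold(volumes, thresholds)
needs_flow_control = bool(exceeded)
```

With these sample values only component 2 is over its threshold, so flow control would be triggered.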

Example 2

[0041] Example 2 LSM-Tree write amplification optimization method based on flow control

[0042] When the present invention is applied, it is possible to monitor the data volume of only one or several components, depending on the actual situation. Here the second component, hereinafter referred to as C_2, is monitored:

[0043] 1) Feedback stage

[0044] Before each RocksDB merge operation, the data volume of the C_2 component is counted. This is done by traversing the metadata of the C_2 component, reading the size of each file in C_2, and adding the sizes up to obtain the current total data volume of the C_2 component.
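
The summation over the metadata of C_2 can be sketched as below. This is an assumption-laden illustration: `FileMeta` and `component_data_volume` are hypothetical names, not RocksDB types, and the file names and sizes are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileMeta:
    """Hypothetical per-file metadata entry: file name and size in bytes."""
    name: str
    size_bytes: int


def component_data_volume(files: List[FileMeta]) -> int:
    """Sum the sizes of all files recorded in a component's metadata."""
    return sum(f.size_bytes for f in files)


# Illustrative metadata for C_2; names and sizes are invented for the example.
c2_files = [
    FileMeta("000012.sst", 64),
    FileMeta("000015.sst", 32),
    FileMeta("000019.sst", 48),
]
c2_volume = component_data_volume(c2_files)
```

Because the sizes are already stored in metadata, the feedback step only reads bookkeeping data and does not touch the files themselves.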

[0045] 2) Calculation stage

[0046] After the data volume of C_2 is obtained through feedback, it is judged whether that data volume exceeds the threshold of C_2. If the threshold is not exceeded, no flow control is performed; otherwise, flow control is required, and the flow control strength is calculated by the following formula: flow control strength N = 0.5n^2 + 0.5n, where n=C...
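
The strength formula N = 0.5n^2 + 0.5n can be evaluated as in the sketch below. The definition of n is cut off in this text, so the function simply takes n as an input and makes no claim about how it is derived; the function name is hypothetical.

```python
def flow_control_strength(n: float) -> float:
    """Evaluate N = 0.5*n^2 + 0.5*n, the formula quoted in Example 2.

    How n is obtained is truncated in the source, so it is left to the caller.
    """
    return 0.5 * n * n + 0.5 * n


# For positive integer n the formula equals n*(n+1)/2, i.e. 1 + 2 + ... + n.
strengths = [flow_control_strength(n) for n in (1, 2, 3)]
```

The quadratic term means the strength grows faster than n itself, so larger values of n are throttled disproportionately harder.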



Abstract

The invention discloses a flow control-based write amplification optimization method for a tree-like storage structure. The method: 1) a flow listener is set in the storage system, wherein the disk space and the memory space of the storage system adopt a tree-like storage structure for data storage; 2) based on the current amount of data, it is determined whether the disk space needs flow control; if flow control is needed, the current data amount of the components selected for monitoring is used to calculate a flow control strength; 3) the flow listener calculates an extension time according to the flow control strength, and then extends the inter-arrival time of write requests by that extension time. The present invention can effectively reduce write amplification and improve the overall performance of the LSM-Tree; optimizing the LSM-Tree with the method provided by the present invention can increase its overall write throughput by more than 30%.
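
Step 3 of the abstract, extending the inter-arrival time of write requests, might look like the following sketch. It rests on assumptions that this text does not confirm: the class `WriteThrottle` is hypothetical, and the linear mapping from flow control strength to extension time (strength times `unit_delay_s`) is chosen only for illustration.

```python
import time
from typing import Callable, List


class WriteThrottle:
    """Hypothetical listener-side throttle that stretches the gap between writes."""

    def __init__(self, unit_delay_s: float = 0.001,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.unit_delay_s = unit_delay_s  # assumed delay per unit of strength
        self.strength = 0.0               # 0 means no flow control
        self._sleep = sleep               # injectable so the delay can be observed

    def set_strength(self, strength: float) -> None:
        self.strength = max(0.0, strength)

    def extension_time(self) -> float:
        # Assumption: the extension time grows linearly with the strength.
        return self.strength * self.unit_delay_s

    def submit(self, write: Callable[[], None]) -> None:
        delay = self.extension_time()
        if delay > 0:
            self._sleep(delay)  # extend the inter-arrival time of this write
        write()


# Usage: record the delays instead of really sleeping.
slept: List[float] = []
log: List[str] = []
throttle = WriteThrottle(unit_delay_s=0.5, sleep=slept.append)
throttle.submit(lambda: log.append("a"))  # no flow control yet, no delay
throttle.set_strength(3.0)
throttle.submit(lambda: log.append("b"))  # delayed by 3.0 * 0.5 seconds
```

Delaying incoming writes gives the merge operations time to drain an oversized component before more data is pushed down into it.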

Description

Technical field

[0001] The invention belongs to the technical field of computer software and relates to a flow control-based write amplification optimization method for the tree-like storage structure LSM-Tree.

Background

[0002] The LSM-Tree is a multi-component tree storage structure. Generally speaking, an LSM-Tree consists of memory space and disk space. Data is first cached in the memory space, and when the memory space reaches a certain threshold, the data in memory is flushed to the disk space in batches. The disk space is composed of multiple layers of components; each layer has a threshold on the amount of data it stores, and the threshold increases exponentially from top to bottom. Data that has just been flushed from memory to disk is stored in the upper-level component first. When that component reaches its threshold, its data is merged into the lower-level component through a merge operation.

[0003] Write ampl...
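
The exponentially increasing per-layer thresholds described in [0002] can be illustrated with a short sketch. The base size, the multiplier, and the number of layers are assumed values picked for the example; the source does not state them.

```python
from typing import List


def level_thresholds(base_bytes: int, multiplier: int, levels: int) -> List[int]:
    """Threshold of the i-th disk component = base_bytes * multiplier**i."""
    return [base_bytes * multiplier ** i for i in range(levels)]


# Assumed illustrative parameters: 256-byte top layer, each layer 10x larger.
thresholds = level_thresholds(base_bytes=256, multiplier=10, levels=4)
```

Each lower layer can hold a fixed multiple of the layer above it, which is why an oversized upper component forces a disproportionately large rewrite when it is merged downward.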


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F3/06, G06F11/30
Inventors: 岳银亮, 李宇哲, 王伟平
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI