A Differential Caching Method for Online Primary Storage Deduplication

A differentiated caching technique for primary storage deduplication, applied in the field of storage systems. It addresses the large space occupied by fingerprints in the cache, which increases shadow-cache overhead and reduces the efficiency of LIRS and ARC caches. By raising the fingerprint swap-out probability for some data streams and lowering it for others, it improves the cache hit rate.

Active Publication Date: 2022-07-19
NAT UNIV OF DEFENSE TECH
7 Cites, 0 Cited by
AI Technical Summary

Problems solved by technology

In the deduplication fingerprint cache, the index is the fingerprint, and the content is the physical block number. The former takes up more space than the latter, which significantly increases the overhead of maintaining shadow caches and reduces the efficiency of LIRS and ARC caches.
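The space imbalance can be made concrete with a little arithmetic. Shadow (ghost) entries in ARC and LIRS keep the key of an evicted entry but not its value, so when the key is a fingerprint, a shadow entry costs nearly as much as a resident one. The sketch below is illustrative only: the 20-byte fingerprint (a SHA-1 digest) and the 8-byte physical block number are assumptions, not figures from the patent.

```python
# Assumed sizes, for illustration only (not taken from the patent).
FINGERPRINT_BYTES = 20   # assumption: SHA-1 digest used as the fingerprint
BLOCK_NUMBER_BYTES = 8   # assumption: 64-bit physical block number


def entry_sizes(fingerprint_bytes=FINGERPRINT_BYTES,
                block_number_bytes=BLOCK_NUMBER_BYTES):
    """Return (resident_bytes, shadow_bytes, shadow_to_resident_ratio).

    A resident entry stores key and value; a shadow entry stores the key only.
    Per-entry bookkeeping (list pointers and so on) is ignored.
    """
    resident = fingerprint_bytes + block_number_bytes
    shadow = fingerprint_bytes
    return resident, shadow, shadow / resident


resident, shadow, ratio = entry_sizes()
print(f"resident={resident} B, shadow={shadow} B, ratio={ratio:.2f}")
```

Under these assumed sizes a shadow entry takes roughly 70% of the space of a resident entry, whereas in a conventional block cache (small key, large value) the shadow entry would be a tiny fraction of the resident one.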

Examples

Embodiment Construction

[0039] As shown in Figure 3, the differential caching method for online primary storage deduplication in this embodiment includes:

[0040] 1) After a write I/O request is received, the data is divided into different data streams according to its source, the data is split into data blocks, and the fingerprint of each data block is calculated;

[0041] 2) Fingerprint sampling is performed on the different data streams. When the sampling period of a data stream expires, locality prediction is performed on that stream to obtain the number of non-redundant data blocks in it, and the fingerprint cache replacement probability of the stream is adjusted according to that number, as shown in Figure 4; a data block is then selected as the current data block;

[0042] 3) The fingerprint index table is queried with the fingerprint of the current data block to determine whether a matching entry exists; the fingerprint index table records the mappin...
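Steps 1) and 3) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: fixed-size 4 KiB chunking, SHA-1 fingerprints, an in-memory dictionary as the fingerprint index table, and allocating a new physical block number on a miss are all assumptions, and the names `handle_write` and `FingerprintIndex` are hypothetical.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumption: fixed-size 4 KiB blocks


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split a write payload into fixed-size data blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def fingerprint(block: bytes) -> bytes:
    """Fingerprint a block; SHA-1 is an assumed choice of hash."""
    return hashlib.sha1(block).digest()


class FingerprintIndex:
    """Hypothetical index table: fingerprint -> physical block number."""

    def __init__(self):
        self._table = {}
        self._next_pbn = 0

    def lookup(self, fp: bytes):
        return self._table.get(fp)

    def insert(self, fp: bytes) -> int:
        pbn = self._next_pbn
        self._next_pbn += 1
        self._table[fp] = pbn
        return pbn


def handle_write(index, streams, source, data):
    """Process one write request; return a (pbn, is_duplicate) pair per block."""
    results = []
    for block in split_into_blocks(data):
        fp = fingerprint(block)
        streams[source].append(fp)   # step 1: group fingerprints by source stream
        pbn = index.lookup(fp)       # step 3: query the fingerprint index table
        duplicate = pbn is not None
        if not duplicate:
            pbn = index.insert(fp)   # assumption: allocate a new block on a miss
        results.append((pbn, duplicate))
    return results


index = FingerprintIndex()
streams = defaultdict(list)
print(handle_write(index, streams, "vm-a", b"a" * 4096 + b"b" * 4096))
print(handle_write(index, streams, "vm-b", b"a" * 4096))
```

The second write, from a different source, contains a block already seen in the first, so its lookup returns the existing physical block number and the block is flagged as a duplicate.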

Abstract

The invention discloses a differentiated caching method for online primary storage deduplication. After a write I/O request is received, the data is divided into different data streams according to its source, split into data blocks, and the fingerprint of each data block is calculated. Fingerprint sampling is performed on the different data streams; when the sampling period of a data stream expires, locality prediction is performed on that stream to obtain the number of non-redundant data blocks in it, and the fingerprint cache replacement probability of the stream is adjusted according to that number. While each data block is being processed, when the cache is full and an entry must be replaced, the fingerprints of data blocks in data streams with poor redundancy are preferentially replaced, according to the fingerprint cache replacement probabilities of the different data streams.
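The replacement policy described above might look like the following sketch. The source does not specify how locality is predicted or how the probability is derived, so both are assumptions here: the non-redundant count is estimated as the number of distinct fingerprints in the sample, and the eviction weight is that count divided by the sample size. Keeping a separate LRU list per stream is likewise a simplification, and the class name `DifferentiatedFingerprintCache` is hypothetical.

```python
import random
from collections import OrderedDict


class DifferentiatedFingerprintCache:
    """Fingerprint cache that evicts preferentially from low-redundancy streams."""

    def __init__(self, capacity, rng=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.rng = rng or random.Random()
        self.per_stream = {}    # stream id -> OrderedDict(fingerprint -> pbn), LRU order
        self.evict_weight = {}  # stream id -> relative replacement probability
        self.size = 0

    def update_stream(self, stream, sampled_fingerprints):
        """Assumed estimator: distinct sampled fingerprints count as non-redundant."""
        total = len(sampled_fingerprints)
        non_redundant = len(set(sampled_fingerprints))
        # Assumption: weight is the non-redundant share of the sample.
        self.evict_weight[stream] = non_redundant / total if total else 1.0
        return non_redundant

    def get(self, stream, fp):
        entries = self.per_stream.get(stream)
        if entries is None or fp not in entries:
            return None
        entries.move_to_end(fp)  # mark as most recently used
        return entries[fp]

    def put(self, stream, fp, pbn):
        entries = self.per_stream.setdefault(stream, OrderedDict())
        if fp in entries:
            entries.move_to_end(fp)
            entries[fp] = pbn
            return
        if self.size >= self.capacity:
            self._evict_one()
        entries[fp] = pbn
        self.size += 1

    def _evict_one(self):
        # Choose a victim stream at random, weighted by replacement probability,
        # then drop that stream's least recently used fingerprint.
        candidates = [s for s, e in self.per_stream.items() if e]
        weights = [self.evict_weight.get(s, 1.0) for s in candidates]
        if sum(weights) <= 0:
            weights = [1.0] * len(candidates)
        victim = self.rng.choices(candidates, weights=weights, k=1)[0]
        self.per_stream[victim].popitem(last=False)
        self.size -= 1
```

With this design, a stream whose sample is mostly duplicates gets a low weight and its fingerprints tend to stay cached, while a stream of mostly unique blocks gives up cache space first.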

Description

Technical field

[0001] The invention relates to the field of storage systems, in particular to a differentiated caching method for online primary storage deduplication. The differentiated caching technique improves the efficiency of fingerprint indexing in primary storage deduplication, thereby improving the online primary storage deduplication rate, reducing the I/O delay caused by deduplication logic, and prolonging the service life of SSDs.

Background

[0002] Redundant data is widespread in primary storage systems in cloud computing scenarios. On the one hand, this redundant data wastes valuable storage capacity; on the other hand, it causes unnecessary loss of storage performance. Eliminating redundant data in primary storage is therefore important. Data deduplication divides data into multiple small blocks, obtains the fingerprint of each block through a hash algorithm, and uses these fingerprints to identify and reduce redundant d...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F3/06
CPC: G06F3/061, G06F3/0616, G06F3/064, G06F3/0679, G06F3/06
Inventor: 邬会军, 卢凯, 王睿伯, 董勇, 张伟, 周恩强, 迟万庆, 谢旻, 张文喆, 李佳鑫, 吴振伟
Owner: NAT UNIV OF DEFENSE TECH