Disk cache deduplication method based on mixed page

A disk cache technique based on mixed (hybrid) pages, applied to disk cache deduplication. It addresses the inability of conventional caches to identify duplicate data effectively, and aims to increase the effective cache capacity and maximize the hit rate.

Active Publication Date: 2019-10-11
JINAN UNIVERSITY

AI Technical Summary

Problems solved by technology

[0004] In the course of realizing the present invention, the inventors found at least the following technical problem in the prior art: a traditional disk cache system based on the LRU replacement algorithm cannot effectively identify duplicate data in the cache, so a large amount of redundant data accumulates in the cache.



Examples


Embodiment

[0035] As shown in Figure 1 and Figure 2, this embodiment discloses a disk cache deduplication method based on hybrid pages. Enlarging the cache page to raise the hit rate normally lowers the deduplication rate. To avoid this, the method keeps base pages and huge pages in the cache at the same time and monitors how hot or cold they are in real time. Cold huge pages with a high repetition rate are split into base pages to increase the deduplication rate, and previously split base pages that have become hot are reconstructed into huge pages, which enlarges the average page size of the cache and increases the hit rate.
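The split and reconstruct rule in this paragraph can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `decide`, `HOT_THRESHOLD`, and `DUP_THRESHOLD`, the use of an access count as the hotness measure, and both threshold values are hypothetical and do not come from the patent text.

```python
# Assumed thresholds; the patent text shown here does not give concrete values.
HOT_THRESHOLD = 4     # accesses per monitoring window that make a page "hot"
DUP_THRESHOLD = 0.5   # fraction of duplicate blocks that counts as a "high" repetition rate


def decide(kind: str, accesses: int, dup_ratio: float = 0.0, was_split: bool = False) -> str:
    """Return 'split', 'rebuild', or 'keep' for one cached page.

    kind      -- 'huge' or 'base'
    accesses  -- access count in the current monitoring window (assumed hotness measure)
    dup_ratio -- share of the page's blocks that duplicate other cached blocks
    was_split -- True if this base page came from a previously split huge page
    """
    hot = accesses >= HOT_THRESHOLD
    # Cold huge page with many duplicates: split so base-page dedup can reclaim space.
    if kind == "huge" and not hot and dup_ratio >= DUP_THRESHOLD:
        return "split"
    # Split base page that became hot: rebuild the huge page to raise the hit rate.
    if kind == "base" and hot and was_split:
        return "rebuild"
    return "keep"
```

For example, `decide("huge", 1, 0.75)` selects a split, while the same huge page with ten accesses is kept as it is.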

[0036] The present invention introduces a hybrid page mechanism into the traditional disk cache and combines it with data deduplication technology. The method is mainly divided into four steps:

[0037] 1) Huge page generation. The huge page generator merges consecutive base pages according to the initial page address of the application to generate the corresp...
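The idea of merging consecutive base pages can be illustrated with a minimal sketch. The source text is truncated at this step, so the details here are assumptions rather than the patented procedure: the function name `merge_into_huge_pages`, the fixed `pages_per_huge` of 4, and the requirement that a huge page start on an aligned address are all hypothetical.

```python
def merge_into_huge_pages(base_addrs, pages_per_huge=4):
    """Group base-page addresses into aligned runs of consecutive pages.

    Returns (huge_starts, leftovers): the start address of every complete run,
    and the base pages that could not be merged because their run is incomplete.
    """
    present = set(base_addrs)
    huge_starts = []
    leftovers = []
    for addr in sorted(present):
        # Assumed alignment rule: a huge page covers [start, start + pages_per_huge).
        start = addr - addr % pages_per_huge
        complete = all(a in present for a in range(start, start + pages_per_huge))
        if complete:
            if addr == start:          # record each complete run once
                huge_starts.append(start)
        else:
            leftovers.append(addr)
    return huge_starts, leftovers
```

With base pages 0 to 3, 5, 6, and 8 to 11 cached, the sketch forms huge pages starting at 0 and 8 and leaves pages 5 and 6 as base pages.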


Abstract

The invention discloses a disk cache deduplication method based on mixed pages. A traditional LRU-based disk cache cannot identify data blocks with identical content, so redundant data remains in the cache. In addition, a traditional disk cache uses a fixed page size, although page size is an important factor in the cache hit rate and an optimal page size maximizes it. The disclosed method introduces a mixed page mechanism into the disk cache: huge pages are added while base pages are retained, and the huge page size is adjusted adaptively to maximize the hit rate. The method also monitors how hot or cold the base pages and huge pages are. Cold huge pages with a high repetition rate are split into base pages, and split base pages that have become hot are reconstructed into huge pages, so base pages and huge pages are converted into each other dynamically. Data deduplication is applied to both base pages and huge pages, so the deduplication rate is maintained while the hit rate is maximized.

Description

Technical field

[0001] The invention relates to the technical field of disk caching under a hybrid page mechanism, and in particular to a method for deduplication of a disk cache based on hybrid pages.

Background technique

[0002] Data deduplication is the main technical means of eliminating redundant cached data. It scans the cache space for duplicates, filters out identical data blocks, deletes them, and keeps only a single copy in the cache, thereby eliminating redundant data and saving cache space. By granularity, deduplication can be divided into byte-level, block-level, and file-level deduplication. Byte-level deduplication mainly uses delta encoding to identify duplicate data, while block-level and file-level deduplication mainly use hash algorithms (such as MD5 and SHA-1) to identify the corresponding duplicate data blocks. The deduplication process mainly...
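The hash-based, block-level identification described in the background can be sketched as follows. This is a generic illustration of fingerprint deduplication, not the method claimed in the patent: the class `DedupStore` and its two dictionaries are hypothetical, SHA-1 is chosen only because the text names it, and eviction and reference counting are deliberately left out.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps a single copy of each distinct block."""

    def __init__(self):
        self.blocks = {}     # fingerprint -> block content (one copy only)
        self.addr_map = {}   # logical address -> fingerprint

    def write(self, addr: int, data: bytes) -> bool:
        """Store data at addr; return True if an identical block was already cached."""
        fingerprint = hashlib.sha1(data).hexdigest()
        duplicate = fingerprint in self.blocks
        if not duplicate:
            self.blocks[fingerprint] = data
        # Duplicate or not, the address now points at the shared copy.
        self.addr_map[addr] = fingerprint
        return duplicate

    def read(self, addr: int) -> bytes:
        """Return the block content mapped to addr."""
        return self.blocks[self.addr_map[addr]]
```

Writing the same content to two addresses stores one physical copy, and both addresses read back that copy. A real cache would also need reference counts so that a shared block is freed only when no address maps to it.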


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06, G06F12/123
CPC: G06F3/0641, G06F12/123, G06F12/0871, Y02D10/00
Inventors: 邓玉辉, 斯雷
Owner: JINAN UNIVERSITY