Data deduplication method and device

A data deduplication technology based on fingerprint data, applied in fields such as special data processing applications and memory addressing/allocation/relocation. It addresses the heavy computation, high resource consumption, and low performance of deduplication, with the effect of reducing the amount of computation, including the query computation.

Active Publication Date: 2016-06-08
HUAWEI TECH CO LTD
Cites: 4; Cited by: 0

AI Technical Summary

Problems solved by technology

[0006] The present invention provides a method and device for deduplication, which solves the problem of low deduplication performance in the prior art due to the huge amount of calculation and resource consumption required for deduplication.

Detailed Description of the Embodiments

[0028] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0029] Figure 3 is a flow chart of Embodiment 1 of the deduplication method of the present invention. As shown in Figure 3, this embodiment provides a deduplication method, which may specifically include the following steps:

[0030] Step 301, perform block processing on the file to be stored, and calculate the fingerprint of each bl...
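
Step 301 can be illustrated with a minimal sketch. The excerpt does not say how blocks are cut or which hash produces the fingerprint, so fixed-size blocks and SHA-1 are assumptions here, and `chunk_and_fingerprint` and `block_size` are hypothetical names.

```python
import hashlib


def chunk_and_fingerprint(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and fingerprint each block.

    Assumptions (not from the source): fixed-size chunking and SHA-1.
    Returns a list of (fingerprint_hex, block_bytes) pairs in file order.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        blocks.append((hashlib.sha1(block).hexdigest(), block))
    return blocks
```

Identical blocks always yield identical fingerprints, which is what allows later steps to compare fingerprints instead of block contents.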

Abstract

Provided is a duplicate data deletion method and device. The method includes: partitioning a file to be stored and calculating a fingerprint of each partition in the partitioning result; sampling the fingerprints of the partitions and generating a fingerprint sampling table for the file to be stored from the sampled fingerprints; determining, according to the fingerprint sampling table and a grouping sampling library, a similar grouping of the file to be stored in the grouping sampling library; and performing duplicate data deletion on the file to be stored according to the fingerprint data in the fingerprint grouping that corresponds to the similar grouping in a fingerprint library. The device includes a partitioning module, a sampling module, a grouping module, and a duplicate data deletion module. The present invention addresses the problem in the prior art that massive amounts of partitioned data make duplicate deletion very expensive in computation and resources, and it reduces the amount of computation needed for deduplication.
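
The four steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: fixed-size blocks, SHA-1 fingerprints, modulo-based sampling, and "largest sample overlap" as the similarity rule are all choices made for the example, and `DedupStore`, `sample_mod`, `group_samples`, and `group_fps` are hypothetical names.

```python
import hashlib


class DedupStore:
    """Toy store with a grouping sampling library and a fingerprint library."""

    def __init__(self, block_size: int = 4096, sample_mod: int = 2):
        self.block_size = block_size
        self.sample_mod = sample_mod
        self.group_samples = {}  # group id -> sampled fingerprints
        self.group_fps = {}      # group id -> all fingerprints of the group

    def _fingerprints(self, data: bytes):
        # Step 1: partition the file and fingerprint each block (assumed SHA-1).
        size = self.block_size
        return [hashlib.sha1(data[i:i + size]).hexdigest()
                for i in range(0, len(data), size)]

    def _sample(self, fps):
        # Step 2: build the fingerprint sampling table.
        # Assumed rule: keep fingerprints whose value is 0 modulo sample_mod.
        return {fp for fp in fps if int(fp, 16) % self.sample_mod == 0}

    def _find_similar_group(self, sampled):
        # Step 3: pick the group whose samples overlap most with this file.
        best, best_overlap = None, 0
        for gid, samples in self.group_samples.items():
            overlap = len(sampled & samples)
            if overlap > best_overlap:
                best, best_overlap = gid, overlap
        return best

    def store(self, data: bytes):
        """Return (group id, number of blocks that were actually new)."""
        fps = self._fingerprints(data)
        sampled = self._sample(fps)
        gid = self._find_similar_group(sampled)
        if gid is None:  # no similar group: open a new one
            gid = len(self.group_samples)
            self.group_samples[gid] = set()
            self.group_fps[gid] = set()
        # Step 4: deduplicate only against the similar group's fingerprints.
        known = self.group_fps[gid]
        new_blocks = 0
        for fp in fps:
            if fp not in known:
                known.add(fp)
                new_blocks += 1
        self.group_samples[gid] |= sampled
        return gid, new_blocks
```

The point of the design is that each lookup touches only the fingerprints of one group rather than a global index. The trade-off in this sketch is that a duplicate block living in a different group goes undetected, and a file whose fingerprints all miss the sampling rule starts a new group.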

Description

Technical field

[0001] The present invention relates to the technical field of data storage, and in particular to a method and device for data deduplication.

Background

[0002] Data deduplication (deduplication for short) is a data reduction technology that is usually used in disk-based backup systems and aims to reduce the storage capacity used in a storage system. It is generally applied in scenarios with large amounts of data. Deduplication technologies in the industry mainly include block detection, similarity detection, and delta coding. [...] is a data compression method, but it detects only similar files, and its deduplication rate is low. The deduplication rate is an important indicator of the effectiveness of deduplication and indicates the proportion of data that is deduplicated.

[0003] Figure 1 is a schematic diagram of the data deduplication process in the prior art. As shown in Figure 1...
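
For orientation, block detection against a single global fingerprint index can be sketched as below. The excerpt does not define the deduplication rate precisely, so the formula used here (duplicate bytes divided by total bytes) is an assumption, as are the fixed-size blocks, the SHA-1 hash, and the hypothetical name `block_dedup_rate`.

```python
import hashlib


def block_dedup_rate(files, block_size: int = 4096) -> float:
    """Baseline block detection against one global fingerprint index.

    Returns duplicate_bytes / total_bytes, an assumed definition of the
    deduplication rate; returns 0.0 when there is no input data.
    """
    seen = set()  # global index: every fingerprint stored so far
    total = duplicate = 0
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            fp = hashlib.sha1(block).digest()
            total += len(block)
            if fp in seen:
                duplicate += len(block)  # block already stored, skip it
            else:
                seen.add(fp)
    return duplicate / total if total else 0.0
```

Every block is checked against the full index, so the index and the lookup cost grow with the total number of stored blocks.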

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F12/06
CPC: G06F16/1748
Inventors: 付旭东, 徐君
Owner: HUAWEI TECH CO LTD