Efficient data deletion method and system based on source-side deduplication

An efficient data deletion method based on source-side deduplication, applicable to redundancy in operation for data error detection, electrical digital data processing, special data processing applications, etc. It addresses the complex logic, low efficiency, and inability to release space quickly and efficiently found in existing deletion approaches, and achieves simpler statistical logic.

Pending Publication Date: 2020-05-12
NANJING UNARY INFORMATION TECH

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the present invention is to overcome a defect of existing source-side deduplication technology: because deduplicated data is unique, the logic of the deletion operation is relatively complicated, its efficiency is relatively low, and space cannot be released quickly and efficiently. To this end, the invention provides an efficient data deletion method and system based on source-side deduplication.


Examples

Detailed description of embodiments

[0037] The present invention will be further described below in conjunction with the accompanying drawings. The following examples are only used to illustrate the technical solution of the present invention more clearly, but not to limit the protection scope of the present invention.

[0038] The deletion logic of the present invention uses a per-container mark to distinguish containers whose data blocks and fingerprints can be cleaned up from containers that are still in use, so the key is how to quickly mark the containers in the deduplication library. Deleting fingerprints is simple: read the container that needs to be deleted, parse out the fingerprints stored in it, and then delete the corresponding records in the fingerprint database. If marking were performed at every cleanup, it would take a long time to search the fingerprints in the index files of all objects in the object library to determine which containers are still in us...
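The fingerprint deletion step described in [0038] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the in-memory dictionaries, the `mark` and `fingerprints` keys, and the function name `clean_marked_containers` are all hypothetical, and the sketch assumes the container marks have already been set by some earlier step that is not modeled here.

```python
def clean_marked_containers(containers, fingerprint_db):
    """Remove every container marked 0 together with its fingerprint records.

    containers     : dict mapping container id -> {"mark": 0 or 1,
                     "fingerprints": list of fingerprints stored in it}
                     (hypothetical layout, for illustration only)
    fingerprint_db : dict mapping fingerprint -> container id
    Returns the list of container ids that were cleaned.
    """
    # Collect ids first so the dict is not mutated while iterating over it.
    to_clean = [cid for cid, info in containers.items() if info["mark"] == 0]
    for cid in to_clean:
        # "Read the container and parse out the stored fingerprints".
        for fp in containers[cid]["fingerprints"]:
            # Delete the corresponding record in the fingerprint database.
            fingerprint_db.pop(fp, None)
        # Drop the container itself to release its space.
        del containers[cid]
    return to_clean


containers = {
    1: {"mark": 1, "fingerprints": ["fp-a", "fp-b"]},  # still in use
    2: {"mark": 0, "fingerprints": ["fp-c"]},          # may be cleaned
}
fingerprint_db = {"fp-a": 1, "fp-b": 1, "fp-c": 2}
removed = clean_marked_containers(containers, fingerprint_db)
```

In this sketch the cost of cleaning depends only on the containers marked 0, not on the total number of containers or fingerprints, which is the property the marking approach is meant to provide.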

Abstract

The invention discloses an efficient data deletion method based on source-side deduplication. During backup, the data stream at the source is segmented into data blocks, a fingerprint is calculated for each block, and the fingerprints are compared against existing ones. If a fingerprint does not exist, the block is new: the data block is transmitted to a container on the server for storage and that container is marked as 1. Once the container is full, it is written into a data file and a new container is created. Expired backup sets are cleaned up automatically, together with their GUID object records. During idle time outside the normal service window, a preset cyclic deletion logic cleans the data blocks and fingerprints of containers marked as 0, where a mark of 0 indicates that the data blocks and fingerprints in the container are not referenced and can be cleaned. The advantages of the method are that it uses marking, so the statistical logic is simpler, the cleaning logic is not affected by the size of the deduplication library, and the method is more efficient.
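The backup path in the abstract can be illustrated with a small sketch. Several details below are assumptions that the abstract does not specify: fixed-size segmentation, SHA-256 as the fingerprint function, the tiny `BLOCK_SIZE` and `CONTAINER_CAPACITY` values, and the `DedupStore` and `Container` names. A dictionary stands in for the data file that full containers are written to.

```python
import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 4          # hypothetical block size, kept tiny for the demo
CONTAINER_CAPACITY = 2  # hypothetical number of blocks per container


@dataclass
class Container:
    cid: int
    mark: int = 1                               # 1 = in use, 0 = may be cleaned
    blocks: dict = field(default_factory=dict)  # fingerprint -> block bytes


class DedupStore:
    def __init__(self):
        self.fingerprints = {}           # fingerprint -> container id
        self.sealed = {}                 # stands in for the data file
        self.current = Container(cid=0)  # container currently being filled

    def backup(self, stream: bytes):
        """Segment the stream, store only new blocks, return the fingerprint list."""
        recipe = []
        for i in range(0, len(stream), BLOCK_SIZE):
            block = stream[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.fingerprints:
                # New block: store it in the current container and mark it 1.
                self.current.blocks[fp] = block
                self.current.mark = 1
                self.fingerprints[fp] = self.current.cid
                if len(self.current.blocks) >= CONTAINER_CAPACITY:
                    # Container is full: write it out and create a new one.
                    self.sealed[self.current.cid] = self.current
                    self.current = Container(cid=self.current.cid + 1)
            recipe.append(fp)
        return recipe


store = DedupStore()
recipe = store.backup(b"aaaabbbbaaaacccc")
```

Here the third block repeats the first, so only three blocks are stored even though the returned fingerprint list has four entries.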

Description

Technical field

[0001] The invention relates to an efficient data deletion method and system based on source-side deduplication, and belongs to the technical field of data protection.

Background

[0002] Source-side deduplication is widely used in data protection products because it reduces transmission bandwidth and storage space. For convenience of explanation, it is agreed here that the data remaining after deduplication at the source is stored in the deduplication library. The deduplication library consists of the deduplication fingerprint library and the deduplication database: the index information of the data blocks is stored in the deduplication fingerprint library, and the data blocks themselves are stored in the deduplication database. Data after source-side deduplication has the following characteristics: the data blocks stored in the deduplication database are unique in the whole database, and most of the data blocks in the deduplication database will be sha...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/215, G06F11/14
CPC: G06F16/215, G06F11/1453
Inventors: 周建华, 张有成, 姚崎, 丁红, 李海鹏, 许萍萍
Owner: NANJING UNARY INFORMATION TECH