Two-stage single-instance data de-duplication backup method

A data backup and single-instance storage technology, applied in electric digital data processing, special data processing applications, redundant data error detection in computing, and related fields. It addresses problems such as heavy client workload, wasted time and bandwidth, and reduced query speed.

Status: Inactive. Publication date: 2013-09-25

XI AN JIAOTONG UNIV

Cites: 3. Cited by: 50.

AI Technical Summary

Problems solved by technology

[0003] To deal with this problem, the most common approach is to apply file-level or block-level deduplication on the server side. Both approaches have drawbacks. First, file-level deduplication alone does not deduplicate well, especially for files with similar content and small differences, because it cannot detect duplicate data shared between files.

Second, with block-level deduplication the client must upload a large amount of metadata to the server before the server can detect duplicate data. Both the server and the client have to process this data in real time, which wastes time and bandwidth and places a heavy workload on the client.

Third, file-level duplicate detection queries all file information without first considering the conditions that must hold for two files to be identical. Block-level deduplication divides all files into blocks uniformly and then queries those blocks, which makes the metadata very large and also slows down queries.

Fourth, traditional block-level chunking tends to scatter data blocks that were originally contiguous in the same file, so restoration is very slow.
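The first two drawbacks can be illustrated with a minimal sketch. It is not taken from the patent: the 4096-byte block size, the use of SHA-256, and the function names `file_fingerprint` and `block_fingerprints` are all assumptions chosen for illustration. It shows that a one-byte change defeats a whole-file fingerprint, while fixed-size block fingerprints still match for most of the file at the cost of one metadata entry per block.

```python
import hashlib

# Assumed block size; the source does not state one.
BLOCK_SIZE = 4096


def file_fingerprint(data: bytes) -> str:
    """One fingerprint for the whole file (file-level detection)."""
    return hashlib.sha256(data).hexdigest()


def block_fingerprints(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """One fingerprint per fixed-size block (block-level detection)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


# Four distinct blocks, then a copy that differs only in its last byte.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
edited = original[:-1] + b"\xff"

# File level: the two versions look completely unrelated.
same_file = file_fingerprint(original) == file_fingerprint(edited)

# Block level: only the block containing the edit differs,
# but every block contributes one metadata entry.
orig_blocks = block_fingerprints(original)
edit_blocks = block_fingerprints(edited)
shared = sum(a == b for a, b in zip(orig_blocks, edit_blocks))

print("identical at file level:", same_file)
print("matching blocks:", shared, "of", len(orig_blocks))
print("metadata entries: file-level 1, block-level", len(orig_blocks))
```

The block-level metadata count grows with file size, which is the scaling problem the second and third points describe.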

Embodiment Construction

[0068] Figure 1 shows the deployment environment of this method. The method is deployed in a client/server (C/S) structure. The client keeps a local log that records information about the files the user has backed up and about the backup tasks. The client interacts with the server over the network. The server side includes a backup server and a background processing system. The backup server saves the content of each backup file to a storage medium and saves the file's metadata in a metadata file. The background processing system classifies similar files and deduplicates them when the backup server is lightly loaded or idle, which gives the files a second round of deduplication.
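The second round of deduplication performed by the background processing system can be sketched as storing a similar file as a delta against a base file. This excerpt does not say how differences are computed, so the sketch below swaps in Python's standard `difflib.SequenceMatcher` as a stand-in; `make_delta`, `apply_delta`, and the sample data are hypothetical names and values introduced only for illustration.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes) -> list[tuple]:
    """Describe target as copies from base plus literal new data."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Reuse bytes already stored in the base file.
            ops.append(("copy", i1, i2))
        elif j2 > j1:
            # "replace" or "insert": keep only the new bytes.
            ops.append(("data", target[j1:j2]))
        # "delete" contributes nothing to the target.
    return ops


def apply_delta(base: bytes, ops: list[tuple]) -> bytes:
    """Rebuild the target file from the base file and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


# Two versions of a file with similar content (made-up sample data).
base = b"report version 1: " + b"x" * 200
target = b"report version 2: " + b"x" * 200 + b" appended"

delta = make_delta(base, target)
stored = sum(len(op[1]) for op in delta if op[0] == "data")
restored = apply_delta(base, delta)

print("bytes stored for the new version:", stored, "instead of", len(target))
print("restored correctly:", restored == target)
```

`SequenceMatcher` is slow on large inputs, so a real background job would need a purpose-built binary diff; the sketch only shows the store-differences-only idea.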

[0069] Figure 2 shows the overall architecture of this method, which has three parts: the client, the backup server, and the background processing system. The client processes local files. And...

Abstract

The invention discloses a two-stage single-instance data de-duplication backup method in which data is de-duplicated in two stages during backup. In the first stage, duplicate detection is performed at file granularity. The client queries its local log to determine whether an identical file has already been stored; if so, it notifies the user that the backup operation is complete. If no identical file is recorded locally, the client asks the backup program on the server side to query its database for a file with identical content. If such a file is found, the server only creates a link to it for the client and records the client's reference to the file. If the file is new, it is uploaded and both ends record its information. After files are uploaded, a background program on the server side processes them further: small files are spliced together to avoid wasting space, and large files are stored separately by type. Similar files are compared periodically and, after grouping, second-stage difference de-duplication is performed.
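The first-stage decision sequence in the abstract can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the patented implementation: the `Client` and `BackupServer` classes, the SHA-256 fingerprint, and the returned status strings are hypothetical, and the real system stores its log and metadata on disk and communicates over a network.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(content: bytes) -> str:
    # Assumed content identifier; the abstract does not name a hash.
    return hashlib.sha256(content).hexdigest()


@dataclass
class BackupServer:
    files: dict[str, bytes] = field(default_factory=dict)           # fingerprint -> content
    references: dict[str, set[str]] = field(default_factory=dict)   # fingerprint -> client ids

    def has_file(self, fp: str) -> bool:
        return fp in self.files

    def add_reference(self, fp: str, client_id: str) -> None:
        self.references.setdefault(fp, set()).add(client_id)

    def upload(self, fp: str, content: bytes, client_id: str) -> None:
        self.files[fp] = content
        self.add_reference(fp, client_id)


@dataclass
class Client:
    client_id: str
    local_log: set[str] = field(default_factory=set)

    def backup(self, content: bytes, server: BackupServer) -> str:
        fp = fingerprint(content)
        # Step 1: the local log answers without contacting the server.
        if fp in self.local_log:
            return "already backed up"
        # Step 2: the server checks for identical content from any client.
        if server.has_file(fp):
            server.add_reference(fp, self.client_id)  # link only, no upload
            outcome = "linked"
        else:
            # Step 3: a new file is uploaded and recorded on the server.
            server.upload(fp, content, self.client_id)
            outcome = "uploaded"
        # Assumption: the client logs linked files as well as uploaded ones.
        self.local_log.add(fp)
        return outcome


server = BackupServer()
alice, bob = Client("alice"), Client("bob")
results = [
    alice.backup(b"quarterly report", server),
    alice.backup(b"quarterly report", server),
    bob.backup(b"quarterly report", server),
]
print(results)
print("copies stored on server:", len(server.files))
```

Checking the local log first means a repeated backup by the same user costs no network traffic, and the server lookup means a file shared by several users is transferred only once.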

Description

Technical field

[0001] The invention relates to the technical field of computer storage. In particular, it provides a method for eliminating redundant data and saving network bandwidth when a client backs up its own files to a server, so as to improve the availability of storage devices.

Background

[0002] In a typical environment where a client saves its own files to a server, the server simply accepts the uploaded files without checking them in any detail, and the client keeps no identification of the files it has uploaded. When multiple clients upload files to the server, it often happens that several users back up the same file, or that a single user backs up several consecutive versions of a file with similar content. This generates a large amount of redundant data.

[0003] In order to deal with this problem, the most commonly used method is to implement file-l...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F11/14, G06F17/30

Inventors: 张兴军, 朱跃光, 董小社, 朱国峰, 王龙翔, 姜晓夏

Owner: XI AN JIAOTONG UNIV
