A Parallel Data Difference Method

A parallel data difference method, applied in the field of computer information technology, that addresses problems such as memory consumption and achieves faster processing, shorter execution time, and faster differencing.

Active Publication Date: 2020-05-22
INST OF INFORMATION ENG CHINESE ACAD OF SCI
2 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The time complexity of the Bsdiff algorithm for generating a patch is O((n+m) log n), where n is the size of the source file and m is the size of the target file; the time complexity of restoring the target file is O(n+m). Although processing is very fast, the Bsdiff algorithm is very memory-intensive: it requires at most max(17*n, 9*n+m)+O(1) bytes of memory.
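To make the memory bound concrete, the short sketch below evaluates max(17*n, 9*n+m) for a pair of file sizes. The function name and the example sizes are assumptions chosen for illustration, and the O(1) term is ignored.

```python
def bsdiff_peak_memory_bytes(n: int, m: int) -> int:
    """Upper bound max(17*n, 9*n + m) quoted above, ignoring the O(1) term.

    n is the source file size and m is the target file size, in bytes.
    """
    return max(17 * n, 9 * n + m)


MIB = 1024 * 1024
source_size = 100 * MIB  # hypothetical source file size n
target_size = 120 * MIB  # hypothetical target file size m

bound = bsdiff_peak_memory_bytes(source_size, target_size)
print(bound // MIB, "MiB")  # prints "1700 MiB"
```

For files of similar size the 17*n term dominates, so the bound is roughly seventeen times the source size; the 9*n+m term only wins when the target is more than eight times larger than the source.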


Examples

Detailed Description of Embodiments

[0039] In order to make the above-mentioned features and advantages of the present invention more comprehensible, the following specific embodiments are described in detail in conjunction with the accompanying drawings.

[0040] The invention applies the idea of parallelism to improve the Bsdiff algorithm and realizes a parallel data difference method that performs well in time and consumes no additional space.

[0041] The idea of parallelism is to use many-core technology to split a program that originally ran in a single thread into multiple threads that execute simultaneously, thereby speeding it up. As shown in Figure 4, the target file is divided into n parts, each part is processed by its own thread to generate its own patch file, and the patch files are then merged to form the difference file (patch file). The present invention uses this approach to reduce the time needed to generate differential data.
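The split, process, and merge structure described in [0041] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `split_ranges`, `diff_segment`, and `parallel_diff` are hypothetical names, the patch is an in-memory list of operations rather than a file, and the per-thread worker is a deliberately trivial stand-in for the suffix-array comparison. In CPython, threads show the structure but will not necessarily speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total: int, parts: int) -> list[tuple[int, int]]:
    """Split [0, total) into `parts` contiguous, near-equal ranges."""
    base, extra = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def diff_segment(source: bytes, segment: bytes) -> list[tuple]:
    """Stand-in worker (assumption): emit a single operation per segment.

    If the whole segment occurs verbatim in the source, reference it;
    otherwise store the segment's bytes literally.
    """
    pos = source.find(segment)
    if segment and pos >= 0:
        return [("copy", pos, len(segment))]
    return [("add", segment)]


def parallel_diff(source: bytes, target: bytes, threads: int) -> list[tuple]:
    """Segment the target by thread count, diff each part, merge in order."""
    segments = [target[a:b] for a, b in split_ranges(len(target), threads)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in submission order, so concatenating
        # them reproduces the segment order of the target.
        partial = list(pool.map(lambda seg: diff_segment(source, seg), segments))
    return [op for ops in partial for op in ops]


patch = parallel_diff(b"hello world, hello diff", b"hello diff, new world!", 4)
print(patch)
```

Because each segment's operations depend only on the source and that segment, the threads need no coordination until the final ordered merge.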

[0042] This m...

Abstract

The invention provides a parallel data difference method comprising the following steps: (1) file preprocessing, in which the source file and the target file are initialized, a suffix array of the source file is generated, and a patch file is created and initialized; (2) target file segmentation, in which the target file is segmented according to the number of threads and one thread is assigned to each segment for independent processing; (3) thread processing, in which each thread initializes its segment of the target file, creates a patch file, compares the source file with its segment through the suffix array to generate differential data, and writes the differential data into its patch file; (4) merging in the main process, in which the per-thread patch files containing the differential data are written together into a single patch file. Adopting multi-threaded parallelism improves the speed of patch generation.
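The abstract relies on a suffix array of the source file to compare source and target. The sketch below shows one simple way such a lookup can work: a naive suffix array built by sorting, followed by a binary search for the longest prefix of a query that occurs in the source. The function names are assumptions, and the naive construction is chosen for brevity rather than speed; it is not claimed to be the construction used by Bsdiff or by the invention.

```python
def build_suffix_array(data: bytes) -> list[int]:
    """Naive suffix array: start offsets sorted by the suffix they begin."""
    return sorted(range(len(data)), key=lambda i: data[i:])


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Length of the longest common prefix of a and b."""
    limit = min(len(a), len(b))
    i = 0
    while i < limit and a[i] == b[i]:
        i += 1
    return i


def longest_match(source: bytes, sa: list[int], query: bytes) -> tuple[int, int]:
    """Return (offset in source, length) of the longest prefix of `query`
    that occurs somewhere in `source`."""
    lo, hi = 0, len(sa)
    while lo < hi:  # binary search for the insertion point of `query`
        mid = (lo + hi) // 2
        if source[sa[mid]:] < query:
            lo = mid + 1
        else:
            hi = mid
    best_pos, best_len = 0, 0
    # The suffix sharing the longest prefix with `query` is adjacent to
    # the insertion point in sorted order, so only two candidates matter.
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            length = common_prefix_len(source[sa[idx]:], query)
            if length > best_len:
                best_pos, best_len = sa[idx], length
    return best_pos, best_len


source = b"banana"
sa = build_suffix_array(source)
print(sa)                                  # prints [5, 3, 1, 0, 4, 2]
print(longest_match(source, sa, b"nanx"))  # prints (2, 3)
```

Since the suffix array is built once from the source and only read afterwards, every thread can search the same array for its own segment of the target.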

Description

Technical field

[0001] The invention relates to the field of computer information technology, in particular to a parallel data difference method.

Background

[0002] With the advent of the Internet era, the total amount of data is increasing rapidly, and data compression plays an important role in data transmission and storage. Data difference is also a compression technology: it uses the difference between a source file and a target file to compress and decompress the target file. Data differencing compares the source data with the target data and generates a differential data patch. The target data can then be restored from the differential data patch and the source data, which helps reduce the consumption of resources such as disk space or bandwidth. Data difference technology is mainly used where related versions of data are compared, such as software update, data transmission and data backup...
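The round trip described in the background, generating a patch from source and target and then restoring the target from source plus patch, can be illustrated with a short sketch. It uses Python's standard `difflib.SequenceMatcher` as a stand-in matcher, which is an assumption made for brevity and is not the Bsdiff algorithm; `make_patch`, `restore`, and the sample data are hypothetical.

```python
from difflib import SequenceMatcher


def make_patch(source: bytes, target: bytes) -> list[tuple]:
    """Describe `target` as copies from `source` plus literal additions."""
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    patch = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            patch.append(("copy", i1, i2 - i1))   # bytes already in source
        elif j2 > j1:
            patch.append(("add", target[j1:j2]))  # bytes new in target
        # a pure "delete" contributes nothing to the target
    return patch


def restore(source: bytes, patch: list[tuple]) -> bytes:
    """Rebuild the target from the source and the patch."""
    out = bytearray()
    for op in patch:
        if op[0] == "copy":
            _, pos, length = op
            out += source[pos:pos + length]
        else:
            out += op[1]
    return bytes(out)


source = b"version 1.0: the quick brown fox jumps over the lazy dog"
target = b"version 1.1: the quick brown fox leaps over the lazy dog"
patch = make_patch(source, target)
assert restore(source, patch) == target

literal = sum(len(op[1]) for op in patch if op[0] == "add")
print(f"{literal} literal bytes stored for a {len(target)}-byte target")
```

Only the literal bytes and a few copy references need to be stored or transmitted, which is where the disk and bandwidth savings come from when the two versions are similar.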

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/174, G06F9/38
CPC: G06F9/3867, G06F16/1744
Inventor: 刘燕兵, 卢毓海, 王歧, 张春燕, 谭建龙, 郭莉
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI