Method and device for duplicate removal of mirror image file

A technique for deduplicating image files, applicable to data processing and storage input/output. It addresses redundant data in storage systems, the resulting loss of storage efficiency, and the unsuitability of existing deduplication schemes for virtual machine image files, with the aim of improving the deduplication effect.

Active Publication Date: 2021-03-26
BEIJING TOPSEC NETWORK SECURITY TECH +2

AI Technical Summary

Problems solved by technology

Most image files contain largely the same data, yet each image file stores its own copy. This leaves too much redundant data in the storage system and seriously reduces storage efficiency.
Current deduplication techniques mostly operate at the data block level. File-level deduplication mostly targets small files and requires the files to be exactly identical. The probability that two large files such as virtual machine image files are exactly identical is very low, so the deduplication schemes in the prior art are not suitable for them. A method is therefore urgently needed that can deduplicate image files that are not completely identical, improving the deduplication effect.
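The limitation described above can be illustrated with a minimal sketch. The 4096-byte block size, the helper names, and the synthetic image contents are assumptions made for illustration; none of them come from the patent text. Two images that differ in a single byte fail an exact whole-file comparison even though almost all of their blocks match.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for this illustration


def file_digest(data: bytes) -> str:
    """Whole-file digest, as used by exact-match file-level deduplication."""
    return hashlib.sha256(data).hexdigest()


def shared_block_fraction(a: bytes, b: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Fraction of block positions at which both images hold identical data."""
    blocks_a = [a[i:i + block_size] for i in range(0, len(a), block_size)]
    blocks_b = [b[i:i + block_size] for i in range(0, len(b), block_size)]
    same = sum(1 for x, y in zip(blocks_a, blocks_b) if x == y)
    return same / max(len(blocks_a), len(blocks_b), 1)


# Synthetic "images": 100 blocks of 4096 bytes, differing only in the last byte.
base = bytes(range(256)) * 16 * 100
image_a = base
image_b = base[:-1] + b"\x00"

# Exact file-level deduplication sees two unrelated files.
whole_file_match = file_digest(image_a) == file_digest(image_b)
# Block-level comparison shows that nearly all data is shared.
shared = shared_block_fraction(image_a, image_b)
print(whole_file_match, shared)
```

Here 99 of the 100 block positions are identical, yet the whole-file digests differ, so a scheme that only merges exactly identical files would store both images in full.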



Examples


Embodiment Construction

[0051] Various aspects and features of the present application are described herein with reference to the accompanying drawings.

[0052] It should be understood that various modifications may be made to the embodiments applied for herein. Accordingly, the above description should not be viewed as limiting, but only as exemplifications of embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the application.

[0053] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and, together with the general description of the application given above and the detailed description of the embodiments given below, serve to explain the principles of the application.

[0054] These and other characteristics of the present application will become apparent from the following description of preferred forms of embodiment given as non-limiting examp...


Abstract

The invention discloses a method and a device for deduplication of image files. The method comprises the following steps: receiving a trigger event for deduplication of the image files; determining a feature array of a target image file among the image files; calculating the similarity between the target image file and each of the other image files according to the feature array of the target image file; determining the other image files whose similarity with the target image file is greater than a preset value; and merging the data shared by the target image file and those other image files to form a parent image file. With this scheme, data shared between image files can be merged whenever their similarity is greater than the preset value, without requiring the image files to be completely identical. Image files that are not completely identical can therefore be deduplicated, which improves the deduplication effect.
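The steps in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the abstract does not say how the feature array or the similarity is computed, so the code assumes the feature array is the list of per-block SHA-256 digests and the similarity is the fraction of block positions with equal digests. The function names, the tiny block size, and the sample images are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # tiny assumed block size so the demo data stays readable


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def feature_array(data: bytes) -> List[str]:
    """Assumed feature array: one SHA-256 digest per fixed-size block."""
    return [hashlib.sha256(block).hexdigest() for block in split_blocks(data)]


def similarity(fa: List[str], fb: List[str]) -> float:
    """Assumed similarity: share of block positions with equal digests."""
    if not fa and not fb:
        return 1.0
    same = sum(1 for x, y in zip(fa, fb) if x == y)
    return same / max(len(fa), len(fb))


def deduplicate(
    images: Dict[str, bytes], target: str, preset_value: float
) -> Tuple[Dict[int, bytes], Dict[str, Dict[int, bytes]], List[str]]:
    """Merge blocks shared by the target and its similar images into a parent."""
    features = {name: feature_array(data) for name, data in images.items()}
    target_features = features[target]

    # Other images whose similarity with the target exceeds the preset value.
    similar = [
        name for name in images
        if name != target
        and similarity(target_features, features[name]) > preset_value
    ]

    # Parent image: blocks that are identical at the same position in every
    # member of the group (target plus similar images).
    target_blocks = split_blocks(images[target])
    parent = {
        idx: target_blocks[idx]
        for idx, digest in enumerate(target_features)
        if all(idx < len(features[n]) and features[n][idx] == digest
               for n in similar)
    }

    # Each child keeps only the blocks the parent does not already hold.
    children = {
        name: {i: b for i, b in enumerate(split_blocks(images[name]))
               if i not in parent}
        for name in [target] + similar
    }
    return parent, children, similar


def restore(parent: Dict[int, bytes], child: Dict[int, bytes]) -> bytes:
    """Rebuild a full image from the parent blocks plus the child's own blocks."""
    merged = {**parent, **child}
    return b"".join(merged[i] for i in sorted(merged))


# Hypothetical images: vm-a and vm-b share three of four blocks, vm-c shares none.
images = {
    "vm-a": b"AAAABBBBCCCCDDDD",
    "vm-b": b"AAAABBBBCCCCEEEE",
    "vm-c": b"WWWWXXXXYYYYZZZZ",
}
parent, children, similar = deduplicate(images, target="vm-a", preset_value=0.5)
print(similar, sorted(parent), {name: sorted(c) for name, c in children.items()})
```

In this sketch the parent plays a role comparable to a shared backing image: the three common blocks are stored once, and each of `vm-a` and `vm-b` keeps only its one differing block. `vm-c` falls below the preset value and is left untouched.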

Description

Technical field

[0001] The present application relates to the field of distributed storage, and in particular to a method and device for deduplication of image files.

Background art

[0002] With the advent of the big data era, deduplication technology is becoming increasingly important. Deduplication refers to the deletion of duplicate data. It reduces the amount of storage media required and therefore reduces cost. It also allows hard disk-based storage systems to cost less than tape libraries while providing better performance. If data is deduplicated as it is written, some of the data never has to be written to disk, which improves write performance. If deduplication is performed on the client, only newly added data is transmitted to the storage system, which reduces the amount of data sent over the network and saves network bandwidth.

[0003] In the existing hyper-converged sy...


Application Information

IPC(8): H04L29/08, G06F3/06, G06K9/62, H04L29/06
CPC: H04L67/1095, G06F3/0608, G06F3/064, H04L67/133, G06F18/22
Inventor 陈仲涛
Owner BEIJING TOPSEC NETWORK SECURITY TECH