Data deduplication method based on internet small computer system interface (iSCSI)

A data deduplication technology, applied in electrical components, transmission systems, etc., that addresses the high computing and storage overhead and the high false positive rate of existing similarity detection methods, and that reduces bandwidth usage and transmission delay.

Inactive Publication Date: 2011-09-14
BEIJING UNIV OF POSTS & TELECOMM
Cites: 3; Cited by: 22

AI Technical Summary

Problems solved by technology

The shingle algorithm for similar block detection must first extract the feature set of each file and then calculate the similarity between the two files, so its computation and storage costs are relatively large. The Bloom filter algorithm represents the file features as a set, and its computation and storage costs are much smaller than those of shingle, but the objects being compared must use filters of the same length. For groups of files whose sizes vary widely, it is difficult to choose a suitable filter length: if the filter is too short, the false positive rate is high, and if it is too long, the overhead is high.
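The length trade-off can be made concrete with the standard Bloom filter approximation for the false positive rate, p ≈ (1 − e^(−kn/m))^k, where m is the filter length in bits, n the number of inserted features, and k the number of hash functions. The sketch below is only an illustration: the function name and the sample values of m, n, and k are assumptions, not figures from the patent.

```python
import math


def bloom_false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Standard approximation: p = (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


# Same number of features, three assumed filter lengths.
# A short filter saturates and misjudges often; a long one costs more space.
for m_bits in (1_024, 8_192, 65_536):
    p = bloom_false_positive_rate(m_bits, n_items=1_000, k_hashes=4)
    print(f"m = {m_bits:>6} bits -> estimated false positive rate {p:.6f}")
```

For a fixed feature count, the estimated rate falls as the filter grows, which is why a single fixed length fits poorly when object sizes differ a lot.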



Examples


Embodiment Construction

[0055] The implementation of the iSCSI-based deduplication method of the present invention in an IP network remote mirroring system is described below with reference to the accompanying drawings.

[0056] The original IP network remote mirroring system consists of a front-end client, a local mirror, and a remote mirror in the disaster recovery center. The data of the local mirror and the remote mirror are updated synchronously. The two are connected through an IP network, and the transport protocol is iSCSI. To implement iSCSI-based deduplication in this system, one device is added at the local end and one at the remote end. The structure of the whole system is shown in Figure 1. The local device intercepts the iSCSI data packets sent by the front end to the remote mirror, deduplicates the write data they carry, and then sends the deduplicated data, that is, the difference data, to the remote device. The remote device is res...
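This excerpt does not show how the difference data is computed. As a rough illustration of the general idea, the sketch below encodes a new write block against a reference block held at both ends, using fixed-size reference blocks and a byte-wise sliding window, in the spirit of rsync-style delta encoding. The block size, the operation format, and the names `make_index`, `encode_delta`, and `apply_delta` are all assumptions for illustration, not the patent's design.

```python
import hashlib

BLOCK = 64  # assumed fixed block size in bytes


def make_index(reference: bytes, block: int = BLOCK) -> dict:
    """Map the hash of each aligned fixed-size reference block to its offset."""
    index = {}
    for off in range(0, len(reference) - block + 1, block):
        index.setdefault(hashlib.sha1(reference[off:off + block]).digest(), off)
    return index


def encode_delta(new: bytes, index: dict, block: int = BLOCK) -> list:
    """Slide a window over `new`; emit ("copy", offset) on a match, else literal bytes."""
    ops, literal, i = [], bytearray(), 0
    while i < len(new):
        window = new[i:i + block]
        off = index.get(hashlib.sha1(window).digest()) if len(window) == block else None
        if off is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", off))
            i += block  # jump past the matched block
        else:
            literal.append(new[i])
            i += 1      # no match: slide the window by one byte
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def apply_delta(reference: bytes, ops: list, block: int = BLOCK) -> bytes:
    """Remote side: rebuild the new block from the reference and the operations."""
    out = bytearray()
    for kind, value in ops:
        out += reference[value:value + block] if kind == "copy" else value
    return bytes(out)


reference = bytes(range(256)) * 4
new = reference[:100] + b"XYZ" + reference[100:]
ops = encode_delta(new, make_index(reference))
literal_bytes = sum(len(v) for kind, v in ops if kind == "data")
print(f"{len(new)} bytes written, {literal_bytes} literal bytes in the delta")
```

Only the literal bytes and the small copy records would need to cross the network; the remote side rebuilds the full write with `apply_delta`.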


Abstract

The invention provides a data deduplication method based on the internet small computer system interface (iSCSI). It belongs to the technical field of computer information storage and is suitable for an internet protocol (IP) network remote mirroring system based on the iSCSI protocol. By deleting duplicate data in iSCSI write data blocks, the invention reduces bandwidth consumption and synchronization time without changing the structure of the existing IP network remote mirroring system. Deduplication proceeds in two stages. The first stage uses coarse-grained similar-chunk detection: a content-defined chunking (CDC) algorithm is combined with a Bloom filter algorithm to search for similar chunks over the full range, which makes deduplication more flexible and more accurate. The second stage uses an improved fine-grained similar-chunk detection technique that combines fixed-size partitioning with a sliding window. Deduplication is thus performed on chunks rather than files, which makes it transparent to users.
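As a minimal sketch of the first stage, the code below cuts data at content-defined boundaries, records the chunk hashes in a Bloom filter stored as an integer bit set, and scores two filters by the share of set bits they have in common. The window size, boundary mask, filter length, number of hash positions, and the bit-overlap score are all assumptions chosen for illustration. The patent's actual parameters and similarity measure do not appear in this excerpt.

```python
import hashlib
import random


def cdc_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
               min_size: int = 32, max_size: int = 1024) -> list:
    """Cut where a hash of the trailing `window` bytes has its low bits all zero."""
    chunks, start = [], 0
    for i in range(len(data)):
        size = i - start + 1
        if size < min_size:
            continue
        # Recomputed per position for clarity; real systems use a rolling hash.
        h = int.from_bytes(hashlib.md5(data[i + 1 - window:i + 1]).digest()[:4], "big")
        if (h & mask) == 0 or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def bloom_of(chunks: list, m_bits: int = 2048, k: int = 3) -> int:
    """Bloom filter as an int bit set: each chunk hash sets k positions."""
    bits = 0
    for chunk in chunks:
        digest = hashlib.sha1(chunk).digest()
        for j in range(k):
            bits |= 1 << (int.from_bytes(digest[4 * j:4 * j + 4], "big") % m_bits)
    return bits


def similarity(a: int, b: int) -> float:
    """Assumed score: shared set bits divided by all set bits."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


rng = random.Random(0)
old = bytes(rng.randrange(256) for _ in range(8192))
new = old[:4000] + b"inserted bytes" + old[4000:]        # small edit to old
other = bytes(rng.randrange(256) for _ in range(8192))   # unrelated data

s_near = similarity(bloom_of(cdc_chunks(old)), bloom_of(cdc_chunks(new)))
s_far = similarity(bloom_of(cdc_chunks(old)), bloom_of(cdc_chunks(other)))
print(f"edited copy: {s_near:.2f}, unrelated data: {s_far:.2f}")
```

Because the boundaries depend on content rather than position, an insertion typically disturbs only the chunks around it, so the edited copy keeps most of its chunk hashes and scores well above unrelated data.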

Description

Technical field

[0001] The invention belongs to the technical field of computer information storage, and in particular relates to an iSCSI-based data deduplication method suitable for an IP network remote mirroring system based on the iSCSI protocol.

Background

[0002] IP network remote mirroring systems are widely used in disaster recovery. Such a system is based on the iSCSI protocol and transmits SCSI data and commands to the disaster recovery center over the IP network to keep the local mirror and the remote mirror consistent. The system does not require a dedicated network, which greatly reduces the cost of building a disaster recovery system and also gives it good scalability: the service can be used wherever a connection to the IP network is available.

[0003] With the explosive growth of digital information, the scale of data stored in the disaster recovery system is getting larger and larger. Studies...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/08
Inventor: 肖达, 谭乐娟, 姚文斌, 王枞, 陈钊, 韩司
Owner BEIJING UNIV OF POSTS & TELECOMM