Application-aware data routing method and system for large-scale cluster deduplication

An application-aware data routing technology, classified under redundancy in computing for data error detection, transmission systems, and electrical digital data processing. It addresses the problems of high communication overhead, unbalanced system load, and a low data deduplication rate, achieving low communication overhead, reduced data overlap, and a high data deduplication rate.

Active Publication Date: 2017-02-22
PLA UNIV OF SCI & TECH
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] In short, the existing technology has the following problems: when deduplicating for data-center clusters with hundreds of nodes, it suffers from a low data deduplication rate, low node throughput, high system communication overhead, and unbalanced system load.


Examples

Embodiment Construction

[0032] As shown in Figure 1, the large-scale backup storage cluster system of the present invention includes multiple backup clients 100, a backup server 200, and multiple deduplication storage servers 300.

[0033] The backup client 100 sends a file backup request message containing file meta information, such as the file's name, user, and size, to the backup server 200, and sends each file to the corresponding routing-target deduplication storage server 300 node.

[0034] Each backup client 100 includes a file I/O module 101 and a backup request module 102. The backup request module 102 conducts the file backup session with the backup server 200, and the file I/O module 101 backs up each file to the corresponding deduplication storage server 300 according to the file routing decision returned by the backup server 200.
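The division of work between the two client modules can be sketched as follows. This is a minimal illustration rather than the patented implementation: the class names, the `ask_server` callback standing in for the network call to backup server 200, and the in-memory lists standing in for storage servers 300 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class BackupRequest:
    """File meta information sent to the backup server: name, user, size."""
    name: str
    user: str
    size: int


class BackupRequestModule:
    """Plays the role of module 102: runs the backup session with the server."""

    def __init__(self, ask_server: Callable[[BackupRequest], int]):
        # ask_server is a hypothetical stand-in for the call to backup server 200;
        # it returns the id of the routing-target storage node.
        self._ask_server = ask_server

    def request_route(self, request: BackupRequest) -> int:
        return self._ask_server(request)


class FileIOModule:
    """Plays the role of module 101: ships the file to the chosen storage node."""

    def __init__(self, nodes: Dict[int, List[bytes]]):
        # Each storage node is modelled as an in-memory list of received payloads.
        self._nodes = nodes

    def send(self, node_id: int, payload: bytes) -> None:
        self._nodes[node_id].append(payload)


class BackupClient:
    """Combines both modules: ask for a route first, then send the file."""

    def __init__(self, request_module: BackupRequestModule, io_module: FileIOModule):
        self._request_module = request_module
        self._io_module = io_module

    def backup(self, name: str, user: str, payload: bytes) -> int:
        request = BackupRequest(name=name, user=user, size=len(payload))
        node_id = self._request_module.request_route(request)
        self._io_module.send(node_id, payload)
        return node_id
```

Only the meta information travels to the backup server in this sketch; the file payload goes directly from the client to the storage node, which is one way to read the low communication overhead the text claims.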

[0035] The backup server 200 is used to perceive the application type of the backup file according to the file me...

Abstract

The invention discloses an application-aware data routing method for large-scale cluster deduplication and a large-scale backup storage cluster system. The method comprises the steps of (S10) obtaining the meta information of the backup file, (S20) identifying the file's application type, (S30) calculating the loads of the deduplication storage nodes, (S40) selecting the routing node for the file, (S50) sending the file to the target node, and (S60) deduplicating the file within the node. The large-scale backup storage cluster system comprises a plurality of backup clients, a backup server, and a plurality of deduplication storage servers. The data routing method and the system offer a high data deduplication rate, high node throughput, low system communication overhead, and balanced system load.
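Steps S10 to S60 can be strung together in a small sketch. The abstract does not say how the application type is identified, how load is measured, or how the target node is chosen, so everything concrete here is an assumption made for illustration: the extension table `APP_TYPES`, unique bytes stored as the load metric, the rule "prefer nodes that already hold this application type, then pick the least loaded", and whole-file SHA-256 fingerprints for in-node deduplication.

```python
import hashlib
import os
from collections import defaultdict

# Hypothetical extension-to-application-type table used for S20.
APP_TYPES = {".doc": "office", ".xls": "office", ".jpg": "image", ".vmdk": "vm"}


def sense_app_type(file_name: str) -> str:
    """S20: guess the application type from the file extension (assumed rule)."""
    ext = os.path.splitext(file_name)[1].lower()
    return APP_TYPES.get(ext, "other")


class DedupNode:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.stored_bytes = 0      # assumed load metric: unique bytes stored
        self.fingerprints = set()  # node-local fingerprint index

    def store(self, payload: bytes) -> bool:
        """S60: whole-file dedup inside the node. True if new data was written."""
        fp = hashlib.sha256(payload).hexdigest()
        if fp in self.fingerprints:
            return False
        self.fingerprints.add(fp)
        self.stored_bytes += len(payload)
        return True


class BackupServer:
    def __init__(self, nodes):
        self.nodes = nodes
        self.app_nodes = defaultdict(set)  # app type -> ids of nodes holding it

    def route(self, meta: dict) -> DedupNode:
        app = sense_app_type(meta["name"])                        # S20
        loads = {n.node_id: n.stored_bytes for n in self.nodes}   # S30
        holders = [n for n in self.nodes if n.node_id in self.app_nodes[app]]
        candidates = holders or self.nodes
        # S40: least-loaded candidate, ties broken by node id.
        target = min(candidates, key=lambda n: (loads[n.node_id], n.node_id))
        self.app_nodes[app].add(target.node_id)
        return target


def backup(server: BackupServer, name: str, user: str, payload: bytes):
    meta = {"name": name, "user": user, "size": len(payload)}  # S10
    node = server.route(meta)
    written = node.store(payload)                              # S50 and S60
    return node.node_id, written
```

This toy policy pins each application type to the first node that receives it, which keeps similar files together but would not balance load on its own. A real system needs a rule for spreading a type across more nodes as load grows, and the excerpt does not specify one.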

Description

Technical field

[0001] The invention belongs to the technical field of information storage and cluster computing, and in particular relates to an application-aware data routing method for large-scale cluster deduplication and a large-scale backup storage cluster system.

Background

[0002] In many backup storage systems that manage massive data, the data is highly redundant. Cluster deduplication performs distributed, parallel data deduplication on a backup storage server cluster, which can meet the capacity and performance scalability needs of massive backup data management. In building energy-saving, environmentally friendly, and efficient green data centers, cluster deduplication has become a core technology of current data center storage management.

[0003] To limit system overhead, cluster deduplication often adopts a loosely coupled design and does not perform cross-node data deduplication. The data sent by...
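The consequence of a loosely coupled design without cross-node deduplication can be shown with a short sketch. It assumes whole-file SHA-256 fingerprints and a hypothetical `cluster_unique_bytes` helper; the point is only that a duplicate is eliminated when both copies are routed to the same node and is stored twice when they are not.

```python
import hashlib


class Node:
    """A storage node that deduplicates only against its own index."""

    def __init__(self):
        self.index = set()     # node-local fingerprints, never shared across nodes
        self.unique_bytes = 0

    def write(self, data: bytes) -> None:
        fp = hashlib.sha256(data).digest()
        if fp not in self.index:
            self.index.add(fp)
            self.unique_bytes += len(data)


def cluster_unique_bytes(assignments, node_count: int) -> int:
    """Total bytes physically stored after routing (node_id, data) pairs."""
    nodes = [Node() for _ in range(node_count)]
    for node_id, data in assignments:
        nodes[node_id].write(data)
    return sum(n.unique_bytes for n in nodes)


# Both copies on node 0: the second copy is recognised as a duplicate.
same_node = cluster_unique_bytes([(0, b"abcd"), (0, b"abcd")], node_count=2)

# Copies split across nodes 0 and 1: neither node can see the other's index.
split_nodes = cluster_unique_bytes([(0, b"abcd"), (1, b"abcd")], node_count=2)
```

Here `same_node` is 4 bytes and `split_nodes` is 8 bytes, which is why the routing decision largely determines the cluster-wide deduplication rate in this kind of design.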

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F11/14, H04L29/08
CPC: G06F11/1453, G06F16/174, H04L45/44
Inventor: 付印金, 胡谷雨, 倪桂强, 谢钧
Owner: PLA UNIV OF SCI & TECH