Data distribution method on basis of heterogeneous cluster

A data distribution technology for heterogeneous clusters, applied in the field of distributed computing, which addresses problems such as slowed task execution and a deteriorating network environment.

Active Publication Date: 2014-04-02
TSINGHUA UNIV
Cites: 2, Cited by: 17

AI Technical Summary

Problems solved by technology

Data movement between nodes causes substantial network overhead and worsens the network environment. At the same time, slow nodes must also send data to fast nodes, which slows the execution of their own local tasks even further.



Examples


Embodiment 1

[0057] Referring to Figures 1-4 and Figures 7-9, this embodiment provides a data distribution method based on heterogeneous clusters, which is applied to multiple communicating device nodes. In this embodiment, the device nodes include a master control node and slave control nodes. Generally, there is only one master control node, and all other device nodes are slave control nodes.

[0058] Referring to Figure 1, each device node first periodically transmits its own data block read information to the database of the master control node. In this embodiment, the data block read information includes three types: data blocks calculated locally when executing local tasks, data blocks read out by other device nodes, and data blocks read in from other device nodes.
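The reporting described in [0058] can be pictured with a minimal sketch. Everything named here (`BlockReadInfo`, `MasterDatabase`, the field names, and the in-memory storage) is a hypothetical illustration, not the patent's actual data layout or database.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BlockReadInfo:
    """One periodic report from a device node (hypothetical field names)."""
    node_id: str
    # Blocks this node processed itself while running its local tasks.
    computed_locally: List[str] = field(default_factory=list)
    # Blocks stored on this node that other nodes read out.
    read_out_by_others: List[str] = field(default_factory=list)
    # Blocks this node read in from other nodes.
    read_in_from_others: List[str] = field(default_factory=list)


class MasterDatabase:
    """In-memory stand-in for the master control node's database (assumption)."""

    def __init__(self) -> None:
        self._reports: Dict[str, List[BlockReadInfo]] = {}

    def submit(self, info: BlockReadInfo) -> None:
        """Store one report under the reporting node's id."""
        self._reports.setdefault(info.node_id, []).append(info)

    def reports_for(self, node_id: str) -> List[BlockReadInfo]:
        """Return a copy of all reports received from one node."""
        return list(self._reports.get(node_id, []))
```

A node would build a `BlockReadInfo` at each reporting interval and pass it to `submit`; the master can later call `reports_for` for each node when it needs the history.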

[0059] The method starts at step 101: the master control node judges whether the distribution period of the task has been reached and, if so, reads the data block read information of each device node from its datab...
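As a rough sketch of the check in step 101, the function below only reads the stored reports once the distribution period has elapsed. The function name, the time representation in seconds, and the `read_all_reports` callback are assumptions made for illustration; the source does not specify how the period is measured.

```python
from typing import Callable, Dict, Optional


def step_101(
    now: float,
    last_distribution: float,
    period_s: float,
    read_all_reports: Callable[[], Dict[str, list]],
) -> Optional[Dict[str, list]]:
    """Return every node's read information if the period is due, else None."""
    # Assumption: the period counts as reached once period_s seconds have passed.
    if now - last_distribution < period_s:
        return None
    # Period reached: fetch the reports from the master's database.
    return read_all_reports()
```

Returning `None` when the period has not been reached lets the caller skip the rest of the distribution round without touching the database.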

Embodiment 2

[0091] Referring to Figures 5-11, this embodiment provides a data distribution method based on AAOC, a computing-capability-aware data distribution file system, in a heterogeneous cluster. In this embodiment, the AAOC distributed file system is based on the MapReduce computing model and includes multiple device nodes: a metadata server and several data servers, where each data server is both a computing node and a storage node. The metadata server is the master control node, and the data servers are the slave control nodes.

[0092] Referring to Figure 5, the method starts at step 501: each server (the metadata server and the several data servers) runs a monitoring daemon and periodically submits its own data block read information to the database of the metadata server. The data block read information is divided into data blocks calculated locally when executing local tasks, data blocks read by other device nodes,...


Abstract

The invention provides a data distribution method based on a heterogeneous cluster, applied to a plurality of communicating device nodes. The method comprises the following steps: reading the data block read information of each device node and determining the required task data, wherein the data block read information comprises data blocks calculated locally when a local task is executed, data blocks read out by other device nodes, and data blocks read in from other device nodes; predicting the computing capability of each device node according to the data block read information; and distributing the determined task data to the local tasks of each device node according to the prediction result. By using the distribution of task data at the bottom layer to guide the scheduling of tasks at the upper layer, and by distributing the data reasonably, the invention matches the computing capability of each device node to the data distributed to it. Moreover, because the determined task data is distributed to the local tasks of each device node, the network overhead and the aggravated contention for network resources caused by remote tasks and data movement are avoided.
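The predict-then-distribute steps can be sketched as follows. The abstract does not give the prediction formula, so this sketch assumes, purely for illustration, that a node's capability is the number of blocks it processed in the last period (locally calculated plus read in from other nodes) and that task data is split in proportion to that figure using a largest-remainder rounding. All function names are hypothetical.

```python
from typing import Dict, List


def predict_capability(computed_locally: int, read_in: int) -> int:
    """Assumed estimate: total blocks the node processed in the last period."""
    return computed_locally + read_in


def distribute_blocks(
    blocks: List[str], capability: Dict[str, int]
) -> Dict[str, List[str]]:
    """Split blocks among nodes in proportion to their predicted capability."""
    total = sum(capability.values())
    if total <= 0:
        raise ValueError("at least one node must have positive capability")
    n = len(blocks)
    # Integer share per node, rounded down.
    quotas = {node: (n * cap) // total for node, cap in capability.items()}
    # Hand the blocks lost to rounding to the nodes with the largest remainders.
    by_remainder = sorted(
        capability, key=lambda node: (n * capability[node]) % total, reverse=True
    )
    leftover = n - sum(quotas.values())
    for node in by_remainder[:leftover]:
        quotas[node] += 1
    # Cut the block list into consecutive slices of the computed sizes.
    assignment: Dict[str, List[str]] = {}
    start = 0
    for node in capability:
        assignment[node] = blocks[start:start + quotas[node]]
        start += quotas[node]
    return assignment
```

Under these assumptions, a node that processed three times as many blocks as another would receive about three times as much task data for its local tasks, so less data needs to move over the network while tasks run.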

Description

Technical field

[0001] The invention relates to the technical field of distributed computing, and in particular to a data distribution method based on heterogeneous clusters.

Background

[0002] Recent research reports show that the next generation of data centers will be highly heterogeneous for reasons such as energy saving and differing price-performance ratios. In data centers there is another very important cause of heterogeneity: a data center usually adds new servers at intervals to expand its computing capacity, and servers added one or two years apart generally differ by several hardware generations. At the same time, a data center is usually shared by multiple tasks, and interference between tasks running concurrently also causes performance differences among nodes.

[0003] In the existing technology of heterogeneous clusters, in ...


Application Information

IPC(8): H04L29/08
Inventor: 杨广文, 王博, 姜进磊
Owner: TSINGHUA UNIV