Load balancing and computation localization method of iterative backtracking algorithm based on HDFS (Hadoop distributed file system)

A backtracking-algorithm and load-balancing technique, applied in fields such as computing, structured data retrieval, and special data processing applications, that addresses problems affecting the performance of querying or storing data.

Active Publication Date: 2015-02-04
北京东方国信科技股份有限公司

AI Technical Summary

Problems solved by technology

The disadvantage of the MapReduce model is that its way of assigning tasks causes some unnecessary remote reads, which degrades the performance of querying or storing data.


Examples

Detailed Description of the Embodiments

[0027] Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0028] Computation localization means sending the computation to the nodes that store the data, avoiding as far as possible the transfer of data over the network to other nodes for computation, and thereby saving bandwidth. The Planner in the data analysis engine of the present invention, that is, the execution plan generator, selects which nodes perform scans and which data packets they scan according to the analysis information of the data packets and the load status of each node.
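The selection described in [0028] can be illustrated with a minimal sketch. It is not the Planner of the invention: the function name `plan_local_scans`, the input dictionaries, and the "load equals number of pending scans" measure are all assumptions made for illustration. Each data packet goes to the least-loaded live node that stores one of its replicas, and a packet is scanned remotely only when no live node holds a copy.

```python
def plan_local_scans(packet_replicas, node_load):
    """Greedy, locality-first scan assignment (illustrative only).

    packet_replicas: dict mapping packet id -> list of node addresses
                     that store a replica of that packet (hypothetical input).
    node_load:       dict mapping live node address -> current load,
                     here assumed to be a count of pending scans.
    Returns (assignment, remote_packets).
    """
    load = dict(node_load)          # work on a copy; the caller's dict is untouched
    assignment = {}
    remote_packets = []
    for packet in sorted(packet_replicas):
        # Keep only replica holders that are currently alive.
        local = [n for n in packet_replicas[packet] if n in load]
        if local:
            # Local scan: least-loaded replica holder, ties broken by address.
            target = min(local, key=lambda n: (load[n], n))
        else:
            # No live replica holder: fall back to a remote read.
            target = min(load, key=lambda n: (load[n], n))
            remote_packets.append(packet)
        assignment[packet] = target
        load[target] += 1
    return assignment, remote_packets
```

A greedy pass like this keeps reads local but does not guarantee the most even load, which is why a search over the alternative replica choices can still improve the balance.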

[0029] To make the proposed HDFS-based iterative backtracking method for load balancing and computation localization easier to understand and apply, it is described in detail below with reference to the figures.

[0030] As shown in Figure 1, the present invention provides a load balancing and computation localization method based on the iterative...

Abstract

The invention relates to a load balancing and computation localization method using an iterative backtracking algorithm based on HDFS (Hadoop Distributed File System). The method comprises the steps of: S1, the Planner reads the IP addresses and load states of all live nodes in the data analysis engine system; S2, the Planner reads the distribution information of all data packets of a table from the name node; S3, load balancing is achieved with the iterative backtracking algorithm, using the IP addresses and load states read in S1 and the data packet distribution information read in S2. While preserving computation localization as far as possible, the method achieves load balancing quickly and efficiently.
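The abstract does not spell out the backtracking procedure, so the following is only one possible reading of step S3, sketched under stated assumptions. The inputs stand in for what S1 and S2 would read (node loads and packet replica locations), the function name `balance_by_backtracking` is hypothetical, and "balanced" is assumed to mean the smallest possible maximum node load. The search uses an explicit loop instead of recursion, places each packet only on a live node that stores a replica, and abandons any partial assignment that cannot beat the best one found so far.

```python
def balance_by_backtracking(packet_replicas, node_load):
    """Iterative backtracking over local replica choices (illustrative only).

    Returns (best_assignment, best_peak), where best_peak is the smallest
    maximum node load found when every packet is scanned on a live node
    that stores one of its replicas.
    """
    packets = sorted(packet_replicas)
    # Candidate nodes per packet: live replica holders only, so locality is kept.
    options = [sorted(n for n in packet_replicas[p] if n in node_load)
               for p in packets]
    if any(not opts for opts in options):
        raise ValueError("a packet has no live node holding a replica")

    load = dict(node_load)
    choice = [-1] * len(packets)    # index into options[i]; -1 means untried
    best, best_peak = None, float("inf")
    i = 0
    while i >= 0:
        if i == len(packets):
            # Every packet is placed: record the plan if it lowers the peak.
            peak = max(load.values())
            if peak < best_peak:
                best_peak = peak
                best = {p: options[k][choice[k]] for k, p in enumerate(packets)}
            i -= 1
            continue
        if choice[i] >= 0:
            # Undo the placement made at this depth before trying the next one.
            load[options[i][choice[i]]] -= 1
        choice[i] += 1
        # Prune: skip nodes whose load would reach the best peak so far.
        while (choice[i] < len(options[i])
               and load[options[i][choice[i]]] + 1 >= best_peak):
            choice[i] += 1
        if choice[i] == len(options[i]):
            choice[i] = -1          # exhausted: backtrack to the previous packet
            i -= 1
        else:
            load[options[i][choice[i]]] += 1
            i += 1
    return best, best_peak
```

The search is exhaustive in the worst case, so a sketch like this suits small tables or a per-query subset of packets; how the patented method bounds the search is not stated in the text available here.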

Description

Technical field

[0001] The invention relates to the technical field of distributed computer databases, in particular to a load balancing and computation localization method using an iterative backtracking algorithm based on HDFS.

Background art

[0002] At present, most data analysis engines achieve load balancing by moving HDFS files, that is, by changing the physical location of the files, for example from Data Node A to Data Node B. The inventors have not found any related research on achieving load balancing at run time. The disadvantage of the MapReduce model is that its way of assigning tasks causes some unnecessary remote reads, which degrades the performance of querying or storing data.

Summary of the invention

[0003] The technical problem to be solved by the present invention is how to send the computation on a large table to the data storage nodes for execution, avoid data transmission to ot...

Application Information

IPC(8): G06F17/30, H04L29/08
CPC: G06F16/25, H04L67/1001
Inventors: 刘垚, 孔令雷, 王小玉, 霍卫平, 金正皓
Owner: 北京东方国信科技股份有限公司