A parallel distributed big data architecture construction method and system

A construction method applying big data technology, in the field of construction methods and systems for parallel distributed big data architectures. It addresses problems such as complexity, unsuitability for large-scale data queries, and difficulty guaranteeing data consistency, with the effect of ensuring load balance and improving data query capability.

Active Publication Date: 2022-07-12
南京翌淼信息科技有限公司
Cites: 7; Cited by: 0

AI Technical Summary

Problems solved by technology

Common parallel distributed architectures include the HDFS system, the HBase system, and the MapReduce distributed computing framework. The HDFS system offers high fault tolerance and high scalability, but it is difficult to guarantee data consistency. The HBase system supports massive data writes but is not suitable for large-scale data queries. The MapReduce distributed computing framework makes it possible to develop parallel, distributed applications and reuse large-scale computing resources without knowing the underlying details of the distributed system, but it is too low-level: even a simple query requires writing map and reduce functions, which is complex and time-consuming.
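To illustrate the last point, here is a minimal single-process sketch of what a simple "count rows per key" query looks like when written as map and reduce functions. It does not use Hadoop; `run_mapreduce`, `map_fn`, `reduce_fn`, and the `region` field are hypothetical names chosen for this example only.

```python
from collections import defaultdict

def map_fn(record):
    """Map step: emit one (key, value) pair per input record."""
    yield record["region"], 1

def reduce_fn(key, values):
    """Reduce step: collapse all values emitted for one key into a count."""
    return key, sum(values)

def run_mapreduce(records, mapper, reducer):
    """Single-process stand-in for the grouping (shuffle) a real framework performs."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())

# Hypothetical input records for illustration.
records = [{"region": "east"}, {"region": "west"}, {"region": "east"}]
counts = run_mapreduce(records, map_fn, reduce_fn)
```

In a declarative query language the same result is a one-line grouped count, which is the gap in convenience the paragraph above describes.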


Examples


Embodiment 1

[0030] Referring to Figures 1 to 2, the first embodiment of the present invention provides a method for constructing a parallel distributed big data architecture, including:

[0031] S1: The grid is established by the grid unit 100, and the data is sequentially stored in the grid according to the time stamp.

[0032] (1) Create a grid.

[0033] Define a total of n*m data items, and calculate the average density ρ of the data nodes in each layer;

[0034] According to the average density ρ, divide the data area of each layer into grids, and judge whether the density ρ_i of the data nodes in a grid is close to the average density ρ, where |ρ_i - ρ| ≥ 0.01 is defined as not close. If it is not close, divide the grid according to the area of the data nodes;

[0035] If it is close, continue to divide the data area of each layer according to the average density ρ.

[0036] Here, n is the number of data layers, m is the number of nodes in each layer of data, and ρ_i represents the i-th d...
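The closeness test and the timestamp-ordered storage in step S1 can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not show how density is computed, so `average_density` assumes "nodes per unit area", and the function names and the `(timestamp, payload)` record shape are hypothetical. Only the 0.01 threshold and the two division branches come from the text.

```python
CLOSE_TOL = 0.01  # from the text: |rho_i - rho| >= 0.01 counts as "not close"

def average_density(node_count, area):
    """Assumed definition (not given in the excerpt): nodes per unit area of a layer."""
    return node_count / area

def is_close(rho_i, rho, tol=CLOSE_TOL):
    """A grid density is close to the average when the gap is below the threshold."""
    return abs(rho_i - rho) < tol

def division_rule(rho_i, rho):
    """Pick the branch described in [0034] and [0035]."""
    if is_close(rho_i, rho):
        return "by_average_density"  # keep dividing by the average density
    return "by_node_area"            # divide by the area of the data nodes

def store_by_timestamp(records):
    """Order (timestamp, payload) records by timestamp before storing them in a grid."""
    return sorted(records, key=lambda record: record[0])

# Hypothetical layer: 50 nodes over an area of 100 units.
rho = average_density(50, 100.0)
rule_near = division_rule(0.505, rho)
rule_far = division_rule(0.60, rho)
ordered = store_by_timestamp([(3.0, "c"), (1.0, "a"), (2.0, "b")])
```

Keeping the threshold in one named constant makes it easy to see where the 0.01 from the text enters the decision.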

Embodiment 2

[0091] Different from the first embodiment, this embodiment provides a parallel distributed big data architecture construction system, including:

[0092] The grid unit 100 is used to establish a grid, and store the data into the grid in sequence according to the time stamp;

[0093] The computing unit 200, connected with the grid unit 100, is used for computing the grid data correlation and the data node sampling time interval, wherein the grid data correlation includes the grid spatial correlation C_N, the data acquisition time correlation D_T, and the data collection location correlation D_L;

[0094] The transmission unit 300 is connected to the grid unit 100 and the computing unit 200 respectively, and is used for allocating grid data to the data storage unit 400;

[0095] The data storage unit 400 is used for storing grid data; the data storage unit 400 includes a RAC database and a Teradata database.
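How the four units connect can be sketched with plain Python classes. This is a rough structural illustration only: the class and method names are hypothetical, the `correlation` score is a placeholder because the formulas for the correlations are not part of this excerpt, and the least-loaded allocation policy is an assumption rather than the patented method.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Record = Tuple[float, str]  # hypothetical record shape: (timestamp, payload)

@dataclass
class GridUnit:  # corresponds to grid unit 100
    cells: Dict[int, List[Record]] = field(default_factory=dict)

    def store(self, cell_id: int, record: Record) -> None:
        cell = self.cells.setdefault(cell_id, [])
        cell.append(record)
        cell.sort(key=lambda r: r[0])  # keep each cell in timestamp order

@dataclass
class ComputingUnit:  # corresponds to computing unit 200
    grid: GridUnit

    def correlation(self, cell_id: int) -> float:
        # Placeholder score (record count); the real correlation formulas are not shown here.
        return float(len(self.grid.cells.get(cell_id, [])))

@dataclass
class DataStorageUnit:  # corresponds to data storage unit 400
    stores: Dict[str, List[Record]] = field(
        default_factory=lambda: {"RAC": [], "Teradata": []}
    )

@dataclass
class TransmissionUnit:  # corresponds to transmission unit 300
    grid: GridUnit
    computing: ComputingUnit
    storage: DataStorageUnit

    def allocate(self) -> None:
        # Assumed policy: visit cells by descending score, send each to the least-loaded store.
        for cell_id in sorted(self.grid.cells, key=self.computing.correlation, reverse=True):
            target = min(self.storage.stores, key=lambda name: len(self.storage.stores[name]))
            self.storage.stores[target].extend(self.grid.cells[cell_id])

grid = GridUnit()
grid.store(0, (2.0, "b"))
grid.store(0, (1.0, "a"))
grid.store(1, (3.0, "c"))
storage = DataStorageUnit()
TransmissionUnit(grid, ComputingUnit(grid), storage).allocate()
```

The in-memory lists stand in for the RAC and Teradata databases named in [0095]; no real database client is involved.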

[0096] It should be appreciated that embodiments of the present invention may be i...


Abstract

The invention discloses a method and system for constructing a parallel distributed big data architecture. The method comprises: establishing a grid through the grid unit, and sequentially storing data into the grid according to time stamps; calculating the grid data correlation and the data node sampling time interval through the computing unit; and distributing the grid data to the data storage unit through the transmission unit according to the grid data correlation. By planning transmission paths and allocating data storage space, the invention ensures the load balance of data nodes and greatly improves data query capability.

Description

Technical field

[0001] The present invention relates to the technical field of parallel data processing, in particular to a method and system for constructing a parallel distributed big data architecture.

Background

[0002] Big data is a collection of data that is distributed across the disk space of multiple individual nodes in a cluster and can be processed in a distributed manner. The scale of big data can keep expanding as the number of nodes increases.

[0003] Today we are surrounded by massive amounts of data, and extracting valuable information from it requires a distributed infrastructure that keeps the underlying details transparent to the user. Common parallel distributed architectures include the HDFS system, the HBase system, and the MapReduce distributed computing framework. The HDFS system has the advantages of high fault tolerance and high scalability, but it is difficult to ensure data consistency; althou...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L67/1097, H04L67/568, H04L67/1001, G06F16/27
CPC: H04L67/1097, G06F16/27, H04L67/1001, H04L67/568, Y02D10/00
Inventor: 张蒙蒙赵祥柯静潘丽君
Owner: 南京翌淼信息科技有限公司