
Kudu database data equalization system based on size and implementation method

A technology for balancing systems and databases, applied in the field of databases, to achieve a wide range of applications

Active Publication Date: 2020-05-12
INSPUR SOFTWARE CO LTD

AI Technical Summary

Problems solved by technology

While the cluster is running normally, the placement of a created table across the Tablet Servers does not change. As data is written, the amount of data held by each Tablet Server can therefore become unbalanced, which leads to storage hotspots.



Examples


Embodiment 1

[0060] The size-based Kudu database data balancing system of the present invention includes:

[0061] A data balance condition detection module, which detects whether a data balancing operation should be performed. It works as follows:

[0062] (1) Determine whether a migration task is in progress:

[0063] a. If a migration task is in progress, skip to step (4);

[0064] b. If no migration task is in progress, perform step (2);

[0065] (2) Calculate the difference in disk usage between the node that currently occupies the most disk space and the node that occupies the least, and determine whether the difference exceeds the threshold. The threshold is a configured value (for example, 20%) that represents the maximum allowed data skew, where data skew is the difference in data size between the largest and smallest nodes by disk usage. It can be set freely according to the specific conditions of the disks.

[0066] a. If the difference does not exceed the ...
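
The check in steps (1) and (2) can be sketched in a few lines of Python. This is only an illustration, not the patent's implementation: the function name, the input mapping, and in particular the choice to measure skew relative to the fullest node are assumptions, since the text does not say what the 20% is a percentage of.

```python
from typing import Mapping


def should_start_balancing(
    disk_usage_bytes: Mapping[str, int],
    threshold: float = 0.20,
    migration_in_progress: bool = False,
) -> bool:
    """Return True when a new balancing round should go on to pick a tablet.

    disk_usage_bytes maps a node name to the disk space it occupies
    (hypothetical input shape, not taken from the patent).
    """
    # Step (1): an ongoing migration means no new round is started.
    if migration_in_progress:
        return False

    # With fewer than two nodes there is nothing to balance.
    if len(disk_usage_bytes) < 2:
        return False

    largest = max(disk_usage_bytes.values())
    smallest = min(disk_usage_bytes.values())
    if largest == 0:
        return False  # empty cluster, no skew to measure

    # Step (2). Assumption: skew is expressed as a fraction of the
    # fullest node's usage so it can be compared with a value like 0.20.
    skew = (largest - smallest) / largest
    return skew > threshold
```

For example, `should_start_balancing({"ts1": 100, "ts2": 70})` reports that balancing is needed under this definition of skew, while the same call with `migration_in_progress=True` does not.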

Embodiment 2

[0080] As shown in Figure 1, the size-based Kudu database data balancing method of the present invention proceeds as follows:

[0081] S1. Obtain from the cache the Table that is being migrated;

[0082] S2. Use the data balance condition detection module to determine whether a migration task is being executed:

[0083] a. If a migration task is in progress, go to step S10;

[0084] b. If no migration task is in progress, execute step S3;

[0085] S3. Calculate the difference in disk usage between the node that currently occupies the most disk space and the node that occupies the least, and determine whether the difference exceeds the threshold:

[0086] a. If the difference does not exceed the threshold, jump to step S10;

[0087] b. If the difference exceeds the threshold, execute step S4;

[0088] S4. Obtain the source host with the largest disk usage;

[0089] S5. Obtain the largest Table of the source host, and use the tablet selection mo...
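
Steps S4 and S5 amount to two maximum lookups. The sketch below shows one way to express them, assuming per-host disk usage and per-host table sizes are already available as dictionaries. The function name and both input shapes are hypothetical, and the tablet selection that follows S5 is not covered because that part of the text is cut off.

```python
from typing import Mapping, Tuple


def pick_source_host_and_table(
    disk_usage_bytes: Mapping[str, int],
    table_bytes_by_host: Mapping[str, Mapping[str, int]],
) -> Tuple[str, str]:
    """Return (source_host, largest_table_on_that_host).

    Both arguments are illustrative inputs: host -> bytes used, and
    host -> {table name -> bytes that table occupies on the host}.
    """
    # S4: the source host is the one with the largest disk usage.
    source_host = max(disk_usage_bytes, key=lambda h: disk_usage_bytes[h])

    # S5: among the tables stored on that host, take the largest one.
    # max() raises ValueError if the host holds no tables.
    tables = table_bytes_by_host[source_host]
    largest_table = max(tables, key=lambda t: tables[t])

    return source_host, largest_table
```

Ties are broken by dictionary order here, which is a simplification. A real balancer would probably want a deterministic tie-break rule.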



Abstract

The invention discloses a size-based data balancing system for the Kudu database and a method for implementing it, and relates to the database field. The technical problem to be solved is how to balance Kudu database data based on size. The system comprises a data balance condition detection module, a to-be-migrated Tablet selection module, and a Tablet migration execution module. The data balance condition detection module detects whether a data balancing operation should be performed; the to-be-migrated Tablet selection module selects the Tablet to migrate and the destination node; and the Tablet migration execution module performs the actual data migration. The invention also discloses a size-based data balancing method for the Kudu database.
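
The division of labor between the three modules can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `MigrationPlan` fields, the callable signatures, and the `balance_once` name do not come from the patent, and the modules are passed in as plain functions so the sketch stays independent of any Kudu API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class MigrationPlan:
    """Hypothetical result of the selection module."""
    tablet_id: str
    source_node: str
    target_node: str


def balance_once(
    should_balance: Callable[[], bool],
    select: Callable[[], Optional[MigrationPlan]],
    execute: Callable[[MigrationPlan], None],
) -> Optional[MigrationPlan]:
    """Run one balancing round and return the executed plan, if any."""
    # Detection module: decide whether balancing is needed at all.
    if not should_balance():
        return None

    # Selection module: choose the tablet and the node to move it to.
    plan = select()
    if plan is None:
        return None  # nothing suitable to move

    # Execution module: perform the actual migration.
    execute(plan)
    return plan
```

Keeping the three steps behind separate callables mirrors the module split in the abstract and makes each one replaceable in tests.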

Description

Technical field

[0001] The invention relates to the field of databases, in particular to a size-based data balancing system for the Kudu database and a method for implementing it.

Background

[0002] The Hadoop ecosystem has many components, each with a different function. In practice, users often need to deploy several Hadoop tools together to solve a single problem. For example, users rely on HBase's fast inserts and fast random reads to import data, and on HDFS/Parquet with Impala or Hive to query and analyze very large data sets. Many companies have successfully deployed a hybrid HDFS/Parquet + HBase architecture. However, this architecture is complex, difficult to maintain, and introduces data delays. Large-scale structured storage calls for a simple architecture that stores structured data, combines the fast import and fast query of HBase with the analysis of very large data sets offered by Parquet, and solves the data delay proble...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/21, G06F9/50
CPC: G06F16/214, G06F9/5088, Y02D10/00
Inventor: 邓光超, 李朝铭
Owner: INSPUR SOFTWARE CO LTD