Method, device and equipment of data warehouse-in

A database and data technology, applied in the field of data warehousing, that addresses two problems: the relatively fixed performance gains of existing approaches, and the high IO pressure that degrades data warehousing performance. It achieves the effects of reducing IO pressure, ensuring warehousing performance, and reducing the quantity of stored data.

Active Publication Date: 2018-10-09
HANDAN BRANCH OF CHINA MOBILE GRP HEBEI COMPANY LIMITED +1
5 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

Among them, the effects of Scheme 1 and Scheme 2 depend on the HBase product version and the number of Handlers, so the performance improvement they can deliver is relatively fixed.
The common HBase data compression algorithms mentioned in Scheme 3 are often used in production projects, but the three algorithms suit different application scenarios and each has its own limitations.
Whichever of GZIP, LZO, or Zippy/Snappy is used in engineering practice, high IO pressure on the network and disk is inevitably encountered, which degrades storage performance.
[0010] To sum up, the prior art has the following technical problem: high IO pressure on the network and disk degrades data storage performance.

Examples

Embodiment Construction

[0069] To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

[0070] In the embodiment of the present invention, a high-speed Huffman compression algorithm that includes a quick sort algorithm is first started to perform a first data compression on the written data and generate a first compressed file. A double compression algorithm is then called to perform a second data compression on the first compressed file, and HDFS data storage is realized based on the second compressed file. Taking CPU performance pressure and compression efficiency into account, the written data is compressed twice to reduce the amount of stored data, which reduces the IO pressure on the network and disk and ensures data storage performance.
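The two-pass flow in paragraph [0070] can be illustrated with a minimal Python sketch. It is not the patented implementation: every function name is hypothetical, the use of quick sort to order symbol frequencies before a two-queue Huffman tree construction is an assumption about how the sort might fit in, and `zlib` is only a stand-in for the "double compression algorithm", which the text does not specify. The HBase and HDFS writes are left out entirely.

```python
import zlib
from collections import Counter, deque


def quick_sort(pairs):
    """Sort (symbol, count) pairs by ascending count with a simple quicksort."""
    if len(pairs) <= 1:
        return pairs
    pivot = pairs[len(pairs) // 2][1]
    lower = [p for p in pairs if p[1] < pivot]
    equal = [p for p in pairs if p[1] == pivot]
    higher = [p for p in pairs if p[1] > pivot]
    return quick_sort(lower) + equal + quick_sort(higher)


def build_codes(data):
    """Build a {byte: bitstring} Huffman table from pre-sorted frequencies."""
    ordered = quick_sort(list(Counter(data).items()))
    if not ordered:
        return {}
    if len(ordered) == 1:  # a single distinct symbol still needs one bit
        return {ordered[0][0]: "0"}
    # Two-queue construction: sorted leaves in one queue, merged nodes in another.
    leaves = deque((count, {sym: ""}) for sym, count in ordered)
    merged = deque()

    def pop_smallest():
        if not merged or (leaves and leaves[0][0] <= merged[0][0]):
            return leaves.popleft()
        return merged.popleft()

    while len(leaves) + len(merged) > 1:
        w1, left = pop_smallest()
        w2, right = pop_smallest()
        node = {s: "0" + c for s, c in left.items()}
        node.update({s: "1" + c for s, c in right.items()})
        merged.append((w1 + w2, node))
    return merged[0][1]


def huffman_compress(data):
    """First pass: return (code table, padding bit count, packed payload)."""
    codes = build_codes(data)
    bits = "".join(codes[b] for b in data)
    padding = (8 - len(bits) % 8) % 8
    bits += "0" * padding
    payload = int(bits, 2).to_bytes(len(bits) // 8, "big") if bits else b""
    return codes, padding, payload


def huffman_decompress(codes, padding, payload):
    """Invert huffman_compress using the same code table."""
    if not payload:
        return b""
    bits = bin(int.from_bytes(payload, "big"))[2:].zfill(len(payload) * 8)
    if padding:
        bits = bits[:-padding]
    decode = {code: sym for sym, code in codes.items()}
    out, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in decode:
            out.append(decode[current])
            current = ""
    return bytes(out)


def two_stage_compress(data):
    """First pass Huffman, second pass zlib (stand-in, see lead-in)."""
    codes, padding, first = huffman_compress(data)
    return codes, padding, zlib.compress(first)


def two_stage_decompress(codes, padding, second):
    return huffman_decompress(codes, padding, zlib.decompress(second))


sample = b"aaaaaaaabbbbccd" * 200
codes, padding, second = two_stage_compress(sample)
print(len(sample), len(second))
print(two_stage_decompress(codes, padding, second) == sample)
```

A second pass only helps when the first pass leaves redundancy behind. Here the repetitive sample produces a periodic Huffman bitstream that a dictionary-based compressor can still shrink. On data with little repetition, the second pass may gain nothing.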

[0071] See figure 1, which is a schematic flow chart of the method for dat...

Abstract

The invention discloses a method, a device and equipment for data warehouse-in. The method includes: starting a high-speed Huffman compression algorithm, which includes a quick sort algorithm, to perform a first data compression on written data and generate a first compressed file; after the first compressed file is written into an HBase database table, calling a double compression algorithm to perform a second data compression on the first compressed file and output a second compressed file; and realizing data warehouse-in to the Hadoop Distributed File System (HDFS) on the basis of the second compressed file. With the embodiment of the invention, IO pressure on the network and disk can be alleviated, and data warehouse-in performance can be guaranteed.

Description

Technical field

[0001] The invention relates to the field of computers, in particular to a method, device and equipment for data storage.

Background

[0002] According to many years of research by the International Data Corporation (IDC), the global data volume doubles approximately every 2 years, so the annual data volume increases exponentially. The data growth rate conforms to Moore's law. It is estimated that by 2020 the global total data volume will reach 35 ZB. How to effectively collect, load, analyze and process this massive data has become an important link in, and the basis for, big data applications.

[0003] To process massive data rapidly, an important prerequisite is to store massive data rapidly. However, as the volume of data grows, the pressure on the IO performance of the network and disk during data storage increases sharply. In the existing network And under the performance ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, H03M7/40
CPC: H03M7/4012
Inventor: 张琳, 冯明
Owner: HANDAN BRANCH OF CHINA MOBILE GRP HEBEI COMPANY LIMITED