Method and device for detecting hive data table

A data table and data table update technology in the computer field. It addresses problems such as detection lag, wrong results when checking the target table, and the inability to monitor feedback information in real time, and thereby removes the lag in detection.

Active Publication Date: 2018-12-07
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1
4 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

However, data accuracy detection is actually simple SQL logic. Under MapReduce processing, it is still necessary to first calculate how many map tasks the SQL will generate and only then perform the actual computation. Calculating the number of map tasks takes additional time, so efficiency is very low.
[0013] (2) Detection lags
[0014] At present, judging data accuracy through Hive SQL requires waiting until the data program has finished executing; the judgment can only be made after the result has been inserted into the target table. Feedback information cannot be monitored in real time during execution.
[0015] (3) Errors caused by data files cannot be located
[0016] Sometimes the inserted data is found to be correct, but the result of checking the target table is wrong. This is often because some erroneous data files have been stored, which makes the final table result wrong. However, this kind of problem cannot be diagnosed through Hive SQL alone.

Detailed Description of the Embodiments

[0040] Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0041] In order to enable those skilled in the art to better understand the present invention, some terms are now explained as follows.

[0042] Hadoop: Hadoop is a distributed system infrastructure developed by the Apache Foundation. The core of the framework's design is HDFS and MapReduce. HDFS provides storage for massive data, and MapReduce provides calculation for mas...

Abstract

The embodiment of the invention provides a method and device for detecting a hive data table, relates to the field of computer technology, and can quickly and accurately perform data uniqueness detection on the hive data table. The method comprises the following steps: establishing a configuration file for the hive data table under test, wherein the configuration information comprises a table name and a primary key; constructing a data file that counts how many times each primary key occurs in the hive data table, wherein the data file is of key-value type, the primary key of a data record is used as the key, and the number of occurrences of that primary key in the hive data table is used as the value; updating the data file whenever a new data record is observed being inserted into the hive data table; and, when any value in the data file is greater than 1, sending a first alarm message warning of data duplication.
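The steps in the abstract can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the names `TableConfig`, `PrimaryKeyMonitor`, `on_insert`, and `duplicates` are hypothetical, the key-value data file is modeled as an in-memory dict, records are plain dicts, and the alarm is any callable that accepts a message string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class TableConfig:
    """Hypothetical stand-in for the configuration file (table name, primary key)."""
    table_name: str
    primary_key: Tuple[str, ...]


@dataclass
class PrimaryKeyMonitor:
    """Counts primary-key occurrences; the 'data file' is assumed to be a dict here."""
    config: TableConfig
    alert: Callable[[str], None]
    counts: Dict[tuple, int] = field(default_factory=dict)

    def on_insert(self, record: dict) -> None:
        # Key: the record's primary-key values; value: how often that key was seen.
        key = tuple(record[col] for col in self.config.primary_key)
        self.counts[key] = self.counts.get(key, 0) + 1
        # A value greater than 1 means the primary key is duplicated.
        if self.counts[key] > 1:
            self.alert(f"duplicate primary key {key} in {self.config.table_name}")

    def duplicates(self) -> List[tuple]:
        """Return every primary key whose count is greater than 1."""
        return [key for key, count in self.counts.items() if count > 1]


if __name__ == "__main__":
    monitor = PrimaryKeyMonitor(TableConfig("orders", ("order_id",)), print)
    for row in [{"order_id": 1}, {"order_id": 2}, {"order_id": 1}]:
        monitor.on_insert(row)
```

Because the count is updated on each insert, the duplicate is reported as soon as the third record arrives rather than after the whole load has finished, which is the lag the summary describes.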

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method and device for detecting a hive data table.

Background

[0002] In the era of big data, data analysis and data application are already very common. Both depend on big data development, which today uses the Hadoop architecture, with data stored on the distributed file system HDFS (Hadoop Distributed File System). Day-to-day data development either converts SQL into MapReduce through hive or uses MapReduce directly for data processing, which is very different from developing on a relational database. Hive uses the SQL query language HQL. The biggest difference between HQL and database SQL is that database SQL supports data updates, while HQL does not. That is, HQL cannot update or delete data, and can only use insert to achieve update and delete indirectly. This di...
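The insert-only constraint described above can be illustrated with a small Python sketch. It assumes the insert replaces the table's previous contents (an overwrite-style insert, which the text does not spell out); the helper names `overwrite_with_update` and `overwrite_with_delete` are hypothetical, and a table is modeled as a list of dicts.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def overwrite_with_update(rows: List[Row],
                          matches: Callable[[Row], bool],
                          changes: Row) -> List[Row]:
    """Emulate UPDATE: build the full new contents with the change applied."""
    return [{**row, **changes} if matches(row) else dict(row) for row in rows]


def overwrite_with_delete(rows: List[Row],
                          matches: Callable[[Row], bool]) -> List[Row]:
    """Emulate DELETE: build the full new contents without the matching rows."""
    return [dict(row) for row in rows if not matches(row)]


if __name__ == "__main__":
    table = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
    # The whole table is rebuilt and written back, even though one row changed.
    table = overwrite_with_update(table, lambda r: r["id"] == 2, {"status": "paid"})
    print(table)
```

Every "update" rewrites the full set of rows, so a mistake in the rewrite can silently introduce duplicate or wrong records, which is why a uniqueness check on the result matters.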

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/07, G06F17/30
CPC: G06F11/0727, G06F11/0751
Inventor: 何林艳
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD