Method and device for detecting hive data tables

A technique for detecting and updating hive data tables, applied in the computer field. It addresses problems such as detection lag, wrong results when checking the target table, and the inability to monitor feedback information in real time.

Active Publication Date: 2022-04-12
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1

AI Technical Summary

Problems solved by technology

However, data accuracy detection is actually simple SQL logic. Under MapReduce processing, it is still necessary to first calculate how many map tasks the SQL will generate and only then perform the real computation. Calculating the number of map tasks takes considerable time, so efficiency is very low.
[0013] (2) There is a lag
[0014] The current way of judging data accuracy through hive SQL must wait until the data program has finished executing; the judgment can only be made after the result has been inserted into the target table. Feedback information cannot be monitored in real time during execution.
[0015] (3) Errors caused by data files cannot be located
[0016] Sometimes the inserted data is found to be correct, but the result of checking the target table is wrong. This is often because some erroneous data files are stored, which makes the final table result wrong. However, this kind of problem cannot be diagnosed through hive SQL alone.


Detailed Description of Embodiments

[0040] Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0041] In order to enable those skilled in the art to better understand the present invention, some terms are now explained as follows.

[0042] Hadoop: Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. The core of the framework is HDFS and MapReduce. HDFS provides storage for massive data, and MapReduce provides calculation for mas...


Abstract

Embodiments of the present invention provide a method and device for detecting hive data tables. They relate to the field of computer technology and can quickly and accurately check the data uniqueness of a hive data table. The method comprises: setting up a configuration file for the hive data table to be tested, where the configuration information includes a table name and a primary key; constructing a data file that counts the number of occurrences of each primary key in the hive data table, where the data file is of key-value type, with the primary key of a data record as the key and the number of occurrences of that primary key in the table as the value; updating the data file when a new data record is detected being inserted into the hive data table; and, if any value in the data file is greater than 1, issuing a first alarm message to warn of data duplication.
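The steps in the abstract can be illustrated with a minimal in-memory sketch. This is not the patent's implementation: the names `TableConfig`, `KeyCountFile`, and `on_insert`, the example table `orders`, and the use of a Python `Counter` in place of a persisted key-value data file are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TableConfig:
    """Hypothetical stand-in for the configuration file: table name and primary key."""
    table_name: str
    primary_key: Tuple[str, ...]  # names of the primary-key columns


@dataclass
class KeyCountFile:
    """Hypothetical stand-in for the key-value data file: primary key -> occurrence count."""
    config: TableConfig
    counts: Counter = field(default_factory=Counter)

    def on_insert(self, record: Dict[str, object]) -> List[str]:
        """Update the count for a newly inserted record and return any alarm messages."""
        key = tuple(record[col] for col in self.config.primary_key)
        self.counts[key] += 1
        alarms = []
        # A value greater than 1 means the primary key is duplicated.
        if self.counts[key] > 1:
            alarms.append(
                f"duplicate primary key {key} in table "
                f"{self.config.table_name}: seen {self.counts[key]} times"
            )
        return alarms


# Usage: simulate three inserts, the third repeating an existing key.
config = TableConfig(table_name="orders", primary_key=("order_id",))
key_counts = KeyCountFile(config)
for row in [{"order_id": 1}, {"order_id": 2}, {"order_id": 1}]:
    for message in key_counts.on_insert(row):
        print(message)
```

Because the count is updated on each insert, a duplicate can be reported as soon as it arrives instead of after the whole job has finished, which is the lag the summary describes.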

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method and device for detecting a hive data table.

Background

[0002] In the era of big data, data analysis and data application are already very common. Both are inseparable from big data development. Big data development now uses the Hadoop architecture, and data is stored on the distributed file system HDFS (Hadoop Distributed File System). Daily data development either converts SQL into MapReduce through hive or uses MapReduce directly for data processing, which is very different from developing on a relational database. Hive uses the SQL-like query language HQL. The biggest difference between HQL and database SQL is that database SQL supports data updates, while HQL does not. That is, HQL cannot update or delete data, and can only use insert to achieve update and delete indirectly. This di...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F11/07, G06F16/242
CPC: G06F11/0727, G06F11/0751
Inventor: 何林艳
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD