A method and system for extracting multi-object label data

A multi-object tag data technology, applied in the field of multi-object tag data extraction, that addresses time-consuming and resource-intensive processing and achieves low latency, efficient storage, and shortened execution time.

Active Publication Date: 2022-04-05
北京华品博睿网络技术有限公司

AI Technical Summary

Problems solved by technology

In the prior art, the tag data of multiple objects is stored by writing all tags directly into one large wide table, which makes ETL very time-consuming. The underlying data is commonly stored in Apache Hive, but Hive does not support updating data by row and can only overwrite the full data set, so updating a tag value requires rewriting all data in the Hive table. With a large amount of data, this consumes considerable resources.
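The cost described above can be illustrated with a toy model. This is a minimal sketch, not Hive code: the table is a plain Python dictionary, and `overwrite_update` and the `user_…` keys are hypothetical names chosen for illustration. It only shows that a full overwrite writes every row even when a single tag value changes.

```python
def overwrite_update(table, changes):
    """Apply changes by rewriting the whole table.

    Returns the new table and the number of rows written, which is the
    full table size no matter how few rows actually changed.
    """
    rewritten = {**table, **changes}
    return rewritten, len(rewritten)


# Hypothetical table: 1000 objects, one tag value each.
table = {f"user_{i}": "inactive" for i in range(1000)}

# Only one tag value changes, but every row is written again.
new_table, rows_written = overwrite_update(table, {"user_7": "active"})
print(rows_written)
```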


Embodiment Construction

[0041] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0042] It should be noted that if there is a directional indication (such as up, down, left, right, front, back...) in the embodiment of the present invention, the directional indication is only used to explain the position in a certain posture (as shown in the accompanying drawing). If the specific posture changes, the directional indication will also change accordingly.

[0043] In addition, in describing the present invention, the terms used are for the purpose o...


Abstract

The embodiment of the present invention discloses a method for extracting label data of multiple objects, including: extracting metadata of the label data of multiple objects based on extraction rules, and generating a temporary table for the data extracted by each extraction rule; performing format conversion on the temporary tables, and merging the converted results of the multiple temporary tables into a large wide table; and routing the value of each label in the large wide table to different sub-tables based on a pre-built tag tree structure. The sub-tables are stored in timestamp partitions, where a timestamp partition is a partition whose value is the update time of the tag value. The embodiment of the invention also discloses a multi-object tag data extraction system. The invention enables efficient storage of a large amount of label data of multiple objects and low-latency dynamic updating of the label data.
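The steps in the abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the patented implementation: the source records, the extraction rules, the tag tree, the function names, and the partition value `"20220101"` are all hypothetical, and in-memory dictionaries stand in for temporary tables, the wide table, and partitioned sub-tables.

```python
from collections import defaultdict

# Hypothetical source records and extraction rules (one rule per label).
SOURCE = [
    {"id": "u1", "city": "Beijing", "age": 31},
    {"id": "u2", "city": "Shanghai", "age": 27},
]
RULES = {
    "city": lambda row: row["city"],
    "age_band": lambda row: "30+" if row["age"] >= 30 else "under_30",
}
# Hypothetical tag tree: sub-table name -> leaf labels routed to it.
TAG_TREE = {"geo": ["city"], "demographic": ["age_band"]}


def extract(rules, source):
    """Step 1: build one temporary table of (id, value) rows per rule."""
    return {
        label: [(row["id"], rule(row)) for row in source]
        for label, rule in rules.items()
    }


def to_wide(temp_tables):
    """Steps 2-3: convert each temporary table and merge into one wide table."""
    wide = defaultdict(dict)
    for label, rows in temp_tables.items():
        for obj_id, value in rows:
            wide[obj_id][label] = value
    return dict(wide)


def route(wide, tag_tree, update_time, store):
    """Step 4: route each label value to its sub-table, under a partition
    keyed by the update time of the tag value."""
    leaf_to_table = {
        leaf: table for table, leaves in tag_tree.items() for leaf in leaves
    }
    for obj_id, labels in wide.items():
        for label, value in labels.items():
            sub_table = store.setdefault(leaf_to_table[label], {})
            partition = sub_table.setdefault(update_time, {})
            partition.setdefault(obj_id, {})[label] = value
    return store


store = route(to_wide(extract(RULES, SOURCE)), TAG_TREE, "20220101", {})
```

In this sketch a later update only adds a new timestamp partition to the affected sub-table, so earlier partitions and unrelated sub-tables are left untouched.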

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a method and system for extracting multi-object label data.

Background technique

[0002] The tag data of objects is important data of the user portrait system and comes from different data sources. In practical applications, the tag data of multiple objects must be integrated for processing such as query and analysis, which requires storing a large amount of tag data. In the prior art, the tag data of multiple objects is stored by writing all tags directly into one large wide table, which makes ETL very time-consuming. The underlying data is commonly stored in Apache Hive, but Hive does not support updating data by row and can only overwrite the full data set, so updating a tag value requires rewriting all data in the Hive table. With a large amount of data, it consumes more r...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/22, G06F16/23, G06F16/25, G06F16/28
CPC: G06F16/2282, G06F16/258, G06F16/284, G06F16/23
Inventor: 黄景景, 徐文朝, 朱辉, 张涛, 薛延波, 赵鹏
Owner: 北京华品博睿网络技术有限公司