A data quality detection method and system based on multi-dimensional tags

A data quality detection method, applied in the field of data processing, that addresses problems such as poor accuracy and weak timeliness, and achieves the effects of improving data quality, improving timeliness, and reducing dirty data.

Active Publication Date: 2022-05-31
XIAMEN MEIYA PICO INFORMATION

AI Technical Summary

Problems solved by technology

[0004] The purpose of this application is to propose a data quality detection method and system based on multi-dimensional tags, in order to solve the problems of poor accuracy and weak timeliness caused by fixed detection rule templates.



Examples


Embodiment Construction

[0044] The application is described in further detail below with reference to the accompanying drawings and embodiments. Understandably, what is described here

[0046] FIG. 1 shows a flowchart of a method for detecting data quality based on multi-dimensional tags in an embodiment of the present application.

[0056] Step 201: Data item type identification. For all kinds of massive data accessed by big data systems, for different types

[0057] Step 202: Determine similarity. Determine whether the data item is similar to a data item in the detection rule base, if

[0058] Step 203: Multi-dimensional label analysis. Label data items of known types along different dimensions, according to the

[0059] Step 204: Perform quality inspection.

[0060] Step 205: Recommend a detection engine. Using the rule similarity evaluation algorithm for data items of unknown types, the unknown

[0061] Step 206: Verify the detection result. Verify that the quality ...
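The branching in steps 201 to 204 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the rule base contents, the tag names (`non_empty`, `digits_only`), the field-name aliases, and the functions `identify_type`, `inspect`, and `detect` are all hypothetical, chosen only to show how dimension tags attached to a known type could select which checks run.

```python
# Hypothetical detection rule base: known data item type -> dimension tags.
KNOWN_TYPE_TAGS = {
    "phone_number": {"non_empty", "digits_only"},
    "person_name": {"non_empty"},
}

# Hypothetical checks, one per dimension tag.
TAG_CHECKS = {
    "non_empty": lambda v: v.strip() != "",
    "digits_only": lambda v: v.isdigit(),
}

# Hypothetical field-name aliases used for toy type identification.
FIELD_ALIASES = {
    "phone": "phone_number",
    "mobile": "phone_number",
    "name": "person_name",
}


def identify_type(field_name):
    """Step 201 (toy version): map a field name to a data item type, or None."""
    return FIELD_ALIASES.get(field_name)


def inspect(values, tags):
    """Steps 203-204: run only the checks selected by the dimension tags.

    Returns the number of failing values per tag.
    """
    failures = {}
    for tag in sorted(tags):
        check = TAG_CHECKS[tag]
        failures[tag] = sum(1 for v in values if not check(v))
    return failures


def detect(field_name, values):
    """Route a data item down the known-type or unknown-type path (step 202)."""
    item_type = identify_type(field_name)
    if item_type in KNOWN_TYPE_TAGS:
        tags = KNOWN_TYPE_TAGS[item_type]
        return {"path": "known", "type": item_type, "failures": inspect(values, tags)}
    # Unknown types would be handed to engine recommendation (step 205).
    return {"path": "unknown", "type": None, "failures": {}}


if __name__ == "__main__":
    print(detect("mobile", ["13800000000", "", "12a"]))
    print(detect("color", ["red"]))
```

Because the checks are looked up from the tags rather than hard-coded per type, changing a type's tags changes its inspection without touching the inspection code, which is one way to read "dynamically adjust".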


Abstract

The present application discloses a data quality detection method and system based on multi-dimensional tags. For data items of known types, a multi-dimensional label analysis algorithm, drawing on the detection rule base, marks each item with the corresponding dimension labels, and those labels are used to dynamically adjust the item's quality inspection process. For data items of unknown types, a rule similarity evaluation algorithm, combined with the detection rule base, recommends a quality detection engine for the unknown-type data source, and the engine's results are verified to obtain an effective quality detection rule set. The quality inspection process for known-type data items and the effective quality detection rule set are then saved, and the multi-dimensional labeling rule base is updated. Through these two algorithms, multi-dimensional labeling and rule similarity evaluation, the solution addresses the poor accuracy and weak timeliness caused by fixed detection rule templates, enables fast and accurate detection of data quality with timely feedback of detection results, and improves the quality of data sources.
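The abstract does not say how rule similarity is evaluated. As an illustration only, the sketch below assumes Jaccard similarity over sets of feature names as a stand-in. The engine names, the feature names, the `threshold` parameter, and the functions `jaccard` and `recommend_engine` are hypothetical and are not taken from the patent.

```python
# Hypothetical rule base: engine name -> features of the data it was built for.
RULE_BASE = {
    "phone_engine": {"numeric", "fixed_length", "non_empty"},
    "text_engine": {"alphabetic", "non_empty"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_engine(features, threshold=0.5):
    """Recommend the most similar engine for an unknown-type data item.

    Returns (engine_name, score). engine_name is None when no engine
    reaches the (assumed) similarity threshold.
    """
    best_name, best_score = None, 0.0
    for name, rule_features in sorted(RULE_BASE.items()):
        score = jaccard(features, rule_features)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


if __name__ == "__main__":
    print(recommend_engine({"numeric", "non_empty"}))
    print(recommend_engine({"binary"}))
```

The sketch stops at recommendation. The verification described in the abstract, where the recommended engine's results are checked before its rules are saved as effective, would be a separate step after `recommend_engine`.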

Description

A method and system for data quality detection based on multi-dimensional labels

Technical field

The application relates to the field of data processing technology, and specifically to a data quality detection method and system based on multi-dimensional labels.

Background technique

[0002] "Big data" requires new processing modes in order to deliver stronger decision-making power, insight discovery, and process optimization capability; it is this capability that makes big data a massive, high-growth, and diversified information asset. As big data systems in various places continue to take in data from different industries, raw data is generated from a variety of data sources and then reprocessed to form the final information assets. The quality of each data source is the basis for whether the big data system can be effective. Whether there are quality problems in various data sources, timely warning and improving the quality of data sources, reducing ...


Application Information

Patent Type & Authority Patents(China)
IPC(8): G06F16/215, G06F16/28
CPC: G06F16/215, G06F16/283, Y02P90/30
Inventor 林文楷, 周成祖, 乔赞瑞, 王海滨, 吴朝晖, 齐战胜
Owner XIAMEN MEIYA PICO INFORMATION