Mining method, system and database terminal of similar word dictionary rule of database

A database and rule technology in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as low efficiency, poor performance, and the inability to meet the needs of big data analysis and processing.

Active Publication Date: 2013-03-13
SHENZHEN AUDAQUE DATA TECH

AI Technical Summary

Problems solved by technology

[0006] In one aspect, the object of the present invention is to provide a method for mining approximate dictionary rules in a database, aiming to solve the poor performance and low efficiency of previous mining methods, which cannot meet the needs of big data analysis and processing.



Examples

Embodiment Construction

[0097] In order to make the objectives, technical solutions, and advantages of the present invention clearer, the following further describes the present invention in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention.

[0098] Related concepts

[0099] Consider a database r. Define the set of all columns in r as R. The distinct values in each column are called items, and the set of all items is defined as the itemset I. Each row of r is called a transaction t.

[0100] (1) Support: For a given itemset $X \subseteq I$, define its support supp(X) as the number of transactions in r that contain the itemset X, that is, the number of transactions t satisfying $X \subseteq t$.

[0101] (2) Superset and subset: For two itemsets X and Y, if $X \subseteq Y$, then Y is said to be a superset of X and X a subset of Y, and supp(Y) <= supp(X).
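The two definitions above can be illustrated with a minimal Python sketch. It is not taken from the patent: representing an item as a (column, value) pair, the helper names `to_transactions` and `supp`, and the sample rows are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: an item is modelled as a (column name, value) pair so that
# equal values in different columns stay distinct.
Item = Tuple[str, str]


def to_transactions(rows: List[Dict[str, str]]) -> List[FrozenSet[Item]]:
    """Turn each row of the database r into a transaction (a set of items)."""
    return [frozenset(row.items()) for row in rows]


def supp(itemset: FrozenSet[Item], transactions: List[FrozenSet[Item]]) -> int:
    """Support of X: the number of transactions t with X a subset of t."""
    return sum(1 for t in transactions if itemset <= t)


# Hypothetical sample data, not from the source.
rows = [
    {"city": "Shenzhen", "province": "Guangdong"},
    {"city": "Shenzhen", "province": "Guangdong"},
    {"city": "Guangzhou", "province": "Guangdong"},
    {"city": "Wuhan", "province": "Hubei"},
]
transactions = to_transactions(rows)

X = frozenset({("province", "Guangdong")})
Y = frozenset({("province", "Guangdong"), ("city", "Shenzhen")})

supp_x = supp(X, transactions)  # rows 0, 1 and 2 contain X
supp_y = supp(Y, transactions)  # only rows 0 and 1 contain Y
```

Because Y is a superset of X, every transaction that contains Y also contains X, which is why supp(Y) can never exceed supp(X).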

[...


Abstract

The invention applies to the field of similar word dictionary rule mining and provides a mining method, a system, and a database terminal for the similar word dictionary rules of a database. The mining method comprises the following steps: scanning and analyzing a database r, eliminating single-value columns and columns in which all values are unique, and marking the remaining candidate column set as R; counting the support of each item of each column in the candidate column set R, and assigning an integer number to each item whose support is greater than the given maximum support degree; numbering each row (transaction) of the database r sequentially, recording in a list the row numbers of the transactions that contain each item, and caching these lists; mining the similar word dictionary rules of the database r with the DCfd method; and outputting the similar word dictionary rules. Because the DCfd mining method uses a reverse incremental search strategy, prunes the search tree, and caches the rules already found, the computation of the whole mining process is reduced, and the similar word dictionary rules in the database can be found automatically and efficiently.
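The preprocessing steps listed in the abstract (column filtering, support counting with integer numbering, and per-item row-number lists) might be sketched as follows. This is an illustrative sketch only: the function name `preprocess`, the `threshold` parameter (standing in for the support bound that the abstract calls the "given maximum support degree"), and the sample rows are assumptions, and the DCfd search itself is not shown because this text does not describe it.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # assumption: an item is a (column, value) pair


def preprocess(rows: List[Dict[str, str]], threshold: int):
    """Hypothetical sketch of the preprocessing described in the abstract."""
    n = len(rows)
    columns = list(rows[0].keys()) if rows else []

    # Step 1: drop single-value columns and columns whose values are all unique.
    candidates = []
    for col in columns:
        distinct = {row[col] for row in rows}
        if 1 < len(distinct) < n:
            candidates.append(col)

    # Step 2: count the support of every item in the candidate columns and
    # give an integer number to each item whose support exceeds the threshold.
    counts = Counter((col, row[col]) for row in rows for col in candidates)
    item_ids: Dict[Item, int] = {}
    for item, count in counts.items():
        if count > threshold:
            item_ids[item] = len(item_ids)

    # Step 3: number the rows sequentially and record, for each numbered item,
    # the list of row numbers that contain it (kept in memory as the cache).
    tid_lists: Dict[int, List[int]] = defaultdict(list)
    for tid, row in enumerate(rows):
        for col in candidates:
            item = (col, row[col])
            if item in item_ids:
                tid_lists[item_ids[item]].append(tid)

    return candidates, item_ids, dict(tid_lists)


# Hypothetical sample data, not from the source.
rows = [
    {"id": "1", "country": "CN", "city": "Shenzhen", "province": "Guangdong"},
    {"id": "2", "country": "CN", "city": "Shenzhen", "province": "Guangdong"},
    {"id": "3", "country": "CN", "city": "Guangzhou", "province": "Guangdong"},
    {"id": "4", "country": "CN", "city": "Wuhan", "province": "Hubei"},
]
candidates, item_ids, tid_lists = preprocess(rows, threshold=1)
```

In this sample, `id` is dropped because every value is unique and `country` because it holds a single value. With row-number lists in hand, the support of a combination of items can be obtained by intersecting the lists rather than rescanning the database, which is one plausible reason for caching them.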

Description

Technical field

[0001] The invention relates to the field of mining approximate dictionary rules, and in particular to a method, system, and database terminal for mining approximate dictionary rules in a database.

Background art

[0002] With the rapid development of the Internet and the growing informatization of every sector of society, the amount of data is exploding at an unprecedented rate, and humanity is entering the era of big data. The big data era is characterized by larger data volumes, more complex data sources, faster data updates, and uneven data quality. Managing data quality by manual means is almost impossible. The field of data management is on the verge of major changes and breakthroughs. Commercially available technologies largely remain at the second-generation stage of data quality management, which is manual and experience-based. An automated third-generation commercial data quality management system based on a rigorous theoretical system has still not...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 王明兴, 贾西贝
Owner: SHENZHEN AUDAQUE DATA TECH