Data flow maximal frequent item set mining method based on ordered composite tree structure

A technology based on maximal frequent itemsets and an ordered composite tree, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as low execution efficiency and high memory consumption, and achieves good robustness, faster mining, and low memory usage.

Inactive Publication Date: 2015-08-19
ZHEJIANG GONGSHANG UNIVERSITY
4 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

[0005] To address the low execution efficiency and excessive memory consumption of existing maximal frequent itemset mining methods, the present invention proposes a data stream maximal frequent itemset mining method based on an ordered composite tree structure.



Embodiment Construction

[0023] The present invention will be further described in detail in conjunction with the accompanying drawings and specific embodiments.

[0024] The method proposed by the present invention for mining the maximal frequent itemsets of a data stream based on the ordered composite tree structure comprises the following steps:

[0025] 1) Construction of the frequent itemset list: Obtain the data stream segment in the basic sliding window. Let ε be the allowable deviation factor and S the minimum support. To reduce error, S-ε is taken as the minimum support threshold in actual operation. The items in the basic window are scanned in a single pass and sorted from high to low by support (when supports are equal, they are sorted in a fixed order, usually lexicographically), yielding an item-set header and a frequent itemset list that excludes infrequent items. Among them, the characteristics of the frequent itemset list are...
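The list construction in step 1) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names `build_frequent_item_list` and `reorder_transaction` are hypothetical, and it assumes that S and ε are relative supports (fractions of the number of transactions in the basic window), which the text above does not state explicitly.

```python
from collections import Counter


def build_frequent_item_list(transactions, min_support, epsilon):
    """Single pass over one basic window; returns [(item, count), ...].

    Assumption: min_support (S) and epsilon are fractions of the
    window size, so the effective threshold is (S - epsilon) * n.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        counts.update(set(t))  # count each item once per transaction
    threshold = (min_support - epsilon) * n
    frequent = [(item, c) for item, c in counts.items() if c >= threshold]
    # Support descending; ties broken lexicographically.
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return frequent


def reorder_transaction(transaction, frequent):
    """Drop infrequent items and order the rest by the header order."""
    rank = {item: i for i, (item, _) in enumerate(frequent)}
    return sorted((x for x in set(transaction) if x in rank),
                  key=rank.__getitem__)


window = [["a", "b", "c"], ["b", "c"], ["a", "c", "d"], ["c", "b"]]
header = build_frequent_item_list(window, min_support=0.5, epsilon=0.1)
ordered = reorder_transaction(["a", "c", "d"], header)
```

With this toy window the threshold is 1.6 occurrences, so item `d` (one occurrence) is excluded and the remaining items are ordered `c`, `b`, `a`.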


Abstract

The invention discloses a data stream maximal frequent itemset mining method based on an ordered composite tree structure. The method is suitable for fields such as time series mining of financial data and association analysis of commercial data streams. It targets the low execution efficiency and excessive memory consumption of existing maximal frequent itemset mining methods. The data stream is processed with a sliding window that is partitioned into a plurality of basic units; data stream fragment information is acquired and updated, and each fragment is scanned once to obtain the frequent itemsets, which are stored in a frequent itemset list. An ordered FP-tree is then constructed whose structure is adjusted dynamically as itemsets are inserted. Adjacent nodes with equal support in the same branch are merged, and this compression produces an ordered composite FP-tree. With this method, maximal frequent itemsets can be mined from a data stream efficiently and rapidly. The method has high application value.
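The compression idea in the abstract, merging adjacent nodes with equal support in the same branch, can be illustrated on a single root-to-leaf path. The sketch below is an assumption-laden simplification: it represents a branch as a flat list of `(item, support)` pairs rather than a real FP-tree, and the name `compress_branch` is hypothetical.

```python
from itertools import groupby


def compress_branch(path):
    """Merge consecutive nodes of one branch that share the same support.

    path: list of (item, support) pairs from the root side to the leaf.
    Returns a list of (items_tuple, support) composite nodes.
    """
    return [
        (tuple(item for item, _ in group), support)
        for support, group in groupby(path, key=lambda node: node[1])
    ]


branch = [("c", 4), ("b", 3), ("a", 3), ("e", 1)]
composite = compress_branch(branch)
```

Here the nodes `b` and `a` both have support 3 and are adjacent, so they collapse into one composite node, and the four-node branch shrinks to three nodes. A full implementation would also have to handle branching children and window updates, which this sketch leaves out.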

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence and data mining, in particular to a method for mining maximal frequent itemsets of data streams based on an ordered composite tree structure. It is applicable to many fields, such as time series mining of financial data and association analysis of commercial data streams.

Technical background

[0002] With the advent of the era of big data, data mining and its related technologies have received more and more attention. Data mining refers to analyzing data sources in a certain way to find potentially useful information, so data mining is also called knowledge discovery. Association rule mining is a very important topic in data mining. As the name suggests, it discovers possible associations or connections between things from the underlying data. The most classic example is the case of beer and diapers. With the increasing share of large chain retail stores in the retail mar...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 陈庭贵, 许翀寰
Owner: ZHEJIANG GONGSHANG UNIVERSITY