Top-k adaptive contrast pattern mining method based on incomplete network tree

A top-k pattern mining technology, applied in fields such as neural learning methods, character and pattern recognition, and biological neural network models.

Inactive Publication Date: 2020-11-13
HEBEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0019] The technical problem to be solved by the present invention is to provide an adaptive contrast pattern mining method, in particular one that uses the incomplete network tree structure to solve the problem of mining non-overlapping contrast patterns under adaptive gaps. Adaptive contrast pattern mining is realized without the user having to set gap constraints or support thresholds, which overcomes the shortcomings of the prior art for mining adaptive non-overlapping contrast patterns while balancing the flexibility, completeness, and efficiency of the solution. This helps improve classification accuracy and the interpretability of classification models.



Examples


Embodiment 1

[0119] Given a two-class (binary) sequence database D = {s_1 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 = tggtggt, s_2 = s_1 s_2 s_3 = tgt, s_3 = s_1 s_2 s_3 s_4 = tgtt, s_4 = s_1 s_2 = gt, s_5 = s_1 s_2 s_3 s_4 s_5 = ggcct, s_6 = s_1 s_2 s_3 = gat}, a density threshold ρ_τ = 0.2, and an expected number of contrast patterns k = 3.

[0120] Step 1: read in the sequence database D, the density threshold ρ_τ, and the expected number of contrast patterns k:

[0121] Read in the given sequence database D and determine the total number of sequences it contains, N, the number of positive sequences, N+, and the number of negative sequences, N-. The sequences in D are denoted sequence s_1, sequence s_2, ..., sequence s_N, and the characters contained in a sequence s are denoted character s_1, character s_2, ......
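The database of Embodiment 1 and the idea of counting non-overlapping occurrences under an adaptive gap can be sketched in a few lines of Python. This is not the incomplete network tree of the invention; it is a simple greedy counter written under two assumptions that the excerpt does not spell out: "adaptive gap" is taken to mean that any number of characters may lie between consecutive pattern characters, and "non-overlapping" is taken to mean that one sequence position may not be used twice at the same pattern position. The names `D` (as a Python dict) and `nonoverlap_support` are hypothetical.

```python
from typing import Dict, List

# Sequences of Embodiment 1; the dict keys are only labels for this sketch.
D: Dict[str, str] = {
    "s1": "tggtggt",
    "s2": "tgt",
    "s3": "tgtt",
    "s4": "gt",
    "s5": "ggcct",
    "s6": "gat",
}


def nonoverlap_support(pattern: str, seq: str) -> int:
    """Greedy count of non-overlapping occurrences with unconstrained gaps.

    pending[j] is the number of partial occurrences that have matched the
    first j pattern characters and are waiting for character j + 1.
    (Assumed reading of the terms; not the patented tree structure.)
    """
    m = len(pattern)
    if m == 0:
        return 0
    pending: List[int] = [0] * (m + 1)
    for ch in seq:
        # Visit pattern positions from last to first so that a sequence
        # position never extends a partial occurrence it has just created.
        for j in range(m, 0, -1):
            if pattern[j - 1] != ch:
                continue
            if j == 1:
                pending[1] += 1        # start a new occurrence
            elif pending[j - 1] > 0:
                pending[j - 1] -= 1    # extend one waiting occurrence
                pending[j] += 1
    return pending[m]


supports = [nonoverlap_support("tgt", seq) for seq in D.values()]
print(len(D), supports)  # 6 [2, 1, 1, 0, 0, 0]
```

In tggtggt the two occurrences found for "tgt" are at positions <1,2,4> and <4,5,7>: position 4 is reused, but at different pattern positions, which this reading of the non-overlapping condition permits.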

Embodiment 2

[0217] Given a two-class (binary) sequence database D = {s_1 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 = hihiohig, s_2 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 s_9 s_10 = googgoigoh}, a density threshold ρ_τ = 0.2, and an expected number of contrast patterns k = 3, mine the top-k adaptive contrast patterns under the one-off condition.

[0218] Embodiment 2 is the same as Embodiment 1 except for the following step: "Step (3.2.1.1): create an incomplete network tree of pattern q and calculate the support sup(q, s) of pattern q in sequence s, judging whether a node for sub-pattern p_j needs to be created in queue j:

[0219] ① If s_i = p_j and the node would be a root node, it can be created directly;

[0220] ② Suppose a node is the last node of layer (j-1) and has no children; if j > 1 and u can serve as a child of that node, it is created."

[0221] Because when creating an incomplete network tree of mode q under one-time conditio...
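To make the "one-time" (one-off) condition concrete, the following Python sketch counts occurrences when every sequence position may be consumed by at most one occurrence. That reading of the condition is an assumption, since the excerpt does not define it, and the greedy leftmost strategy is a stand-in heuristic, not the incomplete network tree procedure of the embodiment. The function name `one_off_support` is hypothetical.

```python
from typing import List


def one_off_support(pattern: str, seq: str) -> int:
    """Greedy leftmost count; each sequence position is used at most once.

    Assumed reading of the one-off condition; gaps are unconstrained.
    """
    if not pattern:
        return 0
    used = [False] * len(seq)
    count = 0
    while True:
        picked: List[int] = []
        start = 0
        for ch in pattern:
            # First unused position at or after `start` that holds ch.
            pos = next(
                (i for i in range(start, len(seq))
                 if not used[i] and seq[i] == ch),
                None,
            )
            if pos is None:
                break
            picked.append(pos)
            start = pos + 1
        if len(picked) < len(pattern):
            return count        # no further complete occurrence exists
        for i in picked:
            used[i] = True      # consume the positions of this occurrence
        count += 1


print(one_off_support("hi", "hihiohig"))   # 3
print(one_off_support("hih", "hihiohig"))  # 1
```

For the pattern "hih" in hihiohig, only one occurrence fits because the sequence contains just three h characters. If a position could be reused at a different pattern position, the occurrences <1,2,3> and <3,4,6> would both count.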



Abstract

The invention discloses a top-k adaptive contrast pattern mining method based on an incomplete network tree. By mining top-k adaptive contrast patterns, the method can be used for feature extraction from multi-class sequences; it belongs to the field of sequential pattern analysis in data mining. The method uses an incomplete network tree structure to solve the support calculation problem for non-overlapping adaptive contrast patterns, and it derives the support of a super-pattern from the supports of its sub-patterns, which avoids redundant calculation. A contrast-priority mining strategy, a Zero pruning strategy, and a Less pruning strategy reduce the number of candidate patterns generated, lowering both time and space complexity. The user does not need to supply a minimum support threshold or a gap constraint, so the mining is self-adaptive. This addresses the difficulty in prior-art contrast pattern mining of achieving both mining efficiency and complete results, and it helps improve sequence classification accuracy and the interpretability of the classification model.
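One common way to return the k best patterns without a user-supplied threshold is a bounded min-heap, sketched below in Python. This illustrates top-k selection in general, not the contrast-priority strategy of the invention: the excerpt does not say how a pattern's contrast score is computed, so the scores here are made-up illustrative numbers, and the function name `top_k` is hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k(scored: Iterable[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Keep the k highest-scoring patterns using a min-heap of size k."""
    heap: List[Tuple[float, str]] = []
    for pattern, score in scored:
        if len(heap) < k:
            heapq.heappush(heap, (score, pattern))
        elif score > heap[0][0]:
            # heap[0] is the weakest kept pattern; its score acts as a
            # threshold that rises as better patterns are found.
            heapq.heapreplace(heap, (score, pattern))
    return [(p, s) for s, p in sorted(heap, reverse=True)]


# Illustrative scores only; not results from the patent.
candidates = [("t", 0.5), ("g", 0.2), ("tg", 0.7), ("gt", 0.4)]
print(top_k(candidates, 3))  # [('tg', 0.7), ('t', 0.5), ('gt', 0.4)]
```

Once the heap holds k entries, the score at its root plays the role that a fixed minimum threshold would otherwise play, which is why no threshold has to be chosen in advance.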

Description

Technical field

[0001] The technical scheme of the invention relates to the field of sequential pattern analysis, in particular to a top-k self-adaptive contrast pattern mining method based on an incomplete network tree.

Background

[0002] With the advent of the era of big data, large amounts of sequence data have emerged in many fields, such as travel data, patient monitoring data, equipment operation monitoring data, and various time series data. These data are multi-category and multi-dimensional, and how to quickly extract or mine valuable information from them has become a current research hotspot. Sequential pattern mining has been widely used as an effective means of information extraction, but if frequent pattern mining is performed directly on the data, the category characteristics of the data are ignored, which is not conducive to discovering valuable information. Contrast pattern mining aims to discover information difference...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458, G06K9/62, G06N3/08
CPC: G06F16/2465, G06N3/082, G06F18/22, G06F18/241
Inventor: 王月华, 李艳, 陈明婕, 赵晓倩, 刘锦, 王珠林, 武优西
Owner: HEBEI UNIV OF TECH