Interaction feature selection method based on neighborhood conditional mutual information

A feature selection method, conditional mutual information technology, applied in the field of data mining

Inactive Publication Date: 2021-05-28
SOUTHWEST JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

[0005] In view of the above deficiencies in the prior art, the purpose of the present invention is to re-characterize the interaction between features for mixed data containing noise and uncertainty, and perform interactive analysis on the



Embodiment Construction

[0055] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0056] Figure 2 shows how the overall processing flow of the present invention differs from that of the prior art represented in Figure 1.

[0057] Figure 3 shows the computational framework that takes the original feature set to the reduced feature subset using the interactive feature selection algorithm based on neighborhood conditional mutual information. First, for the different data types, the HCOM distance function is used to determine the neighborhood relation of each feature, and the neighborhood similarity matrix of each feature is calculated for each radius in the multi-neighborhood radius set. Second, neighborhood information theory is used to explore the relevance among features, which covers the relevance between features and classes as well as the redundancy and interaction between features. Based on this definition of relevance, a fea...
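The first step above can be illustrated with a minimal Python sketch. The text available here does not define the HCOM distance function, so `feature_distance` below is a hypothetical stand-in (0/1 mismatch for categorical values, range-scaled absolute difference for numeric values), and the function names, radii and sample data are assumptions made only for illustration.

```python
def feature_distance(a, b, span=None):
    """Stand-in per-feature distance (an assumption, NOT the HCOM function).

    Categorical values (span is None): 0.0 if equal, otherwise 1.0.
    Numeric values: absolute difference scaled by the feature's range.
    """
    if span is None:
        return 0.0 if a == b else 1.0
    return abs(a - b) / span


def similarity_matrix(column, radius, numeric):
    """n x n 0/1 matrix: entry [i][j] is 1 when sample j lies within
    `radius` of sample i on this single feature."""
    n = len(column)
    # Guard against a zero range when every numeric value is identical.
    span = ((max(column) - min(column)) or 1.0) if numeric else None
    return [
        [1 if feature_distance(column[i], column[j], span) <= radius else 0
         for j in range(n)]
        for i in range(n)
    ]


def multi_radius_matrices(column, radii, numeric):
    """One similarity matrix per radius in the multi-neighborhood radius set."""
    return {r: similarity_matrix(column, r, numeric) for r in radii}


# Hypothetical mixed data: one numeric and one categorical feature.
ages = [20, 22, 40]
colors = ["red", "red", "blue"]

age_mats = multi_radius_matrices(ages, radii=[0.2, 0.95], numeric=True)
color_mats = multi_radius_matrices(colors, radii=[0.2], numeric=False)
```

A larger radius admits more neighbors per sample, which is presumably why a set of radii is evaluated rather than a single value.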


Abstract

The invention discloses an interactive feature selection method based on neighborhood conditional mutual information. The method comprises the following steps. First, for the different data types, the neighborhood relation of each feature is determined with an HCOM distance function, and a neighborhood similarity matrix is calculated for each feature according to a multi-neighborhood radius set. Second, neighborhood information is used to explore the relevance among features, which covers the relevance between features and classes as well as the redundancy and interaction between features; based on this relevance, an evaluation function of feature importance is established that combines maximum relevance, minimum redundancy and maximum interaction (MRmRMI). The evaluation function scores the importance of the features to produce a feature sequence ordered from the largest to the smallest classification contribution. Finally, the reduced feature subset is selected by testing on different classifiers: it is the feature subset sequence that gives the best average classification performance. Compared with six other popular feature selection algorithms, the method of the invention has higher classification performance and a more significant classification effect.
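The scoring step can be sketched as a greedy ranking. The exact MRmRMI evaluation function is not given in the text available here, so the score below (relevance minus mean redundancy plus mean interaction, with interaction taken as the conditional mutual information gain over plain relevance) is an assumed form chosen for illustration. The neighborhood entropy follows the usual definition from the neighborhood rough set literature, and all function names, the radius and the sample data are hypothetical. The final classifier-based choice of subset length is not shown.

```python
import math


def neighbors(values, radius):
    """Neighbor index sets for one numeric feature (range-scaled distance)."""
    span = (max(values) - min(values)) or 1.0
    n = len(values)
    return [frozenset(j for j in range(n)
                      if abs(values[i] - values[j]) / span <= radius)
            for i in range(n)]


def joint(*rels):
    """Relation induced by several features together: intersect per sample."""
    return [frozenset.intersection(*sets) for sets in zip(*rels)]


def entropy(rel):
    """Neighborhood entropy: -(1/n) * sum_i log2(|neighborhood(x_i)| / n)."""
    n = len(rel)
    return -sum(math.log2(len(s) / n) for s in rel) / n


def mi(a, b):
    """Neighborhood mutual information of two relations."""
    return entropy(a) + entropy(b) - entropy(joint(a, b))


def cmi(a, b, c):
    """Neighborhood conditional mutual information of a and b given c."""
    return (entropy(joint(a, c)) + entropy(joint(b, c))
            - entropy(joint(a, b, c)) - entropy(c))


def rank_features(features, labels, radius=0.2):
    """Greedy ordering by an assumed MRmRMI-style score."""
    rels = {name: neighbors(col, radius) for name, col in features.items()}
    # Samples are class neighbors exactly when their labels are equal.
    cls = [frozenset(j for j, y in enumerate(labels) if y == labels[i])
           for i in range(len(labels))]
    remaining, ranked = list(rels), []

    def score(f):
        relevance = mi(rels[f], cls)
        if not ranked:
            return relevance
        k = len(ranked)
        redundancy = sum(mi(rels[f], rels[s]) for s in ranked) / k
        interaction = sum(cmi(rels[f], cls, rels[s]) - relevance
                          for s in ranked) / k
        return relevance - redundancy + interaction

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Hypothetical data: "signal" tracks the class, "noise" does not.
features = {"noise": [0.0, 1.0, 0.05, 0.95], "signal": [0.0, 0.1, 0.9, 1.0]}
labels = [0, 0, 1, 1]
order = rank_features(features, labels)
```

In this toy data the neighborhoods of `signal` coincide with the class partition, so it is ranked ahead of `noise`.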

Description

Technical field

[0001] The invention belongs to the technical field of data mining and is a feature selection method for mixed data containing noise and uncertainty. The method comprehensively considers the relevance between features and classes, and the redundancy and interaction between features.

Background art

[0002] In recent years, the development of big data applications has placed higher demands on the understanding and processing of high-dimensional data. In particular, large datasets with noisy, irrelevant or redundant features pose great challenges for data mining, knowledge discovery and pattern recognition. Because of the curse of dimensionality, how to select the optimal feature subset from all features is regarded as a worthwhile research topic in various learning tasks. In response to this problem, many feature selection methods have been proposed, which are dedicated to removing irrelevant features and eliminating redundan...


Application Information

IPC(8): G06F16/28, G06N5/00
CPC: G06N5/00, G06F16/285
Inventor: 陈红梅, 万继红, 李天瑞, 罗川, 胡节
Owner SOUTHWEST JIAOTONG UNIV