Characteristic selection method based on information measurement

A feature selection method based on mutual information that can be applied to instruments, character and pattern recognition, computer parts, etc., and that addresses problems such as unsatisfactory feature selection results.

Inactive Publication Date: 2016-11-30
TIANJIN UNIV
Cites: 0; Cited by: 7

AI Technical Summary

Problems solved by technology

The algorithm in the second case cannot effectively combine mutual information with three-way interaction information, which makes its feature selection results unsatisfactory; the invention therefore studies feature selection for the second case.


Embodiment Construction

[0069] The present invention first uses symmetrical uncertainty (SU) to normalize the mutual information between the class label and a feature; it then trades this SU value off against the three-way interaction information between the class label and two features; on that basis it proposes a feature selection algorithm based on information measurement, and it uses experimental results to verify whether a trade-off coefficient exists that is generally optimal for performance on multiple data sets.

[0070] The invention proposes a feature selection algorithm based on information measurement that uses two quantities: the SU value between the class label and a feature, and the three-way interaction information between the class label and two features. The specific technical scheme is described in detail as follows:
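The three-way interaction information between two features and the class label can be illustrated with a minimal sketch. It assumes one common sign convention, I(fi; ft; c) = I(fi, ft; c) - I(fi; c) - I(ft; c), under which a positive value means the two features are jointly more informative about the class than they are separately; the specification's own definition is not reproduced in this excerpt and may differ. The function names, base-2 logarithms, and the use of empirical frequencies as probability estimates are likewise assumptions for illustration.

```python
import math
from collections import Counter


def joint_entropy(*columns):
    """Empirical joint Shannon entropy (bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((k / n) * math.log2(k / n) for k in Counter(rows).values())


def interaction_information(fi, ft, c):
    """Three-way interaction information under the assumed convention
    I(fi; ft; c) = I(fi, ft; c) - I(fi; c) - I(ft; c),
    expanded into joint entropies."""
    h = joint_entropy
    return (h(fi, ft) + h(fi, c) + h(ft, c)
            - h(fi) - h(ft) - h(c)
            - h(fi, ft, c))


# XOR labels: neither feature alone predicts c, but together they do.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
xor_label = [0, 1, 1, 0]
synergy = interaction_information(f1, f2, xor_label)

# Fully redundant case: both features are copies of the label.
redundancy = interaction_information(f1, f1, f1)
```

In this sketch the XOR case gives a positive value and the duplicated-feature case a negative one, which matches the reading of positive interaction as complementary information.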

[0071] 1.1 Background knowledge of information measurement

[0072] For convenience of expression, only discrete random variables are considered. Suppose X is a discrete random variable, and p(x) is its probability mass function. Information entropy is often...
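As a rough illustration of these background quantities, the sketch below computes information entropy, mutual information, and symmetrical uncertainty for discrete samples using the standard definitions H(X) = -Σ p(x) log p(x), I(X; Y) = H(X) + H(Y) - H(X, Y), and SU(X; Y) = 2 I(X; Y) / (H(X) + H(Y)). The function names, the base-2 logarithm, and the estimation of p(x) from empirical frequencies are assumptions made for the example, not details taken from the specification.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(X) in bits, estimated from empirical frequencies."""
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def symmetrical_uncertainty(x, y):
    """SU(X; Y) = 2 * I(X; Y) / (H(X) + H(Y)); returns 0.0 if both are constant."""
    denom = entropy(x) + entropy(y)
    if denom == 0:
        return 0.0
    return 2.0 * mutual_information(x, y) / denom


feature = [0, 0, 1, 1]
label_same = [0, 0, 1, 1]         # label fully determined by the feature
label_independent = [0, 1, 0, 1]  # label independent of the feature

su_same = symmetrical_uncertainty(feature, label_same)
su_independent = symmetrical_uncertainty(feature, label_independent)
```

Because SU divides by the sum of the two entropies, it stays in the range 0 to 1 regardless of how many distinct values a feature takes, which is what makes it usable as a normalized form of mutual information.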

Abstract

The invention belongs to the technical field of machine learning and data mining and provides a feature selection method based on information measurement. Experimental results are used to verify whether a trade-off coefficient exists that is universally optimal for performance on multiple data sets. According to the technical scheme, the feature selection method based on information measurement comprises: using SU(fi; c) of a feature fi and the class label c, together with the three-way interaction information I(fi; ft; c) of two features fi and ft and the class label c, to construct the objective function shown in the specification. In that formula, fi is an unselected feature, X is the set of unselected features, c is the class label, D is the set of features fs for which the maximum value of I(fi; ft; c) is greater than zero, fs is a newly selected feature, ft is a feature in the subset D, and β is the trade-off coefficient. The feature selection method is mainly applied to machine learning and data mining.

Description

Technical field

[0001] The invention belongs to the technical fields of machine learning and data mining, and relates to a feature selection method based on information measurement.

Background technique

[0002] As an important means of dimensionality reduction, feature selection chooses a better subset of the original features as the final feature set according to certain metrics, thereby reducing the feature dimension. According to the relationship between the feature subset metric and the learning algorithm, feature selection algorithms can be divided into Filter, Embedded and Wrapper algorithms. Comparing the three, embedded and wrapper algorithms select features well but take more time; filter algorithms select features relatively poorly but take less time, and are better suited to high-dimensional data sets. According to the metric used, filter algorithms can be divided into algorithms based on ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/24155
Inventor: 郭继昌, 顾翔元, 李重仪
Owner: TIANJIN UNIV