Rapid encapsulation type gene selection method based on maximum correlation and minimum redundancy

A gene selection technology based on maximum correlation, applied in fields such as instrumentation, hybridization, and biostatistics. It addresses two problems: repeated classifier evaluations degrade the time performance of the feature selection process, and classification accuracy is not well balanced against the number of selected features. Its effects are fewer gene subsets to evaluate, smaller selected subsets, and faster selection.

Inactive Publication Date: 2019-12-03
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] (1) During gene selection, the classification accuracy of a classifier is used as the index for evaluating each gene subset, so a large number of classifier evaluations must be performed: every time a new gene subset is generated, the classifier has to be trained and tested again, which seriously degrades the time performance of the feature selection process;
[0006] (2) The number of selected features and the classification accuracy of the feature subset are not well balanced. For example, of two feature subsets with similar classification performance, the one with the higher classification accuracy is preferred and the number of features in each subset is ignored, so a subset with slightly lower classification performance but fewer features is discarded.
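To give a rough sense of the cost described in problem (1), the sketch below counts classifier evaluations under two assumptions that are not taken from the patent text: that the conventional baseline is plain sequential forward selection, which tries every remaining gene at every step, and that a filter-guided method proposes a single candidate per step. The function names and the figures m = 2000 and k = 10 are hypothetical.

```python
def sfs_evaluations(m: int, k: int) -> int:
    """Classifier evaluations for plain sequential forward selection.

    At step t (0-based) each of the m - t genes not yet selected is tried,
    and every trial means training and testing the classifier once.
    """
    return sum(m - t for t in range(k))


def filter_guided_evaluations(k: int) -> int:
    """Classifier evaluations when a filter score proposes one candidate per step."""
    return k


m, k = 2000, 10  # hypothetical: 2000 genes, 10 genes selected
print(sfs_evaluations(m, k))         # 19955
print(filter_guided_evaluations(k))  # 10
```

If each evaluation is itself a ten-fold cross-validation, both counts are multiplied by ten training and testing runs.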

Examples

Embodiment Construction

[0040] In this embodiment, a fast encapsulation gene selection method based on maximum correlation and minimum redundancy is applied to a data set Data composed of n microarray gene data records, written $Data = \{inst_1, inst_2, \ldots, inst_i, \ldots, inst_n\}$, where $inst_i$ is the i-th microarray gene data record. Each $inst_i$ consists of its gene values, indexed by j, together with $C_i$, the categorical variable of $inst_i$ (for example abnormal / normal). The j-th gene vector of Data is formed from the j-th gene of all n records and is denoted $f_j$. The gene vector group of Data consists of the m gene vectors, $F = \{f_1, f_2, \ldots, f_j, \ldots, f_m\}$. The categorical variables of the n records form the category vector $C = \{C_1, C_2, \ldots, C_i, \ldots, C_n\}$. Together these give a gene vector data set of m gene vectors and one category vector, $D = \{f_1, f_2, \ldots, f_j, \ldots, f_m, C\}$; $1 \le i \le n$; 1...
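The data layout in this paragraph can be sketched as follows. This is an illustration only: the function name `to_gene_vectors`, the tuple layout of each record, and the toy values are assumptions, not part of the patent text.

```python
from typing import List, Sequence, Tuple

# One record inst_i: (its m gene values, its categorical variable C_i).
Sample = Tuple[Sequence[float], str]


def to_gene_vectors(data: Sequence[Sample]) -> Tuple[List[List[float]], List[str]]:
    """Turn n records into the gene vector group F and the category vector C."""
    m = len(data[0][0])
    # f_j collects the j-th gene of every record, so F holds m vectors of length n.
    F = [[genes[j] for genes, _ in data] for j in range(m)]
    C = [label for _, label in data]
    return F, C


# Hypothetical toy data set: n = 3 records, m = 2 genes.
data = [([0.1, 2.3], "normal"), ([0.4, 1.9], "abnormal"), ([0.3, 2.0], "normal")]
F, C = to_gene_vectors(data)
D = F + [C]  # D = {f_1, ..., f_m, C}
print(len(F), len(F[0]))  # 2 3
```

Storing genes as vectors across records, rather than records across genes, keeps each $f_j$ directly available for scoring against C.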

Abstract

The invention discloses a rapid encapsulation type gene selection method based on maximum correlation and minimum redundancy. The method comprises the following steps: (1) using a correlation method, search the gene vector group for the gene with the maximum correlation with the category label and add it to the candidate gene subset; (2) using the maximum-correlation minimum-redundancy method, search the gene vector group for the gene that scores best under that criterion and add it to the candidate gene subset; (3) using ten-fold cross-validation, judge whether the classification accuracy of the candidate gene subset after the update is lower than that of the subset before the update; (4) if the classification accuracy is lower, output the candidate gene subset from before the update, otherwise repeat step 2. The method obtains high-quality gene subsets while markedly reducing the time complexity of conventional wrapper (encapsulation) methods, so gene subsets are obtained with good time performance and the resulting subsets have good classification performance.
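The four steps above can be sketched in a few lines of Python. Several choices here are assumptions rather than details from the abstract: correlation is measured as mutual information on already-discretized gene values, the redundancy term is the mean mutual information with the genes already selected, and the classifier is a 1-nearest-neighbour rule with Hamming distance. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def cv_accuracy(F, C, subset, k=10):
    """k-fold cross-validated accuracy of a 1-NN classifier on the genes in `subset`."""
    n = len(C)
    rows = [[F[j][i] for j in subset] for i in range(n)]
    correct = 0
    for i in range(n):
        # Record i is tested in fold i % k; the other folds form the training set.
        train = [t for t in range(n) if t % k != i % k]
        nearest = min(train, key=lambda t: sum(a != b for a, b in zip(rows[i], rows[t])))
        correct += C[nearest] == C[i]
    return correct / n


def select_genes(F, C, k=10):
    """Forward selection guided by an mRMR score, with one CV run per added gene."""
    relevance = [mutual_information(f, C) for f in F]
    # Step 1: start from the gene most correlated with the category vector.
    selected = [max(range(len(F)), key=relevance.__getitem__)]
    best_acc = cv_accuracy(F, C, selected, k)
    while len(selected) < len(F):
        # Step 2: pick the gene with the best relevance-minus-redundancy score.
        def score(j):
            redundancy = sum(mutual_information(F[j], F[s]) for s in selected) / len(selected)
            return relevance[j] - redundancy
        candidate = max((j for j in range(len(F)) if j not in selected), key=score)
        # Step 3: evaluate only the updated subset.
        acc = cv_accuracy(F, C, selected + [candidate], k)
        # Step 4: if accuracy dropped, keep the subset from before the update.
        if acc < best_acc:
            break
        selected.append(candidate)
        best_acc = acc
    return selected, best_acc


# Hypothetical toy data: 20 records, 4 discretized genes; gene 1 equals the label.
C = [0, 1] * 10
F = [[0, 0, 1, 1] * 5, list(C), list(C), [0] * 10 + [1] * 10]
selected, best_acc = select_genes(F, C)
print(selected, best_acc)
```

Because the stop rule fires only when accuracy drops, a candidate that leaves accuracy unchanged is still added. The sketch runs one cross-validation per added gene, which is where the saving over trying every remaining gene at each step would come from.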

Description

Technical field

[0001] The invention belongs to the field of data mining, and in particular relates to a fast encapsulation gene selection method based on maximum correlation and minimum redundancy.

Background

[0002] As a data dimensionality reduction technique, gene selection is widely used in the analysis of genetic data and the prediction of genetic diseases. High-dimensional genetic data may contain redundant and irrelevant genes, so using all genes to train the classifier and make predictions often leads to poor classifier performance, mainly in two respects: classification performance and time performance. An effective gene selection method not only reduces the dimensionality of the original gene space, but also improves the generalization of the classifier along with its classification performance and time performance. In addition, the gene selection method can also help researchers find a set of genes that are highly relat...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B25/10, G16B40/00
CPC: G16B25/10, G16B40/00
Inventor: 杨静, 沈安波, 方宝富, 王浩
Owner: HEFEI UNIV OF TECH