Gene classification method and device

A gene classification method and device in the fields of instruments, computing, and biostatistics. It addresses the poor clustering performance of existing gene classification methods and achieves a good clustering effect with strong generalization ability and strong learning ability.

Active Publication Date: 2020-09-22
HENAN NORMAL UNIV

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a gene classification method and device that solve the poor clustering performance of existing gene classification methods.

Examples

Embodiment 1

[0112] As shown in Figure 1, a gene classification method of the present invention comprises the following steps:

[0113] Acquire gene expression data. The number of samples in the gene expression data is a first set value, and the number of genes in each sample is a second set value. The expression values are arranged into a matrix, which is the gene expression data matrix.

[0114] Use the locally linear embedding (LLE) algorithm to reduce the dimension of the gene expression data matrix: calculate the linear embedding matrix of the gene expression data matrix and obtain the feature gene subset after dimension reduction. That is, calculate the k nearest neighbors of every sample in the gene expression data matrix, construct a local reconstruction weight matrix from the k nearest neighbors of each sample, and then use the local reconstruction weight matrix to calculate the gene expression data matrix Linea...
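The LLE step described above can be sketched in NumPy as follows. This is a minimal illustration of the standard LLE procedure, not the patent's own implementation. It assumes rows of `X` are samples and columns are genes; the function names `lle_weights` and `lle_embed`, the regularization constant `reg`, and the toy matrix sizes are all hypothetical choices made for the example.

```python
import numpy as np

def lle_weights(X, k, reg=1e-3):
    """Local reconstruction weight matrix W (n x n) for the rows of X."""
    n = X.shape[0]
    W = np.zeros((n, n))
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]      # indices of the k nearest neighbors
        Z = X[nbrs] - X[i]                # neighbors centered on sample i
        G = Z @ Z.T                       # local Gram matrix (k x k)
        # Regularize so the system is solvable even when k > number of genes
        # (assumed constant, not taken from the patent).
        G += reg * max(np.trace(G), 1e-12) * np.eye(k)
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()          # weights of each sample sum to 1
    return W

def lle_embed(W, d):
    """d-dimensional embedding from the weight matrix W."""
    n = W.shape[0]
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, vecs = np.linalg.eigh(M)           # eigenvalues in ascending order
    # Skip the first eigenvector (constant, eigenvalue close to 0).
    return vecs[:, 1:d + 1]

# Toy usage: 20 samples, 50 genes, reduced to 2 dimensions.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 50))
W_demo = lle_weights(X_demo, k=5)
Y_demo = lle_embed(W_demo, d=2)
```

The neighbor count `k` and the target dimension `d` are the two parameters that matter most in practice; the values used here are arbitrary.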

Embodiment 2

[0155] Applying the AP clustering algorithm directly to the gene expression data set produces a large number of clusters. To avoid this, the present invention combines the LLE algorithm with an AP clustering algorithm based on the hybrid kernel function. First, the original high-dimensional gene data set is mapped to a low-dimensional space, and the characteristic gene subset is obtained through LLE dimensionality reduction. Then the characteristic gene subset is clustered with the AP clustering algorithm based on the hybrid kernel function to obtain the final clustering result.
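The clustering stage can be illustrated with a compact sketch of standard affinity propagation message passing on a precomputed similarity matrix. It is an assumption-laden illustration, not the patent's code: the function name `affinity_propagation`, the damping value, the iteration count, and the toy data are hypothetical. The diagonal of `S` holds the preferences, and lower preferences tend to give fewer clusters.

```python
import numpy as np

def affinity_propagation(S, damping=0.9, max_iter=200):
    """Cluster from a similarity matrix S (n x n); S[k, k] is the preference."""
    n = S.shape[0]
    R = np.zeros((n, n))   # responsibilities r(i, k)
    A = np.zeros((n, n))   # availabilities a(i, k)
    rows = np.arange(n)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1.0 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        Rp = np.maximum(R, 0.0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0.0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1.0 - damping) * A_new
    # Samples with a(k,k) + r(k,k) > 0 are taken as exemplars.
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)  # no exemplar emerged
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels

# Toy usage: two well-separated groups on a line, negative squared
# distance as similarity, minimum similarity as the shared preference.
x_demo = np.array([0.0, 0.1, 0.25, 10.0, 10.1, 10.3])
S_demo = -(x_demo[:, None] - x_demo[None, :]) ** 2
np.fill_diagonal(S_demo, S_demo.min())
exemplars_demo, labels_demo = affinity_propagation(S_demo, max_iter=500)
```

In the method described here, `S` would be built from the hybrid kernel function on the LLE-reduced data instead of the plain negative squared distance used in this toy example.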

[0156] As shown in Figure 2, the specific steps are as follows:

[0157] Data preprocessing: use the genetic data acquisition system to obtain the original gene expression data set, which comprises a gene expression data matrix holding the gene expression values of multiple samples together with the sample class labels. The description of the gene exp...

Abstract

The invention relates to a gene classification method and device that combine the LLE algorithm with the AP clustering algorithm and use a proposed hybrid kernel function to improve the similarity measure. First, the LLE algorithm maps the original high-dimensional gene expression data set to a low-dimensional space to reduce its dimensionality. Second, a new global kernel function, the F-type kernel function, is proposed and linearly combined with the Gaussian kernel function to form a new hybrid kernel function; the hybrid kernel function is used to calculate the similarity measure and construct a new similarity matrix S. Then the data are clustered with the AP clustering algorithm and the similarity matrix, and the final clustering result is obtained iteratively. Finally, comparison with other clustering methods verifies the validity and accuracy of the algorithm of the present invention.
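The construction of the similarity matrix S from a linear combination of a global kernel and the Gaussian kernel can be sketched as below. The formula of the F-type kernel is not given in this excerpt, so the sketch substitutes a polynomial kernel as a stand-in global kernel; that substitution, the mixing weight `lam`, the kernel-induced distance used to turn the kernel into a similarity, the median preference, and all function names are assumptions for illustration only.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Gaussian (local) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def polynomial_kernel(X, degree=2, c=1.0):
    """Stand-in global kernel. This is NOT the patent's F-type kernel."""
    return (X @ X.T + c) ** degree

def hybrid_kernel(X, lam=0.5, sigma=1.0):
    """Linear combination of a global kernel and the Gaussian kernel."""
    return lam * polynomial_kernel(X) + (1.0 - lam) * gaussian_kernel(X, sigma)

def similarity_from_kernel(K):
    """Similarity matrix from a kernel matrix (assumed construction).

    Off-diagonal entries are the negative kernel-induced squared distance
    K[i,i] + K[j,j] - 2*K[i,j]; the diagonal (the AP preference) is set to
    the median off-diagonal similarity.
    """
    diag = K.diagonal()
    S = -(diag[:, None] + diag[None, :] - 2.0 * K)
    off_diagonal = S[~np.eye(len(S), dtype=bool)]
    np.fill_diagonal(S, np.median(off_diagonal))
    return S

# Toy usage on four 2-D points (stand-ins for LLE-reduced samples).
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
K_demo = hybrid_kernel(X_demo, lam=0.3, sigma=1.5)
S_demo = similarity_from_kernel(K_demo)
```

Setting `lam=0.0` reduces the hybrid kernel to the plain Gaussian kernel, which makes it easy to compare the two similarity measures on the same data.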

Description

Technical field

[0001] The invention belongs to the technical field of gene classification, and in particular relates to a gene classification method and device.

Background art

[0002] As the amount of genetic information continues to increase, processing genetic data to obtain useful information has become a difficult problem. Data sets usually contain a large number of irrelevant and redundant genes. Therefore, how to analyze a massive information base and obtain an effective subset of characteristic genes, so as to better select disease-causing genes, has become an important research topic.

[0003] As an effective data analysis method, cluster analysis is widely used in data mining, machine learning, pattern recognition, bioinformatics, and other fields. Cluster analysis mainly groups high-dimensional data into different clusters, so that the distance within the class is as small as possible and the distance bet...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B40/00
CPC: G16B25/00, G16B40/00, G06F18/23
Inventor: 孙林, 刘弱南, 张霄雨, 孟新超, 常宝方, 孟玲玲, 王蓝莹, 陈岁岁, 殷腾宇, 李源
Owner: HENAN NORMAL UNIV