Double clustering method for tumor gene expression profile data based on double hypergraph regularization

A double clustering technique for tumor gene expression profile data. It addresses the problem that principal component analysis cannot mine the inherent geometric structure of the data, and it improves clustering accuracy.

Active Publication Date: 2022-02-08
CHINA UNIV OF MINING & TECH
Cites: 3; Cited by: 0

AI Technical Summary

Problems solved by technology

Principal component analysis methods cannot mine the inherent geometric structure of real-world data.

Examples

Embodiment Construction

[0059] The present invention will be further described below.

[0060] The specific steps of the present invention are:

[0061] Step I: Decompose the tumor gene expression profile data into a gene clustering matrix and a sample clustering matrix by principal component analysis;

[0062] Step II: Construct a sample hypergraph from the samples of the tumor gene expression profile data;

[0063] Step III: Construct a gene hypergraph from the genes of the tumor gene expression profile data;

[0064] Step IV: Use the sample hypergraph and the gene hypergraph as the sample hypergraph regularization term and the gene hypergraph regularization term of principal component analysis, respectively, and determine the form of the optimization objective function;

[0065] Step V: Optimize the sample clustering matrix and gene clustering matrix from Step I by optimizing the objective function, obtaining the optimized sample clustering matrix and gene clustering matrix;

[0066] Step VII: realize...
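
Steps I to III above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the text shown here does not say how the hyperedges are built or weighted, so the k-nearest-neighbour hyperedges, the unit hyperedge weights, the normalized Laplacian form, and the function names `pca_factorize` and `knn_hypergraph_laplacian` are all hypothetical choices. Random data stands in for an expression matrix.

```python
import numpy as np


def pca_factorize(X, rank):
    """Step I sketch: rank-`rank` factorization X ~ U @ V.T via truncated SVD.

    X is genes x samples. Treating U as the gene clustering matrix and V as
    the sample clustering matrix is an assumption made for this illustration.
    Centering, if wanted, is left to the caller.
    """
    P, s, Qt = np.linalg.svd(X, full_matrices=False)
    U = P[:, :rank] * s[:rank]   # genes x rank, scaled by singular values
    V = Qt[:rank].T              # samples x rank, orthonormal columns
    return U, V


def knn_hypergraph_laplacian(points, k=3):
    """Steps II/III sketch: normalized Laplacian of a k-NN hypergraph.

    Each row of `points` is a vertex. Each vertex spawns one hyperedge that
    contains itself and its k nearest neighbours (assumed construction).
    Requires k < number of rows.
    """
    n = points.shape[0]
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, -1.0)          # each vertex ranks first in its own hyperedge
    H = np.zeros((n, n))                  # incidence matrix: vertices x hyperedges
    for e in range(n):
        H[np.argsort(dist[e])[: k + 1], e] = 1.0
    w = np.ones(n)                        # unit hyperedge weights (assumption)
    d_e = H.sum(axis=0)                   # hyperedge degrees, all equal to k + 1
    d_v = H @ w                           # vertex degrees, at least 1
    scale = 1.0 / np.sqrt(d_v)
    # theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    theta = ((H * (w / d_e)) @ H.T) * np.outer(scale, scale)
    return np.eye(n) - theta


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 12))                  # synthetic stand-in: 30 genes x 12 samples
U, V = pca_factorize(X, rank=2)
L_gene = knn_hypergraph_laplacian(X, k=3)      # genes are the rows of X
L_sample = knn_hypergraph_laplacian(X.T, k=3)  # samples are the columns of X
print(U.shape, V.shape, L_gene.shape, L_sample.shape)
```

The two Laplacians are symmetric and positive semidefinite, which is what lets them act as smoothness penalties on the factor matrices in Step IV.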

Abstract

The invention discloses a double clustering method for tumor gene expression profile data based on double hypergraph regularization. First, the samples and genes of the tumor gene expression profile data are clustered separately. Then a sample hypergraph and a gene hypergraph are built from the samples and genes, respectively, to capture their inherent geometric structure. Next, the sample hypergraph and gene hypergraph are used as the sample hypergraph regularization term and the gene hypergraph regularization term of principal component analysis, and the optimization objective function is determined. Finally, the sample clustering matrix and gene clustering matrix are optimized by optimizing the objective function, yielding the final sample clustering and gene clustering. Building on principal component analysis, the invention optimizes the double clustering through double hypergraph regularization, so that it better captures the complex information in tumor gene expression profile data while retaining the advantages of principal component analysis, ultimately improving clustering accuracy.
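
The text shown here does not give the objective function itself. As a hedged illustration only, a hypergraph-regularized principal component analysis objective is often written in the following form. Every symbol is an assumption introduced for this sketch: X is the genes-by-samples data matrix, U and V are the gene and sample clustering matrices, L_g and L_s are the gene and sample hypergraph Laplacians, and α, β are nonnegative trade-off weights. Any constraints on U and V are omitted because the source does not state them.

```latex
\min_{U,\,V}\;
  \lVert X - U V^{\top} \rVert_F^{2}
  + \alpha \,\operatorname{tr}\!\left(U^{\top} L_g U\right)
  + \beta  \,\operatorname{tr}\!\left(V^{\top} L_s V\right),
\qquad
L = I - D_v^{-1/2} H W D_e^{-1} H^{\top} D_v^{-1/2}
```

Here H is a hypergraph incidence matrix, W the diagonal matrix of hyperedge weights, and D_v and D_e the diagonal vertex-degree and hyperedge-degree matrices. The first term is the usual low-rank reconstruction error; the two trace terms penalize factor matrices that vary sharply across vertices sharing a hyperedge.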

Description

Technical field

[0001] The invention relates to a double clustering method for tumor gene expression profile data, in particular to a double clustering method for tumor gene expression profile data based on double hypergraph regularization.

Background

[0002] To date, more than 100 different types of tumor endanger human health. Sample types in tumor gene expression profile data can be distinguished by the molecular patterns of gene activity in tumor cells. In recent years, with the rapid development of DNA microarray technology, researchers have been able to observe the expression levels of thousands of genes simultaneously, allowing tumor gene expression profile data to be studied more comprehensively. A current challenge in molecular biology is how to mine the important information contained in these data to understand the biological processes and mechanisms of tumors. Due to the development of pattern recognition and machine learning, many effective m...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B40/30; G16B25/10; G06K9/62
CPC: G06F18/23; G06F18/2135
Inventor: 王雪松, 刘健, 程玉虎
Owner: CHINA UNIV OF MINING & TECH