Biomarker identification method based on multiple networks

A multi-network biomarker identification technology, applied in the fields of biological systems, bioinformatics, and neural learning methods. It addresses two problems that arise when heterogeneous interaction data are merged into a single network: mutual interference between data types, and failure to reflect the characteristics and topology of each network.

Active Publication Date: 2020-02-14
CENT SOUTH UNIV

AI Technical Summary

Problems solved by technology

However, directly integrating several different kinds of biological interaction information into a single network can cause mutual interference between data types, and the merged network may fail to reflect the characteristics and topology of each individual network.

Embodiment Construction

[0027] 1. Preprocessing of gene expression data

[0028] Read in a gene expression data file and normalize the gene expression data by Z-score:

[0029] $z = \frac{x - \mu}{\sigma}$

[0030] x is the raw expression value of a gene in a given sample; μ is the mean of the raw expression values of all genes in that sample; σ is the standard deviation of the raw expression values of all genes in that sample.
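The normalization described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function name `zscore_sample`, the toy values, and the use of the population standard deviation are all assumptions, since the text does not say whether the population or sample form is intended.

```python
from statistics import mean, pstdev

def zscore_sample(values):
    """Standardize one sample's expression values: z = (x - mu) / sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("all expression values are identical; sigma is zero")
    return [(x - mu) / sigma for x in values]

# Hypothetical toy sample: raw expression values of four genes in one sample.
sample = [2.0, 4.0, 4.0, 6.0]
z = zscore_sample(sample)
```

After standardization, the values of a sample have mean 0 and unit variance, which puts samples measured on different scales on a common footing before principal component analysis.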

[0031] 2. Principal component analysis of gene expression data

[0032] Starting from the standardized gene expression data, the first two principal components of the gene expression matrix are obtained through principal component analysis as follows:

[0033] 1) Find the covariance matrix of the features in the standardized gene expression data;

[0034] 2) Find the eigenvalues and corresponding eigenvectors of the covariance matrix;

[0035] 3) Sort the eigenvalues in descending order, select the two largest ones, and then use the two corresponding eigenvectors as colum...
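The three steps listed above could be sketched with NumPy as follows. The layout of the matrix (rows as samples, columns as features), the function name `first_two_components`, and the random toy data are assumptions for illustration. The final projection onto the two eigenvectors is also an assumption, because the source text is cut off at that point.

```python
import numpy as np

def first_two_components(X):
    """Project rows of X (samples x features) onto the top two principal axes."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = np.cov(Xc, rowvar=False)          # step 1: covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(cov)  # step 2: eigh is meant for symmetric matrices
    order = np.argsort(eigvals)[::-1][:2]   # step 3: indices of the two largest eigenvalues
    W = eigvecs[:, order]                   # the two eigenvectors as columns (features x 2)
    return Xc @ W, eigvals[order]           # assumed projection step, plus the eigenvalues

# Hypothetical toy matrix: 20 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
scores, top_vals = first_two_components(X)
```

Each column of `scores` has variance equal to its eigenvalue, which gives a quick sanity check. For real expression data with thousands of features the covariance matrix becomes large, so an SVD-based routine may be a more practical choice.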


Abstract

The invention discloses a biomarker identification method based on multiple networks that takes sample heterogeneity into account. First, the gene expression profile data are standardized, principal component analysis is performed on the samples, and the samples are clustered with a Gaussian mixture model using the first two principal components. Next, for each class of samples, a network propagation model built on multiple networks ranks all genes in the networks, giving a preliminary screen of important genes. Finally, to obtain biomarkers with maximum discriminative power and minimum redundancy, the genes selected in the previous step are further scored and ranked by an optimization model based on the area under the receiver operating characteristic curve (AUC). The method makes full use of multi-source biological network information and can effectively identify biomarkers with maximum discriminative power, minimum redundancy, and biological interpretability for the analysis of heterogeneous complex diseases.
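To make the AUC criterion concrete, the sketch below computes the AUC of a single gene as the probability that a randomly chosen sample from one class has a higher expression value than a randomly chosen sample from the other. This is only the standard pairwise definition of AUC, not the patent's optimization model, whose details are not given here. The function name `auc` and the toy values are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as one half, the usual Mann-Whitney convention.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one sample")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical expression values of one gene in disease and control samples.
disease = [2.1, 1.8, 2.5, 0.9]
control = [0.7, 1.0, 1.9, 0.4]
gene_auc = auc(disease, control)
```

A value near 1 (or near 0) means the gene separates the two classes well on its own, while a value near 0.5 means it carries little discriminative information.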

Description

Technical field

[0001] The invention relates to the field of bioinformatics, in particular to a multi-network-based biomarker identification method.

Background technique

[0002] Complex diseases have strong heterogeneity and are easily affected by environmental factors, which brings difficulties to the diagnosis and treatment of complex diseases. Therefore, the analysis of heterogeneous and complex diseases has become one of the focuses of modern medical research. Biomarkers are indicators for objectively measuring and evaluating normal biological processes, pathological processes, or drug intervention responses, and are also important early warning indicators when the body is damaged. Mining effective biomarkers from diverse biological data is the key to solving complex diseases.

[0003] With the in-depth study of systems biology and the rapid development of high-throughput technologies, a large number of biological interaction networks have been obtained, such as prote...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B25/10, G16B5/00, G06K9/62, G06N3/04, G06N3/08
CPC: G16B25/10, G16B5/00, G06N3/08, G06N3/045, G06F18/23, G06F18/2135, Y02A90/10
Inventor: 李幸一, 李敏, 项炬, 王建新
Owner: CENT SOUTH UNIV