Key protein recognition method based on tensor random walking

A tensor random walk technology and identification method, applied in the field of systems biology, that addresses the poor prediction performance of existing key protein identification methods and achieves better prediction performance.

Active Publication Date: 2019-04-16
CHANGSHA UNIVERSITY

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a key protein identification method based on tensor random walk that overcomes the poor key protein prediction performance of the prior art.



Examples


Embodiment 1

[0047] Referring to Figure 1, the present invention first provides a key protein identification method based on tensor random walk, comprising the following steps:

[0048] S1: Obtain protein interaction network topology, protein domain information, time-series-based gene expression information, and protein homology information.

[0049] These data all come from public databases on the Internet. Protein interaction networks derived from Saccharomyces cerevisiae (baker's yeast) have been well characterized by gene knockout experiments and have been widely used for the assessment of key proteins. The protein domain data were downloaded from the Pfam database and contain 1,107 distinct domains involving 3,056 proteins in the PPI network. The gene expression data contain samples of 6,776 gene products (proteins) at 36 different time points.

[0050] S2: According to the protein interaction network topology, protein domain information and gene expression inf...

Embodiment 2

[0088] To verify the effectiveness of the key protein identification method proposed in the present invention, we ran this method and ten other current key protein identification methods on the yeast protein interaction network. The protein interaction network used for the experiments is derived from the DIP database and consists of 5,023 proteins and 22,570 edges. Self-interactions and repeated interactions have been removed from the network. The yeast gene expression data contain samples of 6,776 gene products (proteins) at 36 different time points. Of the 6,776 proteins, 4,902 are included in the DIP dataset. Figure 3 compares the accuracy of the method proposed by the present invention with that of ten other key protein prediction methods (DC, IC, BC, CC, SC, NC, CoEWC, Pec, POEM, ION) when predicting the top 100, 200, 300, 400, 500, and 600 key proteins (i.e., n = 100, 200, 300, 400, 500, 600).
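The network cleaning and the top-n comparison described above can be illustrated with a minimal Python sketch. It assumes that accuracy is measured as the number of known key proteins among the n highest-scoring proteins; the function names `clean_edges` and `top_n_hits` and the toy identifiers are hypothetical and do not come from the patent or from the DIP data.

```python
from typing import Dict, Iterable, Set, Tuple


def clean_edges(edges: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Drop self-interactions and repeated (undirected) interactions."""
    cleaned = set()
    for u, v in edges:
        if u == v:
            continue  # self-interaction
        # Canonical ordering makes (A, B) and (B, A) the same edge.
        cleaned.add((u, v) if u < v else (v, u))
    return cleaned


def top_n_hits(scores: Dict[str, float], known_key: Set[str], n: int) -> int:
    """Count known key proteins among the n highest-scoring proteins."""
    # Sort by descending score; ties are broken by name for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return sum(1 for p in ranked[:n] if p in known_key)


# Toy data (hypothetical identifiers, not taken from the DIP network).
edges = [("A", "B"), ("B", "A"), ("C", "C"), ("B", "C"), ("A", "B")]
scores = {"A": 0.9, "B": 0.7, "C": 0.4, "D": 0.1}
known_key = {"A", "C"}

network = clean_edges(edges)
hits = {n: top_n_hits(scores, known_key, n) for n in (1, 2, 3)}
print(sorted(network))
print(hits)
```

On the real data the same count would be taken at n = 100, 200, 300, 400, 500, 600 for each of the compared methods.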


Abstract

The invention discloses a key protein identification method based on tensor random walk. The method comprises the following steps: obtaining the protein interaction network topology, protein domain information, time-series gene expression information, and protein homology information; constructing, from this information, the correlation relationships between different protein nodes in the protein interaction network; initializing the hub scores of the protein nodes according to the protein homology information; establishing a tensor model according to the correlation relationships between the protein nodes; and sorting the hub scores obtained for each protein node by iterative computation on the tensor model, taking the top n protein nodes as key proteins. The method is simple and effective, and tests on multiple data sets show that it achieves better prediction performance for key protein identification than other methods.
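The steps in the abstract can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the patented update rule, which does not appear in this excerpt: the tensor is taken to be a stack of node-by-node relation matrices (for example topology, shared domains, co-expression), the iteration is a random walk with restart to the initial hub scores, and the relations are averaged with equal weight. The names `tensor_random_walk`, `top_n`, and `alpha`, and the toy values, are all hypothetical.

```python
import numpy as np


def tensor_random_walk(tensor, init_scores, alpha=0.5, tol=1e-8, max_iter=1000):
    """Iterate hub scores over a (relation x node x node) tensor.

    Assumed update (not taken from the patent text):
        x <- alpha * mean_k(P_k^T x) + (1 - alpha) * x0
    where P_k is the row-normalised slice for relation k.
    """
    tensor = np.asarray(tensor, dtype=float)
    x0 = np.asarray(init_scores, dtype=float)
    x0 = x0 / x0.sum()  # initial hub scores as a probability vector

    # Row-normalise every relation slice; rows without links stay all-zero.
    row_sums = tensor.sum(axis=2, keepdims=True)
    trans = np.divide(tensor, row_sums,
                      out=np.zeros_like(tensor), where=row_sums > 0)

    x = x0.copy()
    for _ in range(max_iter):
        # One walk step per relation, then an equal-weight average.
        walked = np.einsum("kij,i->kj", trans, x).mean(axis=0)
        x_new = alpha * walked + (1 - alpha) * x0
        converged = np.abs(x_new - x).sum() < tol
        x = x_new
        if converged:
            break
    return x


def top_n(scores, names, n):
    """Return the names of the n highest-scoring nodes."""
    order = np.argsort(-scores, kind="stable")
    return [names[i] for i in order[:n]]


# Toy example: 3 relation slices over 4 proteins (hypothetical values).
names = ["P1", "P2", "P3", "P4"]
topology = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
domain = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
coexpr = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
homology = [3, 2, 1, 1]  # stands in for homology-based initial hub scores

hub = tensor_random_walk([topology, domain, coexpr], homology)
print(top_n(hub, names, 2))
```

Because `alpha` is below 1 and each normalised slice has row sums of at most 1, this particular iteration is a contraction and converges; a different weighting of the relations would change the ranking.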

Description

Technical field

[0001] The invention relates to the field of systems biology, in particular to a key protein identification method based on tensor random walk.

Background technique

[0002] Protein is an essential component of all cell and tissue structures, and the most important material basis for life activities. However, the importance of different proteins to life activities is not the same. Usually those proteins that cause the loss of function of the protein complex after being deleted and cause the organism to fail to survive or develop are called key proteins. Key proteins are not only necessary for the survival and reproduction of organisms, but also play an important role in life activities. Therefore, the identification of key proteins helps to understand the internal organization and process of life activities at the system level. At the same time, a large number of studies have shown that key proteins (genes) are often disease-causing genes. It can be seen ...


Application Information

IPC(8): G16B5/00, G16B50/20, G16B25/10
Inventor: 赵碧海, 胡赛, 王雷, 李学勇, 张帆, 田清龙
Owner: CHANGSHA UNIVERSITY