A protein structure prediction method and device

A protein structure prediction method based on a multi-task time-domain convolutional neural network, applied in the field of protein structure prediction. It addresses the poor robustness and low accuracy of existing prediction methods, with the effect of improving the degree of fit, reducing model complexity, and improving generalization.

Active Publication Date: 2021-03-23
WUHAN GENECREATE BIOLOGICAL ENG CO LTD

AI Technical Summary

Problems solved by technology

[0006] To address the low accuracy and poor robustness of existing protein structure prediction, a first aspect of the present invention provides a protein structure prediction method based on a multi-task time-domain convolutional neural network, including the following steps: obtaining a target gene sequence and a protein database; establishing, according to the genetic code table and the protein database, a DNA-RNA-amino acid triple sequence data set corresponding to each protein; establishing a multiple regression equation from the residue depth and physicochemical properties of the amino acids that make up each protein in the protein database, to obtain the statistical depth features of each protein; clustering the triple sequence data set by gene homology information and evolutionary rate and mapping it to multidimensional feature vectors; using the multidimensional feature vectors and the statistical depth features of the proteins as the input of the multi-task time-domain convolutional neural network and training the network, stopping once its output error falls below a threshold and stabilizes, to obtain the trained network; inputting the target gene sequence into the trained network to obtain the target amino acid sequence and the statistical depth features of its corresponding protein; and predicting the protein structure from the amino acid sequence and the statistical depth features of its corresponding protein, using existing protein morphological features and the rolling-ball method.
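The DNA-RNA-amino acid triple described above can be illustrated with a minimal Python sketch. It assumes the input is a coding-strand DNA sequence in frame and uses the standard genetic code; the function name `build_triple` and the example sequence are hypothetical, and the patent does not specify how its data set is actually constructed.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def build_triple(dna: str) -> tuple[str, str, str]:
    """Return (DNA, mRNA, protein) for an in-frame coding-strand DNA sequence.

    Hypothetical helper: one record of a DNA-RNA-amino acid triple data set.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    rna = dna.replace("T", "U")  # coding strand -> mRNA
    residues = []
    for i in range(0, len(dna), 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # a stop codon ends translation
            break
        residues.append(aa)
    return dna, rna, "".join(residues)


# Illustrative input only: ATG GCC TGG TAA -> Met, Ala, Trp, stop.
dna, rna, protein = build_triple("ATGGCCTGGTAA")
print(dna, rna, protein)
```

Keeping all three sequences in one record lets later steps cluster on the nucleotide level while predicting on the amino acid level.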

Embodiment Construction

[0028] The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.

[0029] Referring to Figure 1 to Figure 3, a first aspect of the present invention provides a protein structure prediction method based on a multi-task time-domain convolutional neural network, comprising the following steps: S101. Obtaining the target gene sequence and a protein database; S102. Establishing, according to the genetic code table and the protein database, a DNA-RNA-amino acid triple sequence data set corresponding to each protein, and establishing a multiple regression equation from the residue depth and physicochemical properties of the amino acids that make up each protein in the protein database, to obtain the statistical depth features of each protein; S103. Clustering the triple...
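The multiple regression in step S102 can be sketched as an ordinary least-squares fit of residue depth against physicochemical descriptors. The sketch below is a minimal pure-Python illustration under stated assumptions: the two descriptor columns, the toy values, and the function name `fit_multiple_regression` are hypothetical, and the patent does not state which properties or fitting procedure it uses.

```python
def fit_multiple_regression(X, y):
    """Ordinary least squares with an intercept, solved via the normal equations.

    X: list of feature rows (one row per residue), y: list of residue depths.
    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Synthetic toy data (NOT measured values): two made-up descriptors per residue,
# with depth generated exactly as 1.0 + 2.0 * x1 - 0.5 * x2.
X = [(0.0, 1.0), (1.0, 0.0), (2.0, 3.0), (3.0, 1.0), (4.0, 2.0)]
y = [1.0 + 2.0 * a - 0.5 * b for a, b in X]

coef = fit_multiple_regression(X, y)
print([round(c, 6) for c in coef])
```

Because the toy depths are generated from a known linear rule, the fit should recover coefficients close to 1.0, 2.0, and -0.5; with real residue-depth data the fit would be approximate and a numerical library would be the more robust choice.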

Abstract

The present invention relates to a protein structure prediction method and device based on a multi-task time-domain convolutional neural network, the method comprising: acquiring a target gene sequence and a protein database; establishing a DNA-RNA-amino acid triple sequence data set corresponding to each protein according to a genetic code table and the protein database; establishing a multiple regression equation from the residue depth and physicochemical properties of the amino acids in the protein database to obtain the statistical depth features of each protein; clustering the triple sequence data set and mapping it to multidimensional feature vectors; using the multidimensional feature vectors and the statistical depth features of the proteins as the input of the multi-task time-domain convolutional neural network and training the network; and predicting the protein structure using the statistical depth features of the protein. The present invention combines statistical depth features of proteins with a multi-task time-domain convolutional neural network, which reduces model complexity and improves generalization and degree of fit.
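As a rough illustration of what a multi-task time-domain convolutional network computes, the sketch below assumes that "time-domain convolution" refers to the dilated causal 1D convolution used in temporal convolutional networks, and that "multi-task" means several output heads sharing one convolutional trunk. Both are assumptions, as are all function names, kernels, and the single-channel input; the patent's actual architecture is not given in this text.

```python
def causal_dilated_conv(x, kernel, dilation):
    """1D causal convolution: out[t] uses only x[t], x[t - d], x[t - 2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the sequence start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def relu(values):
    return [max(0.0, v) for v in values]


def multitask_forward(x, trunk, head_a, head_b):
    """Shared dilated trunk followed by two task-specific heads (hypothetical)."""
    h = x
    for kernel, dilation in trunk:
        h = relu(causal_dilated_conv(h, kernel, dilation))
    # Both heads read the same shared representation.
    return causal_dilated_conv(h, head_a, 1), causal_dilated_conv(h, head_b, 1)


# Toy single-channel input and hand-picked weights, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
print(causal_dilated_conv(x, [1.0, 1.0], 2))

out_a, out_b = multitask_forward(x, [([1.0, 1.0], 1)], [1.0], [0.5, 0.0])
print(out_a, out_b)
```

Sharing the trunk is what lets one model serve several prediction tasks with fewer parameters than separate networks, which is consistent with the reduced complexity the abstract claims. In practice the weights would be learned and the input would carry many feature channels.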

Description

Technical field

[0001] The invention relates to the field of bioinformatics and deep learning, and in particular to a protein structure prediction method and device based on a multi-task time-domain convolutional neural network.

Background

[0002] It is currently accepted in biology that the biological function of a protein is determined by its three-dimensional structure, that the three-dimensional structure of a protein is determined by its primary structure, and that proteins with similar functions also have similar structures.

[0003] Studies have found that although the primary structure of proteins varies enormously, that is, the amino acids in a polypeptide chain can be arranged and combined in many ways, the types of secondary structure are limited. They mainly include the α-helix, β-sheet, β-turn, and random coil, where the two protein secondary structures, the α-helix and the β-sheet, only depend on the backbone o...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B15/00, G16B40/00, G06N3/04
CPC: G16B15/00, G16B40/00, G06N3/045
Inventor: 华权高, 赵海义, 舒芹
Owner: WUHAN GENECREATE BIOLOGICAL ENG CO LTD