Method for predicting secondary structure of protein based on multiple evolution matrices

A secondary structure prediction method applied in the fields of bioinformatics and traditional protein sequence analysis. It addresses problems such as difficult classifier parameter selection, poor reliability, and the lack of effective solutions, and it achieves a simple and effective encoding method, good classification results, and improved prediction accuracy.

Active Publication Date: 2017-07-14
QILU UNIV OF TECH
3 Cites, 20 Cited by

AI Technical Summary

Problems solved by technology

[0004] Orthogonal encoding uses a 20-bit binary number to uniquely represent each amino acid, such that the inner product of the encoding vectors of any two different amino acids is 0. Although this encoding is simple, it carries little biological information, so the accuracy of protein secondary structure prediction is low. The codon encoding method "reduces" each amino acid to a combination of 3 bases and represents each base by a binary number, thereby enabling structure prediction. The relative probability of an amino acid type appearing at a given position can carry biological evolution information to a certain extent.
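As an illustration of the orthogonal encoding described in [0004], here is a minimal Python sketch. The residue ordering in `AMINO_ACIDS` and the helper names `orthogonal_encode` and `dot` are assumptions made for this example; the source does not specify them.

```python
# The 20 standard amino acids; this ordering is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def orthogonal_encode(sequence):
    """Encode each residue as a 20-element 0/1 vector with a single 1."""
    vectors = []
    for aa in sequence:
        vec = [0] * len(AMINO_ACIDS)
        vec[AMINO_ACIDS.index(aa)] = 1
        vectors.append(vec)
    return vectors


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


codes = orthogonal_encode("ACA")
print(dot(codes[0], codes[1]))  # A vs. C (different residues): 0
print(dot(codes[0], codes[2]))  # A vs. A (same residue): 1
```

Every vector has exactly one non-zero entry, so the encoding distinguishes residues but says nothing about how similar two residues are, which is the lack of biological information noted above.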
[0005] At present, traditional protein structure prediction methods generally consider only the proportions of the various amino acids in the protein sequence. This approach is relatively simple, but it has shortcomings: it takes into account neither the position of amino acids in the protein nor the acceptable point mutations of amino acids that occur during protein evolution, and so it lacks any representation of biological evolution information.
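The limitation described in [0005] can be made concrete with a small sketch: two sequences with different residue orders can have exactly the same amino acid composition. The sequences `seq_a` and `seq_b` and the function name `amino_acid_composition` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def amino_acid_composition(sequence):
    """Return the fraction of each residue type present in the sequence."""
    counts = Counter(sequence)
    return {aa: counts[aa] / len(sequence) for aa in sorted(counts)}


# Hypothetical toy sequences: same residues, different order.
seq_a = "AAGGLL"
seq_b = "LGALGA"

same_composition = amino_acid_composition(seq_a) == amino_acid_composition(seq_b)
print(same_composition)  # True: composition alone cannot tell them apart
```

A composition-only feature therefore discards the positional information that secondary structure depends on.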
[0006] In summary, the prior art considers only amino acid composition when predicting the secondary structure of amino acid residues in protein sequences, and it cannot fully account for the position of amino acids in the protein or the acceptable point mutations of amino acids that occur during protein evolution. In addition, effective solutions are still lacking for classifier problems such as difficult parameter selection and poor reliability.



Examples


Embodiment Construction

[0043] It should be pointed out that the following detailed description is exemplary and intended to provide further explanation to the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

[0044] It should be noted that the terminology used here is only for describing specific implementations and is not intended to limit the exemplary implementations according to the present application. As used herein, unless the context clearly dictates otherwise, the singular is intended to include the plural. It should also be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of features, steps, operations, means, components and/or combinations thereof.

[0045] As introduced in the background technology, in the prior art, only amino acid composition is considered w...


Abstract

The present invention discloses a method for predicting the secondary structure of a protein based on multiple evolution matrices. The method comprises: downloading the protein NR database and the local BLAST software package to generate a position-specific scoring matrix (PSSM) for a given protein sequence; adjusting the parameters of the PSI-BLAST program to obtain evolution matrices of different divergence degrees for the protein sequence; processing all feature vectors in the evolution matrices to form multiple-evolution-matrix features; using these features as the input of a classifier and evaluating classification accuracy to obtain an optimized model; and, for a protein of unknown structure, feeding it to the optimized model to predict its secondary structure. Because the method represents a protein sequence with multiple matrices of different evolutionary divergence degrees at the same time, the protein structure information is expressed more fully and the possibility of residue replacement is considered more comprehensively. As a result, the accuracy of protein secondary structure prediction is improved, and the encoding method is simple and effective.
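One way to picture how several evolution matrices could be combined into a single classifier input is sketched below. This is a minimal sketch under stated assumptions, not the patented procedure: the sliding-window scheme, the zero padding at the sequence ends, the function name `window_features`, and the toy matrices are all illustrative choices. Real PSSMs have 20 columns per residue; the toy matrices use 2 columns to stay readable.

```python
def window_features(pssms, center, half_window=1):
    """Concatenate the rows around `center` from every matrix.

    Positions that fall outside the sequence are padded with zeros
    (an assumption of this sketch).
    """
    length = len(pssms[0])
    width = len(pssms[0][0])
    features = []
    for pssm in pssms:
        for pos in range(center - half_window, center + half_window + 1):
            if 0 <= pos < length:
                features.extend(pssm[pos])
            else:
                features.extend([0] * width)
    return features


# Hypothetical toy matrices for a 3-residue sequence, standing in for
# PSSMs generated under two different PSI-BLAST parameter settings.
pssm_a = [[1, 2], [3, 4], [5, 6]]
pssm_b = [[-1, 0], [2, -2], [0, 1]]

features = window_features([pssm_a, pssm_b], center=0, half_window=1)
print(features)  # [0, 0, 1, 2, 3, 4, 0, 0, -1, 0, 2, -2]
```

Each residue then yields one feature vector whose length is (number of matrices) × (window size) × (columns per row), and those vectors would serve as the classifier input.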

Description

Technical field

[0001] The invention relates to the technical fields of bioinformatics and traditional protein sequence analysis, and in particular to a protein secondary structure prediction method based on multiple evolution matrices.

Background

[0002] Proteins are the main carriers of life activities in organisms and the basis of all life activities. Their physiological functions are reflected not only in their amino acid composition but also in their spatial structure. Predicting protein structure is therefore an important task in the field of bioinformatics. Since protein secondary structure is the link between primary structure and tertiary structure, predicting it is also a key step in predicting tertiary structure from primary structure. When the accuracy of protein secondary structure prediction reaches 80%, the three-dimensional spatial structure of a protein molecule can be accurately predicted. It can be seen that the prediction of protein secondary s...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F19/22
CPC: G16B30/00
Inventor: 鹿文鹏, 杜月寒, 刘毅慧, 成金勇, 孟凡擎
Owner: QILU UNIV OF TECH