Protein-DNA binding residue prediction method based on sampling and integrated learning

A binding-residue prediction method applicable to proteomics, genomics, informatics and related fields. It addresses the low prediction accuracy and information loss of the final model by enriching feature sources, preventing overfitting and reducing information loss.

Status: Inactive; publication date: 2019-01-04

NANJING UNIV OF SCI & TECH

Cites: 3; cited by: 13


AI Technical Summary

Problems solved by technology

Random downsampling discards the negative samples it does not select, which easily causes information loss and lowers the prediction accuracy of the final model.
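To make the information loss concrete, here is a minimal sketch of plain random downsampling on residue-level labels. The function name `random_downsample`, the toy labels and the 1:1 class ratio are assumptions chosen for illustration; they are not taken from the patent.

```python
import random


def random_downsample(labels, seed=0):
    """Keep every positive and an equally sized random subset of negatives.

    Returns (kept_indices, discarded_negative_indices).
    The 1:1 ratio is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    kept_neg = rng.sample(neg, min(len(pos), len(neg)))
    kept_set = set(kept_neg)
    # Every negative not drawn is simply thrown away.
    discarded = [i for i in neg if i not in kept_set]
    return sorted(pos + kept_neg), discarded


# Hypothetical labels: 1 = binding residue, 0 = non-binding residue.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
kept, discarded = random_downsample(labels)
print(len(kept), "samples kept;", len(discarded), "negatives discarded")
# prints: 4 samples kept; 6 negatives discarded
```

In this toy case six of the eight negatives never reach the model, which is the loss the text describes.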



Embodiment Construction

[0015] The present invention will be further described below in conjunction with the accompanying drawings.

[0016] The accompanying drawing shows a schematic structural diagram of the prediction system of the present invention. As shown in the drawing, according to an embodiment of the present invention, a method for predicting protein-DNA binding residues based on sampling and integrated (ensemble) learning includes the following steps. First, given a set of protein sequences, the PSI-BLAST, PSIPRED, SANN and AAFD-BN algorithms are used to extract, respectively, the evolutionary information, predicted secondary structure information, predicted solvent accessibility information and amino acid frequency difference information of each protein sequence. On this basis, sliding-window and serial feature fusion techniques are combined to represent each amino acid residue in the sequence as a feature vector, and a training sample set is constructed with residues as the unit. S...
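The window-and-fusion step in paragraph [0016] can be sketched as follows. This is only an illustration under stated assumptions: the helper names `fuse_features` and `sliding_window`, the zero-padding at the sequence ends, the window size and the toy feature values are all hypothetical, and the real per-residue features would come from the tools named above rather than from hand-written lists.

```python
def fuse_features(*feature_sets):
    """Serial fusion: concatenate the per-residue vectors of each source."""
    n = len(feature_sets[0])
    if any(len(fs) != n for fs in feature_sets):
        raise ValueError("all feature sets must cover the same residues")
    return [sum((list(fs[i]) for fs in feature_sets), []) for i in range(n)]


def sliding_window(per_residue, window):
    """Describe each residue by the features of its surrounding window.

    Positions that fall outside the sequence are zero-padded
    (an assumption; the excerpt does not state the padding rule).
    """
    if window % 2 == 0:
        raise ValueError("window size must be odd")
    half = window // 2
    pad = [0.0] * len(per_residue[0])
    vectors = []
    for i in range(len(per_residue)):
        vec = []
        for j in range(i - half, i + half + 1):
            vec.extend(per_residue[j] if 0 <= j < len(per_residue) else pad)
        vectors.append(vec)
    return vectors


# Toy 3-residue sequence with two hypothetical feature sources.
evolution = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # stand-in for a profile
secondary = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # stand-in for 3 states
fused = fuse_features(evolution, secondary)        # 5 values per residue
samples = sliding_window(fused, window=3)          # 15 values per residue
print(len(samples), len(samples[0]))
```

Each entry of `samples` is one residue-level training vector; pairing it with a binding or non-binding label gives one training sample.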


Abstract

The invention discloses a protein-DNA binding residue prediction method based on sampling and integrated (ensemble) learning. The method comprises four steps: (1) feature extraction and training sample set construction, (2) sampling and model training, (3) model integration, and (4) online prediction. It addresses the low prediction precision that results from having few feature types and from class imbalance in protein-DNA binding residue prediction, and offers high prediction precision and strong generalization ability.
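Steps (2) to (4) of the abstract can be illustrated with a small sketch: several balanced subsets are sampled, one model is trained per subset, the models are combined, and a new residue is scored. The base learner here is a nearest-centroid scorer and the integration rule is a plain average; both are stand-ins chosen to keep the example self-contained, since the patent's own learner and integration rule are not given in this excerpt. All function names and data are hypothetical.

```python
import math
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train_centroid_model(X, y):
    """Stand-in base learner: store the positive and negative centroids."""
    pos = centroid([x for x, t in zip(X, y) if t == 1])
    neg = centroid([x for x, t in zip(X, y) if t == 0])
    return pos, neg


def score(model, x):
    """Score in [0, 1]; higher means closer to the positive centroid."""
    pos, neg = model
    d_pos, d_neg = math.dist(x, pos), math.dist(x, neg)
    return 0.5 if d_pos + d_neg == 0 else d_neg / (d_pos + d_neg)


def train_ensemble(X, y, n_models=5, seed=0):
    """Step (2): train one model per balanced, randomly sampled subset."""
    rng = random.Random(seed)
    pos_idx = [i for i, t in enumerate(y) if t == 1]
    neg_idx = [i for i, t in enumerate(y) if t == 0]
    models = []
    for _ in range(n_models):
        sub = pos_idx + rng.sample(neg_idx, min(len(pos_idx), len(neg_idx)))
        models.append(train_centroid_model([X[i] for i in sub],
                                           [y[i] for i in sub]))
    return models


def predict(models, x, threshold=0.5):
    """Steps (3) and (4): average the model scores, then threshold."""
    mean_score = sum(score(m, x) for m in models) / len(models)
    return mean_score, int(mean_score >= threshold)


# Toy imbalanced data: 2 binding residues, 6 non-binding residues.
X = [[1.0, 1.0], [0.9, 1.1], [0.0, 0.0], [0.1, -0.1],
     [-0.1, 0.1], [0.2, 0.0], [0.0, 0.2], [-0.2, -0.1]]
y = [1, 1, 0, 0, 0, 0, 0, 0]
models = train_ensemble(X, y)
print(predict(models, [0.95, 1.0])[1], predict(models, [0.0, 0.05])[1])
```

Because each model sees a different subset of negatives, the ensemble as a whole uses more of the negative class than any single downsampled model would.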

Description

Technical field

[0001] The invention relates to the bioinformatics prediction of protein-ligand binding residues, specifically a protein-DNA binding residue prediction method with high precision and strong generalization ability, based on a hyperplane-distance downsampling algorithm and an improved adaptive boosting (AdaBoost) algorithm.

Background

[0002] In cells, proteins often need to bind DNA molecules to take part in various life activities, such as DNA replication, DNA repair and viral infection. Accurate identification of protein-DNA binding residues facilitates the analysis of protein function and the design of new drugs. Traditionally, researchers have used biochemical methods such as EMSAs, Fast ChIP and X-ray crystallography to identify protein-DNA binding residues. However, such methods are time-consuming and expensive, and cannot meet the urgent needs of related research in the post-genomic era, where protein-DNA complexes are rapidly ...
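Paragraph [0001] names a downsampling algorithm based on hyperplane distance, but this excerpt does not describe its selection rule. The sketch below is therefore only one plausible reading, labeled as an assumption: negatives are ranked by their distance to a given separating hyperplane, split into distance bins, and sampled evenly from each bin so that the retained negatives cover near and far regions alike. The hyperplane parameters `w` and `b` are assumed to come from some previously fitted linear model, and the function names are hypothetical.

```python
import math
import random


def hyperplane_distance(x, w, b):
    """Euclidean distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def distance_stratified_downsample(negatives, w, b, n_bins, per_bin, seed=0):
    """Pick up to `per_bin` negatives from each distance-ordered bin.

    Assumed selection rule for illustration; not the patent's algorithm.
    Returns the sorted indices of the retained negatives.
    """
    if not negatives:
        return []
    rng = random.Random(seed)
    order = sorted(range(len(negatives)),
                   key=lambda i: hyperplane_distance(negatives[i], w, b))
    bin_size = math.ceil(len(order) / n_bins)
    chosen = []
    for start in range(0, len(order), bin_size):
        bin_idx = order[start:start + bin_size]
        chosen.extend(rng.sample(bin_idx, min(per_bin, len(bin_idx))))
    return sorted(chosen)


# Toy negatives at distances 0..11 from the hyperplane x0 = 0.
negatives = [[float(i), 0.0] for i in range(12)]
kept = distance_stratified_downsample(negatives, w=[1.0, 0.0], b=0.0,
                                      n_bins=3, per_bin=2)
print(len(kept), "of", len(negatives), "negatives retained")
# prints: 6 of 12 negatives retained
```

Compared with uniform random selection, a distance-aware rule of this kind controls which regions of the negative class are represented, which is one way a sampler could limit the information loss described earlier.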


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G16B20/00

Inventors: 於东军 (Yu Dongjun), 朱一亨 (Zhu Yiheng), 胡俊 (Hu Jun)

Owner: NANJING UNIV OF SCI & TECH
