Method and system for classification modeling based on protein length and DCNN
A classification modeling method and system based on protein length and a deep convolutional neural network (DCNN), applied to protein secondary structure prediction.
Examples
Embodiment 1
[0072] The classification modeling method based on protein length and DCNN of the present invention comprises the following steps:
[0073] Step 1: Obtain multiple data sets as the training set, where each data set includes multiple proteins. Extract the PSSM features generated by PSI-Blast for the data sets, and convert the format of the PSSM features by setting different sliding windows;
[0074] Step 2: Group the proteins in the training set based on the length of the protein to obtain multiple model groups;
[0075] Step 3: For each model group, construct a corresponding prediction model based on a deep convolutional network, and train that model on the proteins of the model group to obtain a trained prediction model.
[0076] The data set selected in Step 1 is a classic data set for protein secondary structure prediction. In this embodiment, the data set AstraCull, with 15666 protein entries synthesized from Astrall and CullPDB data, is used a...
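Steps 2 and 3 can be illustrated with a minimal sketch: proteins are bucketed by sequence length, and one model is trained per bucket. The length boundaries, the function names `group_by_length` and `train_per_group`, and the `build_and_train` callable are all hypothetical; the text above does not give the actual cut-offs or the network architecture, so the trainer is left as a placeholder to be supplied by the caller.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical length boundaries (assumption): the source does not state the
# cut-offs actually used to form the model groups.
LENGTH_BOUNDARIES = [100, 200, 400]


def group_by_length(proteins, boundaries=LENGTH_BOUNDARIES):
    """Bucket (protein_id, sequence) pairs by sequence length.

    Group 0 holds lengths below boundaries[0], group 1 holds lengths from
    boundaries[0] up to (but not including) boundaries[1], and so on.
    """
    groups = defaultdict(list)
    for protein_id, sequence in proteins:
        group_index = bisect_right(boundaries, len(sequence))
        groups[group_index].append((protein_id, sequence))
    return dict(groups)


def train_per_group(groups, build_and_train):
    """Train one model per length group.

    build_and_train is a caller-supplied placeholder that receives the
    members of one group and returns a trained model (for example a DCNN).
    """
    return {index: build_and_train(members) for index, members in groups.items()}


# Toy data: sequences of length 50, 150 and 450.
example_proteins = [("p1", "A" * 50), ("p2", "A" * 150), ("p3", "A" * 450)]
example_groups = group_by_length(example_proteins)
# Stand-in trainer (assumption) that only records the group size.
example_models = train_per_group(example_groups, lambda members: len(members))
```

Keeping the trainer as a callable leaves the grouping logic independent of any particular deep learning framework.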
Embodiment 2
[0112] The classification modeling system based on protein length and DCNN of the present invention includes an input module, a format conversion module, a grouping module and a model training module.
[0113] The input module is used to obtain multiple data sets as training sets, and each data set includes multiple proteins. The selected data set is a classic data set for protein secondary structure prediction. In this example, Astrall and CullPDB data were synthesized into a data set AstraCull with 15666 protein entries.
[0114] The format conversion module is used to extract the PSSM features generated by PSI-Blast for the data set, and to convert the format of the PSSM features through a sliding window. In the format conversion module, the 20-dimensional PSSM feature generated by PSI-Blast for the above data set is extracted, and after the format conversion of the PSSM feature through a sliding window of size 13, the feature of each amino acid in the training set is a...
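The sliding-window conversion can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the window is assumed to be centered on each residue, and positions past either end of the sequence are assumed to be filled with zero rows. The function name `sliding_windows` is hypothetical. Under these assumptions, a window of size 13 over a 20-column PSSM gives each residue a block of 13 rows by 20 columns.

```python
def sliding_windows(pssm, window=13):
    """Return one window of PSSM rows per residue.

    pssm is a list of rows, one per residue, each row a list of scores
    (20 values for a standard PSSM). Centering and zero padding at the
    sequence ends are assumptions, not details taken from the source.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    if not pssm:
        return []
    half = window // 2
    width = len(pssm[0])
    # Separate zero rows for the front and back padding.
    front = [[0.0] * width for _ in range(half)]
    back = [[0.0] * width for _ in range(half)]
    padded = front + [list(row) for row in pssm] + back
    # Window i is centered on residue i of the original sequence.
    return [padded[i:i + window] for i in range(len(pssm))]


# Toy PSSM: 5 residues, 20 columns, row i filled with the value i.
toy_pssm = [[float(i)] * 20 for i in range(5)]
toy_windows = sliding_windows(toy_pssm, window=13)
```

Each window can then be flattened or kept as a two-dimensional block, depending on the input shape the convolutional network expects.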