Molecular screening method based on a deep learning qualitative and quantitative model

A qualitative and quantitative model built with deep learning technology, applied at the intersection of deep learning with physics, chemistry, and materials, to achieve high-precision molecular prediction, overcome the limitations of traditional screening, and screen new molecules accurately.

Pending Publication Date: 2021-11-19
NANJING UNIV OF TECH
Cites: 4, Cited by: 2

AI Technical Summary

Problems solved by technology

The application of machine learning and deep learning to screening new molecules is limited to a certain extent.


Examples


Embodiment 1

[0054] A molecular screening method based on a deep learning qualitative and quantitative model; the implementation process is shown in Figure 1. It includes the following steps: construction and preprocessing of the molecular data set, feature engineering of the data set, construction and training of the qualitative and quantitative model, and deployment of the model for prediction and screening.

[0055] 1. The construction and preprocessing of molecular data sets, specifically including the following steps:

[0056] (1) Convert the molecular structural formula to SMILES. First obtain the molecules used for training the model and their structural formulas. Then, following the SMILES conversion rules developed by David Weininger in 1988, convert the structural formula of each molecule into SMILES. Here, additives in perovskite FAPbI3 light-emitting diodes are taken as the example, and the structural formula of each additive molecule is converted into SMILES.

[0057] (2) Transform SMILES into qualitative and quantitative descriptors, respectiv...

Embodiment 2

[0069] When constructing the qualitative and quantitative model, the FP2 fingerprint and 65 attributes are selected as the joint input, and a five-layer network is set in the model, as shown in Figure 4. For the specific construction and training of the qualitative and quantitative model, refer to the description in Embodiment 1.

[0070] In this case, the validation accuracy of the qualitative and quantitative model reaches 85.71%. When only the FP2 fingerprint is input into a DNN model, the validation accuracy reaches only 75.00%, so the qualitative and quantitative model improves accuracy by 10.71 percentage points. The accuracy comparison is shown in Figure 8.
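The joint input described in this embodiment can be sketched as a forward pass through a small fully connected network. This is a minimal illustration, not the patented implementation: the fingerprint length of 1024 bits, the hidden layer sizes, the ReLU activations, the random weights, and all function names are assumptions made for the example. Only the joint input of fingerprint bits plus 65 attributes, the five layers, and the Sigmoid output come from the text, and counting the five layers as four hidden layers plus one output layer is also an assumption.

```python
import math
import random

def sigmoid(x):
    # Numerically stable logistic function, used for the final output.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def relu(x):
    # Hidden-layer activation (an assumption; the text does not name one).
    return x if x > 0.0 else 0.0

def init_layer(n_in, n_out, rng):
    # Small random weights and zero biases for one dense layer.
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(inputs, layer, activation):
    # Apply one dense layer: activation(W x + b).
    weights, biases = layer
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def build_network(n_inputs, hidden_sizes, rng):
    # Hidden layers followed by a single-unit output layer.
    sizes = [n_inputs] + list(hidden_sizes) + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def predict(network, fingerprint_bits, attributes):
    # Joint input: concatenate qualitative bits and quantitative attributes.
    x = [float(b) for b in fingerprint_bits] + list(attributes)
    for layer in network[:-1]:
        x = dense(x, layer, relu)
    return dense(x, network[-1], sigmoid)[0]

rng = random.Random(0)
N_BITS, N_ATTRS = 1024, 65  # 1024 is an assumed fingerprint length
network = build_network(N_BITS + N_ATTRS, [64, 32, 16, 8], rng)
fingerprint = [rng.randint(0, 1) for _ in range(N_BITS)]  # placeholder bits
attributes = [rng.random() for _ in range(N_ATTRS)]       # placeholder values
score = predict(network, fingerprint, attributes)
print(f"predicted probability: {score:.4f}")
```

With untrained random weights the score carries no chemical meaning. The sketch only shows how the two descriptor types share one input vector and how the Sigmoid keeps the output between 0 and 1.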

Embodiment 3

[0072] When constructing the qualitative and quantitative model, the MACCS fingerprint and 65 attributes are selected as the joint input, and a five-layer network is set in the model, as shown in Figure 4. For the specific construction and training of the qualitative and quantitative model, refer to the description in Embodiment 1.

[0073] In this case, the validation accuracy of the qualitative and quantitative model reaches 85.71%. Compared with the validation accuracy of 71.43% obtained when only the MACCS fingerprint is input into a DNN model, the qualitative and quantitative model improves accuracy by 14.28 percentage points. The accuracy comparison is shown in Figure 9.
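The improvements quoted in Embodiments 2 and 3 are differences between two accuracies, that is, gains in percentage points rather than relative gains. The short check below reproduces that arithmetic from the reported figures. The function names are hypothetical, and the relative gain is included only to show how the two measures differ.

```python
def gain_in_points(joint_acc, baseline_acc):
    # Absolute difference between two accuracies, in percentage points.
    return round(joint_acc - baseline_acc, 2)

def relative_gain(joint_acc, baseline_acc):
    # Improvement expressed as a percentage of the baseline accuracy.
    return round(100.0 * (joint_acc - baseline_acc) / baseline_acc, 2)

# (joint-input accuracy, fingerprint-only DNN accuracy), as reported in the text.
results = {"FP2": (85.71, 75.00), "MACCS": (85.71, 71.43)}

for name, (joint, baseline) in results.items():
    print(name,
          "points:", gain_in_points(joint, baseline),
          "relative %:", relative_gain(joint, baseline))
```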


Abstract

The invention discloses a molecular screening method based on a deep learning qualitative and quantitative model (EMIM), which comprises the following steps: preprocessing, in which the collected molecules are converted into SMILES and the SMILES are then converted into qualitative and quantitative descriptors; feature engineering, in which hash mapping, a variance threshold, and Pearson's correlation coefficient are applied to the qualitative and quantitative descriptors; setting up the qualitative and quantitative model according to the input data produced by feature engineering, using a Sigmoid function as the final output, optimizing the parameters of the model through a back-propagation optimization algorithm, evaluating its performance, and training iteratively to obtain a model with high validation accuracy; and subjecting a molecule to be predicted to the same preprocessing and feature engineering, then inputting it into the qualitative and quantitative model (EMIM) with the highest validation accuracy for prediction and screening, to obtain the prediction result for that molecule. The method screens new molecules more efficiently and accurately and overcomes the limitations of traditional screening.
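Two of the feature engineering steps named in the abstract, the variance threshold and Pearson's correlation coefficient, can be sketched in plain Python as follows. The abstract does not say how the steps are configured, so this sketch assumes that near-constant columns are dropped first and that, among the remaining columns, any column strongly correlated with an already selected one is discarded. The thresholds, the function names, and the toy data are illustrative assumptions.

```python
import math

def variance(column):
    # Population variance of one descriptor column.
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)

def pearson(xs, ys):
    # Pearson's correlation coefficient; returns 0.0 for constant input.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)

def select_features(rows, var_threshold=0.0, corr_threshold=0.95):
    # rows: one list of descriptor values per molecule.
    columns = list(zip(*rows))
    # Step 1: variance threshold removes columns that barely change.
    kept = [i for i, col in enumerate(columns)
            if variance(col) > var_threshold]
    # Step 2: drop columns highly correlated with one already selected.
    selected = []
    for i in kept:
        if all(abs(pearson(columns[i], columns[j])) < corr_threshold
               for j in selected):
            selected.append(i)
    return selected

# Toy data: column 0 is constant, column 2 is exactly twice column 1.
rows = [
    [1.0, 2.0, 4.0, 0.3],
    [1.0, 4.0, 8.0, 0.9],
    [1.0, 6.0, 12.0, 0.1],
    [1.0, 8.0, 16.0, 0.7],
]
print(select_features(rows))
```

On the toy data the constant column and the redundant column are both removed, leaving the indices of the two informative columns.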

Description

Technical field

[0001] The present invention relates to the intersection fields of deep learning and physics, chemistry, and materials, and in particular relates to a qualitative and quantitative model built using a deep learning neural network framework, which can be used for molecular screening.

Background technique

[0002] Molecular descriptors are the characterization of molecular structure sub-fragments or physical and chemical properties, and can be divided into qualitative descriptors and quantitative descriptors. Qualitative descriptors generally refer to molecular fingerprints, which can convert chemical molecules into bit strings containing only 0 and 1. Quantitative descriptors describe molecules based on molecular composition (number of hydrogen bond donors, number of benzene rings), physical and chemical properties (topological polar surface area, octanol-water partition coefficient) and experimental data information (ultraviolet spectrum, solvent ratio), etc. ...
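To make the idea of a qualitative descriptor concrete, the toy function below hashes short fragments of a SMILES string into a fixed-length bit string of 0s and 1s. It is deliberately simplified and is not the FP2 or MACCS algorithm: real fingerprints work on the molecular graph, whereas this sketch treats the SMILES as plain text. The function name, the 64-bit length, and the fragment length limit are all assumptions chosen for illustration.

```python
import hashlib

def toy_fingerprint(smiles, n_bits=64, max_len=3):
    # Start with all bits off, then switch on one bit per text fragment.
    bits = [0] * n_bits
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            # Hash mapping: a stable hash of the fragment picks the bit index.
            digest = hashlib.sha256(fragment.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % n_bits
            bits[index] = 1
    return bits

ethanol = toy_fingerprint("CCO")    # SMILES for ethanol
propanol = toy_fingerprint("CCCO")  # SMILES for 1-propanol

print("".join(str(b) for b in ethanol))
print("".join(str(b) for b in propanol))
```

Because every text fragment of "CCO" also occurs in "CCCO", each bit set for ethanol is also set for 1-propanol, which shows how shared substructures lead to overlapping bit patterns.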


Application Information

IPC(8): G16C10/00, G16C20/70, G16C20/90, G06N3/04, G06N3/08
CPC: G16C10/00, G16C20/70, G16C20/90, G06N3/084, G06N3/045
Inventor: 王建浦, 朱琳, 章亮
Owner: NANJING UNIV OF TECH