Method for screening massive ligand molecule features in drug design

A ligand molecule feature screening technology, applied in molecular design, calculation, chemical statistics, and related fields, that addresses the high time cost of feature screening, makes the results easier to interpret, and improves learning efficiency.

Active Publication Date: 2017-05-31
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Because the feature dimension of ligand molecules is often very large, the traditional LASSO method incurs a high time cost and struggles to handle this problem well.




Embodiment Construction

[0018] The present invention will be further described in detail below in conjunction with the accompanying drawings.

[0019] Figure 1 is a framework diagram of the system of the present invention. Based on this framework, the present invention provides a LASSO-based method, using the EDPP criterion, for screening massive ligand features. The method is implemented in the following steps:

[0020] Step 1: ECFP feature generation for ligand molecules. An initial dataset is given in which each sample consists of the atomic connectivity graph of a molecule and its label $Y_i$. The initial dataset is processed to obtain the ECFP features describing each sample, yielding the dataset $D_t = \{(X_i, Y_i) \mid X_i \in \mathbb{R}^{1 \times m},\ 1 \le i \le n\}$.
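The idea behind Step 1 can be illustrated with a deliberately simplified, ECFP-style circular fingerprint. This is a sketch under stated assumptions, not the ECFP algorithm used by the invention: the function name `ecfp_like_fingerprint`, the atom invariants (element symbol and degree only), the SHA-1 based hashing, and the folding to `n_bits` are all hypothetical choices made for illustration. Real ECFP also uses bond orders, charges, and duplicate-substructure removal.

```python
import hashlib


def _hash_to_int(value):
    """Deterministic 32-bit hash of any repr-able value."""
    digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def ecfp_like_fingerprint(atoms, bonds, radius=2, n_bits=1024):
    """Toy circular fingerprint of an atom connectivity graph.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs
    Returns a list of n_bits zeros and ones (one row X_i of the dataset).
    """
    neighbors = [[] for _ in atoms]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Iteration 0: one identifier per atom from simple invariants.
    ids = [_hash_to_int((sym, len(neighbors[k]))) for k, sym in enumerate(atoms)]
    features = set(ids)

    # Each iteration widens the described neighborhood by one bond.
    for iteration in range(1, radius + 1):
        new_ids = []
        for k in range(len(atoms)):
            env = (iteration, ids[k], tuple(sorted(ids[j] for j in neighbors[k])))
            new_ids.append(_hash_to_int(env))
        ids = new_ids
        features.update(ids)

    # Fold the substructure identifiers into a fixed-length bit vector.
    bits = [0] * n_bits
    for feature in features:
        bits[feature % n_bits] = 1
    return bits


# Heavy-atom graph of ethanol: C-C-O
ethanol = ecfp_like_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
```

Each set bit stands for a hashed substructure, which is why the feature dimension $m$ grows quickly when the fingerprint is left unfolded over a large set of molecules.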

[0021] Step 2: Ligand molecular feature screening with the EDPP-based LASSO method. For the dataset $D_t$, the EDPP criterion is applied over a decreasing sequence of regularization parameters $\{\lambda_i \mid 0 \le i,\ \lambda_i > \lambda_{i+1}\}$ with $\lambda \in (0, \lambda_0]$, to get the feature screening result o...
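A minimal sketch of an EDPP-style screening test is given here for orientation. It assumes the LASSO objective $\tfrac{1}{2}\lVert y - X\beta\rVert_2^2 + \lambda\lVert\beta\rVert_1$ and implements only the basic rule that starts from $\lambda_{\max} = \lVert X^\top y\rVert_\infty$, not the sequential variant over a sequence of $\lambda_i$ described in Step 2. The function name `edpp_screen` and the toy data are hypothetical.

```python
import numpy as np


def edpp_screen(X, y, lam):
    """Boolean mask of features kept by the basic EDPP rule at parameter lam.

    Features marked False are guaranteed to have a zero LASSO coefficient
    for the objective 0.5 * ||y - X b||^2 + lam * ||b||_1.
    """
    corr = X.T @ y
    lam_max = np.max(np.abs(corr))
    if lam >= lam_max:
        # The LASSO solution is all zeros, so every feature can be dropped.
        return np.zeros(X.shape[1], dtype=bool)

    star = int(np.argmax(np.abs(corr)))       # feature attaining lam_max
    theta0 = y / lam_max                      # dual optimum at lam_max
    v1 = np.sign(corr[star]) * X[:, star]
    v2 = y / lam - theta0
    # Component of v2 orthogonal to v1.
    v2_perp = v2 - (v1 @ v2) / (v1 @ v1) * v1

    center = theta0 + 0.5 * v2_perp
    radius = 0.5 * np.linalg.norm(v2_perp)
    col_norms = np.linalg.norm(X, axis=0)
    discard = np.abs(X.T @ center) < 1.0 - radius * col_norms
    return ~discard


# Toy orthonormal design, where the LASSO solution is soft-thresholding of y.
X = np.eye(6)
y = np.array([5.0, 3.0, 1.0, 0.5, -4.0, 0.1])
keep = edpp_screen(X, y, lam=2.0)
```

In this toy case the true active set at `lam=2.0` is features 0, 1, and 4 (those with $|y_i| > \lambda$), and all of them survive screening. The rule is safe rather than exact, so it may also keep some inactive features; only the retained columns need to be passed to the LASSO solver.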


Abstract

The invention discloses a method for screening massive ligand molecule features in drug design. In ligand-based virtual screening of drug molecules, the fingerprint features generated by ECFP, currently the most popular method, can reach tens of thousands of dimensions because the number of ligand molecules is large (each dimension represents a substructure), so the method runs into the "curse of dimensionality" in practical tasks. The invention screens the massive ECFP molecular fingerprint features using a LASSO method based on the EDPP rule and obtains the relevant features of the ligand molecules with a robust selection method. Because ligand activity is often related to only a few substructures, the method can quickly remove a large share of irrelevant features, robustly select the relevant ones, resolve the curse of dimensionality, identify the substructures related to ligand activity, and promote wider application of ECFP in drug design.

Description

Technical field

[0001] The invention relates to a machine-learning-based method for screening ligand molecular features, and belongs to the technical field of computer-aided drug design.

Background

[0002] In recent years, improving the effectiveness of virtual drug screening has become an urgent problem for pharmaceutical companies. Because a large number of biochemical experiments provide sufficient data, machine learning methods are well placed to use these data to help solve this problem.

[0003] Virtual drug screening can be divided into two types: target-structure-based methods and ligand-based methods. Target-structure-based virtual screening simulates the physical interaction between a compound and the target to determine whether a drug effect is possible; molecular docking is one example. Ligand-based methods mainly use existing data to predict the activity of compounds when the target structure is unknown. The key to this type of method is...


Application Information

IPC(8): G06F19/00
CPC: G16C20/50, G16C20/70
Inventors: 吴建盛, 张邱鸣, 胡海峰
Owner: NANJING UNIV OF POSTS & TELECOMM