A stochastic algorithm has been developed for predicting the drug-likeness of molecules. It is based on optimizing the ranges of a set of descriptors. Lipinski's “rule-of-5”, which uses molecular weight, logP, and the numbers of hydrogen bond donor and acceptor groups to assess bioavailability, was previously unable to distinguish between drugs and non-drugs with its original set of ranges. The present invention demonstrates the
predictive power of the stochastic approach to differentiate between drugs and non-drugs using the same four Lipinski descriptors with modified ranges. However, better sets of four descriptors exist: the stochastic algorithm yielded many other descriptor sets with greater power to discriminate between the two databases (drugs and non-drugs). A set of optimized ranges constitutes a “filter”. In addition to the “best” filter, additional filters (composed of different sets of descriptors) are used that allow a new definition of “
drug-like” character by combining them into a “drug-like index” (DLI). In addition to producing a DLI, which permits discrimination between populations of drug-like and non-drug-like molecules, the present invention may be combined with other known drug screening or optimization methods, including but not limited to high-throughput screening, combinatorial chemistry, scaffold prioritization, and docking.
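
The ideas of a filter, stochastic range optimization, and a DLI can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the method of the invention: the descriptor keys (`mw`, `logp`, `hbd`, `hba`), the objective (pass rate on drugs minus pass rate on non-drugs), the simple random hill-climbing search, and the definition of the DLI as the fraction of filters a molecule passes. Descriptor values are assumed to be precomputed for each molecule.

```python
import random

# Hypothetical descriptor keys. The starting ranges encode the usual
# rule-of-5 cutoffs as (low, high) intervals; lower bounds are placeholders.
LIPINSKI = {
    "mw": (0.0, 500.0),
    "logp": (-10.0, 5.0),
    "hbd": (0.0, 5.0),
    "hba": (0.0, 10.0),
}


def passes(mol, filt):
    """A molecule passes a filter if every descriptor lies within its range."""
    return all(lo <= mol[name] <= hi for name, (lo, hi) in filt.items())


def score(filt, drugs, nondrugs):
    """Assumed objective: pass rate on drugs minus pass rate on non-drugs."""
    drug_rate = sum(passes(m, filt) for m in drugs) / len(drugs)
    nondrug_rate = sum(passes(m, filt) for m in nondrugs) / len(nondrugs)
    return drug_rate - nondrug_rate


def optimize(filt, drugs, nondrugs, steps=2000, scale=0.1, seed=0):
    """Random hill climbing over range bounds (one possible stochastic search)."""
    rng = random.Random(seed)
    best = dict(filt)
    best_score = score(best, drugs, nondrugs)
    names = list(best)
    for _ in range(steps):
        name = rng.choice(names)
        lo, hi = best[name]
        width = (hi - lo) or 1.0
        # Perturb both bounds of one randomly chosen descriptor.
        new_lo = lo + rng.gauss(0.0, scale * width)
        new_hi = hi + rng.gauss(0.0, scale * width)
        if new_lo > new_hi:
            continue  # skip invalid (empty) ranges
        candidate = dict(best)
        candidate[name] = (new_lo, new_hi)
        candidate_score = score(candidate, drugs, nondrugs)
        if candidate_score > best_score:  # keep only strict improvements
            best, best_score = candidate, candidate_score
    return best, best_score


def drug_like_index(mol, filters):
    """Assumed DLI: fraction of filters the molecule passes (0.0 to 1.0)."""
    return sum(passes(mol, f) for f in filters) / len(filters)
```

Under these assumptions, `optimize(LIPINSKI, drugs, nondrugs)` returns a filter whose score on the two lists is never lower than that of the starting ranges, and `drug_like_index(mol, filters)` combines several such filters into a single value per molecule.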