Application of Machine Learning in Drug Discovery







Drug discovery is a highly complicated process. On average, it takes 6 to 12 years to develop a drug and release it on the market. Even after a large investment of money, time, and effort, the success of a drug after its release cannot be guaranteed. Recent advances in machine learning help to reduce this risk. This thesis analyzes applications of machine learning in biomedical science. Because experiments are far more convenient to carry out on a simpler organism, a machine learning model is proposed to predict the effect of chemical compounds on the aging of Caenorhabditis elegans, using the DrugAge database. This database provides features based on molecular descriptors and Gene Ontology terms. In this work, a new feature selection scheme is proposed for efficient classification using random forests. We explain the benefits of our feature selection method in comparison with baseline support vector machine and artificial neural network classifiers. The second application of machine learning presented in this work is the prediction of drug-target interactions using the Weisfeiler-Lehman Neural Machine. Predicting a possible interaction between a drug and a target enables biochemists to speed up target validation and discovery. A public-domain data set covering four different target protein types is used for the analysis. The algorithm extracts a subgraph from the network formed by the drugs and targets, which is then passed through graph labeling to produce an adjacency matrix. This matrix encodes the presence of interactions and is used to train a model. The proposed method outperformed standard state-of-the-art approaches, such as similarity-based methods, in terms of AUC.
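The subgraph, labeling, and adjacency-matrix steps described for drug-target interaction prediction can be illustrated with a minimal sketch. It is not the thesis implementation. The toy network, the function names (`enclosing_subgraph`, `hop_distances`, `encode_pair`), and the subgraph size `k` are hypothetical, and the ordering by summed hop distance is a simplified stand-in for Weisfeiler-Lehman color refinement.

```python
from collections import deque

def enclosing_subgraph(adj, drug, target, k):
    """Collect up to k nodes around the (drug, target) pair by breadth-first search."""
    order = [drug, target]
    seen = {drug, target}
    queue = deque(order)
    while queue and len(order) < k:
        node = queue.popleft()
        for nb in sorted(adj[node]):
            if nb not in seen and len(order) < k:
                seen.add(nb)
                order.append(nb)
                queue.append(nb)
    return order

def hop_distances(adj, source, nodes, hidden):
    """Hop counts from source inside the subgraph, ignoring the candidate link."""
    allowed = set(nodes)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if {node, nb} == set(hidden):
                continue  # hide the link whose presence is being predicted
            if nb in allowed and nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def encode_pair(adj, drug, target, k):
    """Order the subgraph nodes and return (ordering, adjacency matrix, features)."""
    nodes = enclosing_subgraph(adj, drug, target, k)
    d_drug = hop_distances(adj, drug, nodes, hidden=(drug, target))
    d_target = hop_distances(adj, target, nodes, hidden=(drug, target))
    far = k  # assumed stand-in distance for nodes unreachable from an endpoint
    # Simplified labeling (assumption): nodes closer to both endpoints come
    # first, with ties broken by node name.
    rest = sorted(
        nodes[2:],
        key=lambda n: (d_drug.get(n, far) + d_target.get(n, far), n),
    )
    ordered = [drug, target] + rest
    index = {n: i for i, n in enumerate(ordered)}
    matrix = [[0] * k for _ in range(k)]  # zero padding if fewer than k nodes
    for n in ordered:
        for nb in adj[n]:
            if nb in index:
                matrix[index[n]][index[nb]] = 1
    # Upper triangle without entry (0, 1), which is the link to be predicted.
    features = [
        matrix[i][j]
        for i in range(k)
        for j in range(i + 1, k)
        if (i, j) != (0, 1)
    ]
    return ordered, matrix, features

# Hypothetical toy network: drugs d1..d3, targets t1..t2, known interactions as edges.
adj = {
    "d1": {"t1"},
    "d2": {"t1", "t2"},
    "d3": {"t2"},
    "t1": {"d1", "d2"},
    "t2": {"d2", "d3"},
}
ordered, matrix, features = encode_pair(adj, "d1", "t2", k=4)
print(ordered)   # ['d1', 't2', 'd2', 't1']
print(features)  # [0, 1, 1, 0, 1]
```

In a full pipeline, one such feature vector would be built for each known interaction (positive example) and each sampled non-interacting pair (negative example), and a classifier would be trained on them.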



Aging, Caenorhabditis elegans, Drug development, Machine learning, Drug targeting, Biomedical engineering