A Novel ML Approach to Detect Possible Protein Allosteric Sites for Drug Discovery

Authors

Susarla, S.

DOI:

https://doi.org/10.47611/jsrhs.v13i3.6909

Keywords:

Enzyme, Protein, Allosteric Site, Active Site, Active, Allosteric, Inhibition, Competitive, Noncompetitive, Random Forest, Neural Network, Machine Learning, Artificial Intelligence

Abstract

In recent years, allosteric sites on proteins, located away from the primary active binding site, have gained prominence as promising drug targets because of the significant control they provide over biological pathways. Identifying allosteric sites through methods such as X-ray crystallography, in which the structure of the protein and the structures of potential allosteric-site binding molecules are analyzed, is often time-consuming. Machine learning models have therefore been applied to expedite allosteric drug discovery by helping to identify allosteric sites. In this study, I evaluated several machine learning models and found that the best performers were a random forest classifier and a 5-layer neural network. The random forest had an accuracy of 98.89%, a precision of 98.20%, a recall of 99.69%, an F1 score of 98.94%, and an AUC of 0.9991. The neural network had an accuracy of 97.70%, a precision of 99.15%, a recall of 96.04%, an F1 score of 97.57%, and an AUC of 0.9939. The high accuracy of these models, along with their computational efficiency, could potentially help the pharmaceutical industry identify allosteric sites, which in turn would support the allosteric drug manufacturing process.
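The F1 score is the harmonic mean of precision and recall, so the figures reported in the abstract can be cross-checked against one another. The short sketch below does that check. It only uses the percentages quoted above; the helper name `f1_from_precision_recall` and the dictionary layout are illustrative choices, not part of the study's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Metrics quoted in the abstract, converted from percentages to fractions.
# The dictionary keys are hypothetical labels chosen for this illustration.
reported = {
    "random_forest": {"precision": 0.9820, "recall": 0.9969, "f1": 0.9894},
    "neural_network": {"precision": 0.9915, "recall": 0.9604, "f1": 0.9757},
}

# Recompute F1 for each model from its reported precision and recall.
recomputed = {
    name: f1_from_precision_recall(m["precision"], m["recall"])
    for name, m in reported.items()
}

for name, m in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.4f}, reported F1 = {m['f1']:.4f}")
```

For both models the recomputed value agrees with the reported F1 score to the precision given in the abstract, which suggests the precision, recall, and F1 figures are internally consistent.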

Published

08-31-2024

How to Cite

Susarla, S. (2024). A Novel ML Approach to Detect Possible Protein Allosteric Sites for Drug Discovery. Journal of Student Research, 13(3). https://doi.org/10.47611/jsrhs.v13i3.6909

Section

HS Research Projects