The Development and Testing of Machine Learning Applications into Genetically-Based Disease Detection

Authors

  • Aditya Mittal Student
  • Kuan-Chen Wu

DOI:

https://doi.org/10.47611/jsrhs.v11i3.2682

Keywords:

neural network, machine learning, artificial intelligence, genetics, RNA, peripheral blood mononuclear cells, Gene Expression Omnibus, gene expression, multiple sclerosis, data science

Abstract

This study feeds publicly available gene-expression data from peripheral blood mononuclear cells (PBMCs) into a logistic regression model to predict the probability of early onset of Multiple Sclerosis by identifying biomarkers in gene expression, and it establishes logistic regression as a viable method for genetic analysis in disease prediction. Current detection methods for neurological diseases, such as MRI scans of existing lesions, do little to alleviate most of a patient’s symptoms because they can detect the disease only after it has already developed. Machine learning is a rapidly emerging tool with considerable potential not only for disease detection but also for early-onset diagnosis. This study used the NCBI Gene Expression Omnibus data repository to select key PBMC gene-expression datasets to feed into the logistic regression model. Filtering the data by log-fold-change analysis and p-value significance simplified it, which reduced model dimensionality, improved model accuracy, and identified important gene markers in Multiple Sclerosis. Nearly 33,000 genes were eliminated through this filtration, and 15 genes were marked as statistically significant in the development of Multiple Sclerosis. The resulting model accuracy was nearly 100%, though the lack of representative data highlights the need for further testing. The methodology of this experiment, from data accumulation to the construction and testing of the model itself, demonstrates the value artificial intelligence can have in genomic analysis for disease detection.
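The pipeline summarized in the abstract (filter genes by log-fold change and p-value, then fit a logistic regression on the surviving genes) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not state which statistical test, thresholds, or software were used, so the permutation test, the cutoffs (`min_abs_lfc=1.0`, `max_p=0.05`), the gene names, and the toy expression values below are all assumptions chosen for illustration. The sketch uses only the Python standard library; real analyses would more likely rely on an established statistics or machine learning package.

```python
import math
import random
from statistics import mean


def log2_fold_change(case, control):
    """log2 of mean case expression over mean control expression (values > 0)."""
    return math.log2(mean(case) / mean(control))


def permutation_p_value(case, control, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means.
    (Assumed test; the abstract does not say how p-values were computed.)"""
    rng = random.Random(seed)
    observed = abs(mean(case) - mean(control))
    pooled = list(case) + list(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(case)]) - mean(pooled[len(case):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def select_genes(expr, labels, min_abs_lfc=1.0, max_p=0.05):
    """Keep genes passing both filters. expr: {gene: [value per sample]}."""
    kept = []
    for gene, values in expr.items():
        case = [v for v, y in zip(values, labels) if y == 1]
        control = [v for v, y in zip(values, labels) if y == 0]
        lfc = log2_fold_change(case, control)
        if abs(lfc) >= min_abs_lfc and permutation_p_value(case, control) <= max_p:
            kept.append(gene)
    return kept


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Predicted probability of the positive class for one sample."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: 4 MS samples (label 1) and 4 controls (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
expr = {
    "GENE_A": [8.0, 9.0, 8.5, 9.5, 2.0, 2.2, 1.8, 2.1],  # higher in MS samples
    "GENE_B": [3.0, 3.1, 2.9, 3.2, 3.0, 2.8, 3.1, 3.0],  # essentially unchanged
    "GENE_C": [1.0, 1.1, 0.9, 1.2, 4.0, 4.4, 3.9, 4.1],  # lower in MS samples
}

kept = select_genes(expr, labels)
# One feature row per sample, using log2 expression of the kept genes only.
X = [[math.log2(expr[g][i]) for g in kept] for i in range(len(labels))]
w, b = train_logistic(X, labels)
probs = [predict_proba(w, b, x) for x in X]
print(kept, [round(p, 2) for p in probs])
```

In this toy setup the unchanged gene is dropped by the fold-change filter before any model is fit, which is the dimensionality reduction the abstract describes at a much larger scale. The probabilities printed here are computed on the same samples used for training, so they say nothing about generalization; evaluating on held-out samples would be needed, in line with the abstract's own caution about representative data.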

Published

08-31-2022

How to Cite

Mittal, A., & Wu, K.-C. (2022). The Development and Testing of Machine Learning Applications into Genetically-Based Disease Detection. Journal of Student Research, 11(3). https://doi.org/10.47611/jsrhs.v11i3.2682

Section

HS Research Projects