Detecting and Validating the Emotions in Alzheimer’s Patients by Voice Analysis, Computer Vision and Deep Learning
DOI: https://doi.org/10.47611/jsrhs.v13i1.6303
Keywords: Alzheimer's disease, Mood detection, Deep learning, Convolutional Neural Network (CNN), Neurodegenerative disorder, Cognitive decline
Abstract
Alzheimer’s disease is a degenerative disorder of the brain that affects memory and cognitive function, and it is becoming more prevalent as the population ages. The disease causes a person’s brain cells to die, and over time the brain functions less effectively. As a result, the behavior and personality of Alzheimer’s patients change, and it is observed that patients very often suffer from fluctuating moods. Because of these mood changes, the caretaker or attendant of the patient is often unable to determine what triggered them, and so cannot provide proactive support or timely intervention. This study was therefore undertaken to detect the mood of Alzheimer’s patients using voice analysis, computer vision, and deep learning. The paper comprises two phases. The first phase detects the emotions of patients in real time with the help of computer vision (CV), voice analysis (VA), and a Convolutional Neural Network (CNN). Two CNN models were trained: the first on attributes extracted from images, and the second on features extracted from a voice dataset. The predictions of the two models are then compared to obtain the result of the proposed model. The second phase examines the practicality of the proposed approach by applying it to detect the emotions of four Alzheimer’s patients. Finally, the results are compared and validated. Overall, the proposed model holds promise as a valuable tool for the real-time detection of emotions in Alzheimer’s patients, enabling timely intervention and improved patient outcomes.
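The authors' architecture and code are not reproduced on this page. As a rough illustration only, the sketch below shows one way the two-model scheme described in the abstract could look in Keras: one small CNN over face images, one over MFCC features from speech, with the two softmax outputs combined by averaging (late fusion). Every layer size, input shape, emotion label, and function name here is an illustrative assumption, not the authors' implementation.

```python
# Minimal two-modality sketch (illustrative only; all shapes, labels, and
# layer choices are assumptions, not the paper's published architecture).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def build_cnn(input_shape):
    """Small CNN classifier; the same topology is reused for both modalities."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(len(EMOTIONS), activation="softmax"),
    ])

# One model per modality: 48x48 grayscale face crops (FER-2013 style) and
# MFCC feature maps from short speech clips (an assumed 40 x 174 shape).
face_model = build_cnn((48, 48, 1))
voice_model = build_cnn((40, 174, 1))

def fused_prediction(face_batch, mfcc_batch):
    """Average the two softmax distributions, then take the argmax label."""
    p_face = face_model.predict(face_batch, verbose=0)
    p_voice = voice_model.predict(mfcc_batch, verbose=0)
    p = (p_face + p_voice) / 2.0
    return [EMOTIONS[i] for i in np.argmax(p, axis=1)]
```

Averaging the per-class probabilities is only one plausible way to "compare the predictions" of the two models; a majority vote or a confidence-weighted rule would fit the abstract's description equally well.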
Copyright (c) 2024 Reetu Jain, Ms
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute and display this article.