Predicting Running Injuries with Classification Machine Learning Models
DOI:
https://doi.org/10.47611/jsrhs.v12i1.4046
Keywords:
F-beta score, Random Forest Classifier, Logistic Regression, classification, running injury, Hyperparameter Tuning, imbalanced dataset
Abstract
Can running injuries be predicted from a dataset alone using machine learning models? This paper explores the question with two classification models: Logistic Regression and the Random Forest Classifier. Ten features from the dataset were taken into account when predicting running injuries. Because the dataset is imbalanced, modified versions of both models were used to mitigate the imbalance: Weighted Logistic Regression, and Random Forest Classifiers trained on over-sampled and down-sampled data. The results suggested that the best model was Weighted Logistic Regression and that the best score metric to consider was the F-beta score.
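As a rough illustration of why the F-beta score is informative on an imbalanced dataset, and of how class weights can offset the imbalance, the following minimal sketch may help. It is not the authors' code. The toy labels, the 98:2 class ratio, the choice of beta = 2, and the helper names `fbeta`, `confusion_counts`, and `balanced_class_weights` are all assumptions made for illustration. The weighting rule shown is the common "balanced" heuristic, n_samples / (n_classes * class_count), and the paper may have used different weights.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = injury."""
    pairs = Counter(zip(y_true, y_pred))
    return pairs[(1, 1)], pairs[(0, 1)], pairs[(1, 0)], pairs[(0, 0)]


def fbeta(tp, fp, fn, beta=2.0):
    """F-beta from confusion counts; beta > 1 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def balanced_class_weights(labels):
    """'Balanced' heuristic: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


# Hypothetical toy data: 98 non-injury samples (0) and 2 injury samples (1).
y_true = [0] * 98 + [1] * 2

# Model A (hypothetical): always predicts "no injury".
pred_a = [0] * 100
# Model B (hypothetical): flags 8 healthy samples and both injuries.
pred_b = [1] * 8 + [0] * 90 + [1] * 2

for name, pred in (("A", pred_a), ("B", pred_b)):
    tp, fp, fn, tn = confusion_counts(y_true, pred)
    accuracy = (tp + tn) / len(y_true)
    print(name, "accuracy:", accuracy, "F2:", round(fbeta(tp, fp, fn), 3))

# The rare injury class receives a much larger weight than the majority class.
print(balanced_class_weights(y_true))
```

In this toy setting, model A has the higher accuracy even though it never detects an injury, while its F-beta score is zero. Model B has lower accuracy but a clearly higher F-beta score because it catches both injuries. This is the kind of behavior that makes accuracy a misleading metric when injuries are rare.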
References or Bibliography
Lovdal, S., den Hartigh, R., & Azzopardi, G. (2021). Injury Prediction in Competitive Runners with Machine Learning. International Journal of Sports Physiology and Performance, 16(10), 1522–1531. https://doi.org/10.1123/ijspp.2020-0518
Chmait, N., & Westerbeek, H. (2021). Artificial Intelligence and Machine Learning in Sport Research: An Introduction for Non-data Scientists. Frontiers in Sports and Active Living, 3. https://www.frontiersin.org/articles/10.3389/fspor.2021.682287
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874. https://doi.org/10.1016/j.patrec.2005.10.010
F-beta score. (n.d.). Hasty.Ai. Retrieved September 29, 2022, from https://hasty.ai/docs/mp-wiki/metrics/f-beta-score
Iyer, S. R., & Sharda, R. (2009). Prediction of athletes performance using neural networks: An application in cricket team selection. Expert Systems with Applications, 36(3, Part 1), 5510–5522. https://doi.org/10.1016/j.eswa.2008.06.088
Maalouf, M., & Siddiqi, M. (2014). Weighted logistic regression for large-scale imbalanced and rare events data. Knowledge-Based Systems, 59, 142–148. https://doi.org/10.1016/j.knosys.2014.01.012
Lovdal, S., den Hartigh, R., & Azzopardi, G. (2021). Replication Data for: Injury Prediction In Competitive Runners With Machine Learning. DataverseNL. https://doi.org/10.34894/UWU9PV
Copyright (c) 2023 Elgin Vuong; Joseph Vincent
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.