How Can Machine Learning Determine Whether a Women's Tennis Player Will Make it to Top 100?
DOI: https://doi.org/10.47611/jsrhs.v11i2.2847
Keywords: Age, AI, Bayes' Theorem, Factors, Height, Machine Learning, Naïve Bayes, Nationality, Pearson’s Correlation Coefficient, Ranking, Tennis, Top 100
Abstract
There is much speculation, within and outside the tennis community, about whether factors such as height, age, and nationality play a role in a tennis player’s success. For this study, ‘success’ is defined as reaching the Top 100 in the rankings. Previous studies have associated a player’s height with success, but primarily for men’s tennis. In this study, we examine the relationship between height and success for women’s tennis players and also consider two additional factors: age and nationality. We use Pearson’s correlation coefficient to determine whether there is a statistical correlation between these three factors and success. Having established the relationship, we develop an AI model that predicts future successful players from historical tennis data. Because earlier studies have already considered height as a success factor, our machine learning model uses Naïve Bayes to estimate the probability of success from all three factors together, and it predicts success with an accuracy of 0.67 on the dataset used. The Pearson correlation coefficients of height and age with success are 0.23 and 0.19, respectively, demonstrating the applicability of these factors in identifying a player’s potential for success. Further research using more factors or a larger dataset could foster a greater understanding of female success in tennis.
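The two techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the eight player records are invented for illustration, the names `pearson_r` and `TinyNaiveBayes` are hypothetical, and the choice to model height and age as Gaussian and nationality as a smoothed categorical feature is an assumption.

```python
import math
from statistics import mean, pvariance


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


class TinyNaiveBayes:
    """Naive Bayes over rows of (height_cm, age_years, nationality)."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        countries = sorted({r[2] for r in rows})
        k = len(countries)
        self.prior, self.stats, self.cat, self.unseen = {}, {}, {}, {}
        for c in self.classes:
            subset = [r for r, y in zip(rows, labels) if y == c]
            n = len(subset)
            self.prior[c] = n / len(rows)
            # Per-class mean and variance for the two numeric features.
            self.stats[c] = [
                (mean(r[j] for r in subset),
                 pvariance([r[j] for r in subset]) + 1e-9)
                for j in (0, 1)
            ]
            # Laplace-smoothed nationality frequencies within the class.
            self.cat[c] = {
                name: (sum(r[2] == name for r in subset) + 1) / (n + k)
                for name in countries
            }
            # Fallback probability for a nationality not seen in training.
            self.unseen[c] = 1 / (n + k)
        return self

    def predict_proba(self, row):
        log_scores = {}
        for c in self.classes:
            log_p = math.log(self.prior[c])
            for j in (0, 1):
                mu, var = self.stats[c][j]
                # Log of the Gaussian density, to avoid numeric underflow.
                log_p += (-0.5 * math.log(2 * math.pi * var)
                          - (row[j] - mu) ** 2 / (2 * var))
            log_p += math.log(self.cat[c].get(row[2], self.unseen[c]))
            log_scores[c] = log_p
        # Normalise with the log-sum-exp trick so probabilities sum to 1.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        total = sum(weights.values())
        return {c: w / total for c, w in weights.items()}


# Invented records: (height in cm, age in years, nationality code).
rows = [
    (183, 27, "A"), (180, 25, "A"), (178, 29, "B"), (185, 26, "A"),
    (165, 19, "B"), (168, 21, "C"), (170, 20, "B"), (163, 22, "C"),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = reached the Top 100, 0 = did not

r_height = pearson_r([r[0] for r in rows], labels)
r_age = pearson_r([r[1] for r in rows], labels)
model = TinyNaiveBayes().fit(rows, labels)

print("r(height, success):", round(r_height, 2))
print("r(age, success):", round(r_age, 2))
print("P(class | 184 cm, 27 y, A):", model.predict_proba((184, 27, "A")))
```

Correlating a numeric factor with a 0/1 success label, as here, is the point-biserial case of Pearson's coefficient. Nationality has no natural numeric order, so the sketch treats it only as a categorical input to the classifier rather than correlating it directly.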
Copyright (c) 2022 Saina Deshpande; Vanessa Klotzman
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.