How Can Machine Learning Determine Whether a Women's Tennis Player Will Make it to Top 100?

Authors

  • Saina Deshpande, Vikhe Patil Memorial School
  • Vanessa Klotzman, University of California, Irvine

DOI:

https://doi.org/10.47611/jsrhs.v11i2.2847

Keywords:

Age, AI, Bayes' Theorem, Factors, Height, Machine Learning, Naïve Bayes, Nationality, Pearson’s Correlation Coefficient, Ranking, Tennis, Top 100

Abstract

There is considerable speculation, both within and outside the tennis community, about whether factors such as height, age, and nationality play a role in a tennis player’s success. For this study, ‘success’ is defined as reaching the Top 100 of the rankings. Past studies have associated a player’s height with success, but primarily for men’s tennis. In this study, we examine the relationship between height and success for women tennis players and also consider two additional factors: age and nationality. We use Pearson’s correlation coefficient to test whether these three factors are statistically correlated with success. Having established the relationship, we develop a machine learning model that predicts future successful players from historical tennis data. Because earlier studies have already considered height as a success factor, our model uses Naïve Bayes to estimate the probability of success from all three factors together, reaching an accuracy of 0.67 on the dataset used. The Pearson correlation coefficients of height and age with success are 0.23 and 0.19, respectively, indicating a positive association that supports the use of these factors in identifying a player’s potential for success. Further research using more factors or a larger dataset could foster a greater understanding of female success in tennis.
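
The two techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`pearson_r`, `fit_gaussian_nb`, `predict_proba`) are hypothetical, the player data is made up purely for illustration, and the Gaussian form of Naïve Bayes is an assumption. Nationality is left out because it is categorical and would need a separate encoding.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(rows)
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # A small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def predict_proba(model, row):
    """Posterior P(class | row) via Bayes' theorem, treating features as independent."""
    log_scores = {}
    for label, (prior, stats) in model.items():
        log_p = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            log_p += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        log_scores[label] = log_p
    # Subtract the largest log score before exponentiating to avoid underflow.
    peak = max(log_scores.values())
    unnorm = {label: math.exp(s - peak) for label, s in log_scores.items()}
    total = sum(unnorm.values())
    return {label: v / total for label, v in unnorm.items()}


# Made-up toy data, NOT from the study: (height in cm, age in years).
# Label 1 means the player reached the Top 100, 0 means she did not.
players = [(182, 27), (178, 25), (185, 29), (176, 26),
           (165, 19), (168, 21), (170, 20), (163, 22)]
top100 = [1, 1, 1, 1, 0, 0, 0, 0]

r_height = pearson_r([h for h, _ in players], top100)
r_age = pearson_r([a for _, a in players], top100)

model = fit_gaussian_nb(players, top100)
posterior = predict_proba(model, (180, 26))
```

Here `r_height` and `r_age` play the role of the per-factor correlation coefficients, and `posterior[1]` is the model's estimated probability that a hypothetical 180 cm, 26-year-old player reaches the Top 100. On real data, a held-out test split would be needed to report an accuracy figure comparable to the one in the abstract.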

Author Biography

Vanessa Klotzman, University of California, Irvine

Advisor

Published

05-31-2022

How to Cite

Deshpande, S., & Klotzman, V. (2022). How Can Machine Learning Determine Whether a Women’s Tennis Player Will Make it to Top 100? Journal of Student Research, 11(2). https://doi.org/10.47611/jsrhs.v11i2.2847

Issue

Section

HS Research Projects