Using Machine Learning Regressors for the Discovery of Culex Mosquito Habitats and Breeding Patterns in Washington D.C.
DOI:
https://doi.org/10.47611/jsrhs.v11i4.3710
Keywords:
mosquito breeding patterns, machine learning techniques, Culex mosquitoes, environmental variables
Abstract
Culex mosquitoes pose a significant threat to humans and other species because they can carry deadly viruses such as West Nile and Zika. Washington D.C., in particular, has a humid subtropical climate that is well suited to mosquito breeding. Tracking mosquitoes’ habitats and breeding patterns in Washington D.C. is therefore crucial for addressing local public health concerns. Although fieldwork techniques have improved over the years, monitoring and analyzing mosquitoes remains difficult, dangerous, and time-consuming. In this work, we propose a solution: a Culex mosquito abundance predictor that uses machine learning techniques to determine the conditions under which Culex mosquitoes thrive and reproduce. We used four environmental variables in this experiment: precipitation, specific humidity, enhanced vegetation index (EVI), and surface skin temperature. We obtained sample data for these variables in the Washington D.C. area from the NASA Giovanni Earth Science Data system, along with mosquito abundance data collected by the D.C. government. Using these data, we created and compared four machine learning regression models: Random Forest, Decision Tree, Support Vector Machine, and Multi-Layer Perceptron. We searched for the optimal configuration of each model to obtain the best possible fit. The Random Forest Regressor produced the most accurate predictions of mosquito abundance in an area from the four environmental variables, achieving a mean absolute error (MAE) of 3.3. EVI was the most significant factor in determining mosquito abundance. Public health programs can use the models and findings from this research for mosquito-related disease observation and prediction.
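The accuracy figure in the abstract is an error metric averaged over predictions. As a minimal sketch of how a mean absolute error is computed, the Python snippet below uses a hypothetical helper, `mean_absolute_error`, and invented trap counts; neither the function nor the numbers come from the study, and the units of the reported 3.3 are not stated in the abstract.

```python
from statistics import mean


def mean_absolute_error(observed, predicted):
    """Return the average of |observed - predicted| over paired samples."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("need at least one sample")
    return mean(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical mosquito counts, invented for illustration only.
observed_counts = [12, 5, 30, 8]
predicted_counts = [10, 9, 27, 8]

# Absolute errors are 2, 4, 3, 0, so the mean is 9 / 4.
mae = mean_absolute_error(observed_counts, predicted_counts)
print(f"MAE = {mae:.2f}")  # prints "MAE = 2.25"
```

Because each error counts in proportion to its size, a single badly mispredicted sample shifts MAE less than it would shift a squared-error metric such as RMSE.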
Copyright (c) 2022 Iona Xia, Neha Singirikonda, Landon Hellman, Jasmine Watson, Marvel Hanna; Dr. Russanne Low
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.