Monetary Flood Damage Prediction Based On Machine Learning Models

Authors

  • Jaya Bijoor, Stuyvesant High School
  • Shreyaa Raghavan

DOI:

https://doi.org/10.47611/jsrhs.v13i1.6588

Keywords:

Flood Prediction, Machine Learning, AI, Classification, Regression, Flood Damage

Abstract

Flooding has increased by 50%, and for the past 25 years, the United States has experienced a flooding event every 2 to 3 days [1]. The Federal Emergency Management Agency (FEMA) is the only resource people use to determine whether their homes are at risk of flooding. FEMA maps are costly, outdated, and lack predictive accuracy, leaving homeowners unaware of and unprepared for potential dangers [2]. Additional tools are needed to produce accurate estimates of flood damages, but computational methods require rich datasets, which are difficult to obtain for flooding events [3]. We create a model that predicts monetary damages from factors related to the flood event, such as duration, location, and cause. We use a flood dataset from the National Oceanic and Atmospheric Administration (NOAA) and apply data pre-processing techniques to handle missing values and remove irrelevant features [4]. We generate aggregate predictions by leveraging linear regression, random forest regression, XGBoost regression, and neural networks. Our findings show that upsampling techniques can effectively compensate for limited flood data. We also show that floods with higher monetary damages are easier to predict, which is important because these floods inflict greater hardship on communities. The results can improve preparedness for flood-related risks, property value assessments, and the accuracy of insurance policy underwriting [2]. Ultimately, the model provides a preliminary study of how individuals can make better-informed decisions and prepare for the impact of flooding in their communities. We hope this work encourages further machine learning applications to help prepare citizens for natural disasters.
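The two ideas named in the abstract, upsampling scarce flood records and aggregating predictions across several models, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `upsample` and `aggregate`, the damage threshold, and all data values are hypothetical, and the unweighted mean is only one possible way to aggregate.

```python
import random
from statistics import mean


def upsample(records, is_rare, seed=0):
    """Randomly duplicate rare records (sampling with replacement)
    until they are as numerous as the common records."""
    rng = random.Random(seed)
    rare = [r for r in records if is_rare(r)]
    common = [r for r in records if not is_rare(r)]
    if not rare or len(rare) >= len(common):
        return list(records)  # nothing to balance
    extra = [rng.choice(rare) for _ in range(len(common) - len(rare))]
    return common + rare + extra


def aggregate(predictions_per_model):
    """Combine per-event predictions from several models
    with a simple unweighted mean (an assumed aggregation rule)."""
    return [mean(event_preds) for event_preds in zip(*predictions_per_model)]


# Hypothetical flood events as (duration_hours, damage_usd); values are made up.
events = [(2, 1_000), (3, 1_500), (1, 800), (4, 2_000), (30, 900_000)]

# Treat events at or above an assumed $100,000 threshold as the rare class.
balanced = upsample(events, is_rare=lambda e: e[1] >= 100_000)

# Hypothetical damage predictions from three models for the same two events.
combined = aggregate([[100.0, 200.0], [110.0, 190.0], [90.0, 210.0]])
```

In practice, upsampling is usually applied only to the training split so that duplicated records do not leak into the evaluation data.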

References or Bibliography

Flooding is America’s most frequent and expensive disaster. (n.d.). https://www.flooddefenders.org/problem

Dow Jones & Company. (2019, June 14). Is your home at risk of flooding? The data is hard to find. The Wall Street Journal. https://www.wsj.com/articles/is-your-home-at-risk-of-flooding-the-data-is-hard-to-find-11560418204

Wing, O. (2022, February 1). New Maps show U.S. flood damage rising 26 percent in next 30 years. Scientific American. https://www.scientificamerican.com/article/new-maps-show-us-flood-damage-rising-26-percent-in-next-30-years/

Li, Z. (2020). United States Flood Database (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4355693

Climate change indicators: Coastal flooding | US EPA. (n.d.). https://www.epa.gov/climate-indicators/climate-change-indicators-coastal-flooding

Guterres, A. (2020, July 31). Climate action and disaster risk reduction. UNDRR. https://www.undrr.org/climate-action-and-disaster-risk-reduction

Paerl, H. W., Hall, N. S., Hounshell, A. G., Luettich, R. A., Rossignol, K. L., Osburn, C. L., & Bales, J. (2019, July 23). Recent increase in catastrophic tropical cyclone flooding in coastal North Carolina, USA: Long-term observations suggest a regime shift. Scientific reports. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6650462/

World Health Organization. (n.d.). Climate change. World Health Organization. https://www.who.int/news-room/fact-sheets/detail/climate-change-and-health

Mosavi, A., Ozturk, P., & Chau, K. (2018, October 27). Flood prediction using Machine Learning Models: Literature review. MDPI. https://www.mdpi.com/2073-4441/10/11/1536

Wagenaar, D., Curran, A., Balbi, M., Bhardwaj, A., Soden, R., Hartato, E., Mestav Sarica, G., Ruangpan, L., Molinario, G., & Lallemant, D. (2020, April 29). Invited perspectives: How machine learning will change flood risk and impact assessment. Natural Hazards and Earth System Sciences. https://nhess.copernicus.org/articles/20/1149/2020/

Rahebeh Abedi, R., Bao Pham, Q., Shafizadeh-Moghadam, H., & Costache, R. (n.d.). Flash-flood susceptibility mapping based on XGBoost, Random Forest and Boosted Regression Trees. https://doi.org/10.1080/10106049.2021.1920636

Chen, A., You, S., Li, J., & Liu, H. (2021, November 2). The economic loss prediction of flooding based on machine learning and the input-output model. MDPI. https://www.mdpi.com/2073-4433/12/11/1448

Anwar, A. (2021, June 7). A Beginner's Guide to Regression Analysis in machine learning. Medium. https://towardsdatascience.com/a-beginners-guide-to-regression-analysis-in-machine-learning-8a828b491bbf

Deepanshi. (2023, July 20). All you need to know about your first machine learning model - linear regression. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2021/05/all-you-need-to-know-about-your-first-machine-learning-model-linear-regression/#:~:text=In%20the%20most%20simple%20words,the%20dependent%20and%20independent%20variable

Friedman, J. H. (2001, April 19). Greedy Function Approximation: A Gradient Boosting Machine. https://jerryfriedman.su.domains/ftp/trebst.pdf

1.1. Linear Models. scikit. (n.d.). https://scikit-learn.org/stable/modules/linear_model.html#ridge-regression-and-classification

Random Forest regression. Random Forest Regression GitBook. (n.d.). https://apple.github.io/turicreate/docs/userguide/supervised-learning/random_forest_regression.html

Chen, T., & Guestrin, C. (2016, June 10). XGBoost: A scalable tree boosting system. arXiv.org. https://arxiv.org/abs/1603.02754

AL-Ma’amari, M. (2018, October 25). Deep neural networks for regression problems. Medium. https://towardsdatascience.com/deep-neural-networks-for-regression-problems-81321897ca33

(1959, November 1). Diagram of an artificial neural network. TeX. https://tex.stackexchange.com/questions/132444/diagram-of-an-artificial-neural-network

Zatout, C. (2023, February 6). A brief introduction to neural networks : A classification problem. Medium. https://towardsdatascience.com/a-brief-introduction-to-neural-networks-a-classification-problem-43e68c770081

Trotta, F. (2023, April 8). How and why performing one-hot encoding in your data science project. Medium. https://towardsdatascience.com/how-and-why-performing-one-hot-encoding-in-your-data-science-project-a1500ec72d85

Kim, M., & Hwang, K.-B. (2022, July 28). An empirical evaluation of sampling methods for the classification of Imbalanced Data. PloS one. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9333262/

Agarap, A. F. (2018, March). Deep learning using rectified linear units (ReLU). ResearchGate. https://www.researchgate.net/publication/323956667_Deep_Learning_using_Rectified_Linear_Units_ReLU

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014, June). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. https://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf

Werbos, P. J., Rumelhart, D. E., Hush, D. R., Chiang, C., & Lapedes, A. (1998, May 19). The generalized sigmoid activation function: Competitive supervised learning. Information Sciences. https://www.sciencedirect.com/science/article/abs/pii/S0020025596002009

S., R., Bharadwaj, A. S., K, D. S., Khadabadi, M. S., & Jayaprakash, A. (2023, March 7). Digital Implementation of the Softmax Activation Function and the Inverse Softmax Function. https://ieeexplore.ieee.org/document/10057747

Seif, G. (2022, February 11). Understanding the 3 most common loss functions for machine learning regression. Medium. https://towardsdatascience.com/understanding-the-3-most-common-loss-functions-for-machine-learning-regression-23e0ef3e14d3

Meyer, T. H. (2012, September). Root mean square error compared to, and contrasted with, standard deviation. https://www.researchgate.net/publication/263726816_Root_Mean_Square_Error_Compared_to_and_Contrasted_with_Standard_Deviation

Sharma, P. (2022, July 21). 4 proven tricks to improve your deep learning model’s performance. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2019/11/4-tricks-improve-deep-learning-model-performance/

Neuralthreads. (2021, December 26). Categorical cross-entropy loss - the most important loss function. Medium. https://neuralthreads.medium.com/categorical-cross-entropy-loss-the-most-important-loss-function-d3792151d05b

Why are floods hitting more places and people? Environmental Defense Fund. (n.d.). https://www.edf.org/why-are-floods-hitting-more-places-and-people#:~:text=According%20to%20new%20research%20from,levels%20and%20more%20intense%20hurricanes

Hersher, R., & Kellman, R. (2020, October 20). Living in harm’s way: Why most flood risk is not disclosed. NPR. https://www.npr.org/2020/10/20/921132721/living-in-harms-way-why-most-flood-risk-is-not-disclosed

Published

02-29-2024

How to Cite

Bijoor, J., & Raghavan, S. (2024). Monetary Flood Damage Prediction Based On Machine Learning Models. Journal of Student Research, 13(1). https://doi.org/10.47611/jsrhs.v13i1.6588

Section

HS Research Projects