Machine Learning for Policy Guidance
DOI:
https://doi.org/10.47611/jsrhs.v11i3.3597
Keywords:
Machine Learning, Artificial Intelligence, Preprocessing, Feature Selection, Model Selection, Model Interpretation, Linear Regression, Ridge Regression, Bayesian Ridge, Decision Tree, Random Forest, Supply-side Policy, Government Policy, Economics
Abstract
This paper uses machine learning algorithms and techniques to build models that can assist in a country's policy guidance. It discusses the machine learning process used to conduct the research, including preprocessing, feature selection, model selection, and model interpretation. Specifically, using datasets from the CIA's World Factbook and the United Nations' Human Development Index (HDI), machine learning models are built on selected features of several countries (e.g., real gross domestic product (GDP), population, and area). The models then predict the countries' HDI scores. Model interpretation methods are used to find the features that matter most in predicting a country's score. This paper argues that important features can be derived through machine learning and can guide government policy relevant to human development. Supply-side policies are discussed based on the results from the machine learning models. The use of machine learning with other indexes is also explored.
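The model interpretation step can be illustrated with permutation feature importance, which the reference list points to through the Scikit-Learn user guide. The block below is a minimal, library-free sketch of the idea and not the authors' code: the data are synthetic, the three feature columns are hypothetical stand-ins (one strong, one weak, one pure noise), and an ordinary least squares fit stands in for the regression models named in the keywords. A feature's importance is taken as the average drop in held-out R² when that feature's column is shuffled.

```python
import random


def fit_ols(X, y):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(k):
            if i != c:
                f = A[i][c]
                A[i] = [vi - f * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][k] for i in range(k)]


def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def r2(y, yhat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def permutation_importance(coef, X, y, n_repeats=20, seed=0):
    """Mean drop in R^2 when one feature column at a time is shuffled."""
    rng = random.Random(seed)
    base = r2(y, predict(coef, X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [r[j] for r in X]
            rng.shuffle(col)
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
            drops.append(base - r2(y, predict(coef, X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data (assumption): the target depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * r[0] + 0.5 * r[1] + rng.gauss(0, 0.1) for r in X]

# Hold out the last 50 rows so importance is measured on unseen data.
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
coef = fit_ols(X_train, y_train)
test_r2 = r2(y_test, predict(coef, X_test))
importances = permutation_importance(coef, X_test, y_test)

for name, imp in zip(["feature_0", "feature_1", "feature_2"], importances):
    print(f"{name}: {imp:.4f}")
```

Under these assumptions the ranking of the importances mirrors how the synthetic target was built. On real country data the ranking would depend on the fitted model and on correlations between features, so it should be read alongside the multicollinearity checks.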
References or Bibliography
1. Cross-validation: Evaluating estimator performance. (n.d.). [User Guide]. Scikit-Learn. Retrieved August 29, 2022, from https://scikit-learn.org/stable/modules/cross_validation.html
2. Permutation feature importance. (n.d.). [User Guide]. Scikit-Learn. Retrieved August 29, 2022, from https://scikit-learn.org/stable/modules/permutation_importance.html
Ardila, D., Kiraly, A. P., Bharadwaj, S., Choi, B., Reicher, J. J., Peng, L., Tse, D., Etemadi, M., Ye, W., Corrado, G., Naidich, D. P., & Shetty, S. (2019). End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nature Medicine, 25(6), 954–961. https://doi.org/10.1038/s41591-019-0447-x
Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
Brownlee, J. (2020a, May 26). How to Scale Data With Outliers for Machine Learning. Machine Learning Mastery. https://machinelearningmastery.com/robust-scaler-transforms-for-machine-learning/
Brownlee, J. (2020b, July 30). How to Configure k-Fold Cross-Validation. Machine Learning Mastery. https://machinelearningmastery.com/how-to-configure-k-fold-cross-validation/
Central Intelligence Agency. (2022, August 18). The World Factbook. https://www.cia.gov/the-world-factbook/
Dhaduk, H. (2021, June 25). EDA | Exploratory Data Analysis With Python | What is EDA. https://www.analyticsvidhya.com/blog/2021/06/eda-exploratory-data-analysis-with-python/#h2_3
Elliott, T. (2019, January 24). The State of the Octoverse: Machine learning. The GitHub Blog. https://github.blog/2019-01-24-the-state-of-the-octoverse-machine-learning/
Gonfalonieri, A. (2019, May 17). 5 Ways to Deal with the Lack of Data in Machine Learning. KDnuggets. https://www.kdnuggets.com/5-ways-to-deal-with-the-lack-of-data-in-machine-learning.html/
Happy Planet Index – How happy Is the planet. (2021). https://happyplanetindex.org/
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., … Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357–362. https://doi.org/10.1038/s41586-020-2649-2
Helliwell, J. F., Layard, R., Sachs, J. D., Neve, J.-E. D., Aknin, L. B., & Wang, S. (2022). World Happiness Report 2022. https://worldhappiness.report/ed/2022/
Kapoor, A., & Debroy, B. (2019, October 4). GDP Is Not a Measure of Human Well-Being. Harvard Business Review. https://hbr.org/2019/10/gdp-is-not-a-measure-of-human-well-being
Khanna, C. (2020, December 5). Multicollinearity—Why is it bad? Medium. https://towardsdatascience.com/multicollinearity-why-is-it-bad-5335030651bf
Koehrsen, W. (2018, April 20). Introduction to Bayesian Linear Regression. Medium. https://towardsdatascience.com/introduction-to-bayesian-linear-regression-e66e60791ea7
McKinney, W. (2010). Data Structures for Statistical Computing in Python. Proceedings of the 9th Python in Science Conference, 56–61. https://doi.org/10.25080/Majora-92bf1922-00a
Ng, A. (2022). Machine Learning Specialization. Coursera. https://www.coursera.org/specializations/machine-learning-introduction
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12(85), 2825–2830.
Pettinger, T. (2017, October 8). The importance of supply-side policies. Economics Help. https://www.economicshelp.org/blog/31/supply-side/supply-side-policies/
Pettinger, T. (2019, October 30). Supply Side Policies. Economics Help. https://www.economicshelp.org/macroeconomics/economic-growth/supply-side-policies/
Radečić, D. (2022, March 21). Data Scaling for Machine Learning—The Essential Guide. Medium. https://towardsdatascience.com/data-scaling-for-machine-learning-the-essential-guide-d6cfda3e3d6b
Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and Statistical Modeling with Python. Proceedings of the 9th Python in Science Conference, 92–96. https://doi.org/10.25080/Majora-92bf1922-011
Singh Chauhan, N. (2022, February 9). Decision Tree Algorithm, Explained. KDnuggets. https://www.kdnuggets.com/decision-tree-algorithm-explained.html/
Swalin, A. (2018, July 10). Choosing the Right Metric for Evaluating Machine Learning Models—Part 1. USF-Data Science. https://medium.com/usf-msds/choosing-the-right-metric-for-machine-learning-models-part-1-a99d7d7414e4
Tavares, E. (2017, March 8). Variance Inflation Factor (VIF) Explained—Python. https://etav.github.io/python/vif_factor_python.html
United Nations. (2020). Human Development Report 2020. In Human Development Reports. United Nations. https://hdr.undp.org/content/human-development-report-2020
Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., Choi, D. H., Powell, R., Ewalds, T., Georgiev, P., Oh, J., Horgan, D., Kroiss, M., Danihelka, I., Huang, A., Sifre, L., Cai, T., Agapiou, J. P., Jaderberg, M., … Silver, D. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782), 350–354. https://doi.org/10.1038/s41586-019-1724-z
Copyright (c) 2022 Justin Chae; Timothy Raines
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.