Using Different Machine Learning Algorithms to Predict the Prices of Flight Tickets
DOI:
https://doi.org/10.47611/jsrhs.v12i4.5303
Keywords:
Machine Learning, Artificial Intelligence, Regression, Airlines, Dynamic Pricing, Flight Tickets, Inflation, Linear Regression, Ridge Regression, DecisionTree, ML Models
Abstract
The rising prices of flight tickets and the lack of transparency in airlines' dynamic pricing strategies have caused many consumers to wonder what factors actually determine these prices. To investigate this question, a large dataset of flight ticket bookings that includes the most price-defining variables was acquired. The data was preprocessed using discretization, normalization, and principal component analysis, and was then used to train five Machine Learning algorithms: Linear Regression, DecisionTree, Ridge Regression, RandomForest, and SVR. The RandomForest and SVR models could not be trained due to runtime errors; the other three models trained as expected. All three trained models performed well, with Linear Regression and Ridge Regression performing identically. Overall, the DecisionTree model performed best at predicting flight prices, and the performance could be increased further by adjusting hyperparameters. The investigation could be continued by using a larger dataset to examine how the model performs with more variables and under broader conditions. Additionally, the model could be repurposed to build a user-friendly flight price prediction tool that helps consumers with their purchasing decisions.
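The identical performance of Linear Regression and Ridge Regression noted above has one plausible explanation, offered here as an assumption and not as the authors' stated finding: on a large dataset, a small ridge penalty barely changes the fitted coefficients. The sketch below illustrates this with a single feature in pure Python. The synthetic data, the helper names `fit_slope` and `r_squared`, and the feature-to-price relationship are all hypothetical; `alpha=1.0` is chosen because it matches scikit-learn's default Ridge penalty, and the paper's actual features and settings are not reproduced.

```python
import random


def fit_slope(xs, ys, alpha=0.0):
    """Closed-form fit for one feature with an unpenalized intercept.

    alpha=0.0 gives ordinary least squares; alpha>0 gives ridge regression.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # the penalty only enlarges the denominator
    intercept = y_mean - slope * x_mean
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


random.seed(0)
# Hypothetical data: a made-up feature (e.g. days before departure) vs. price.
xs = [random.uniform(1, 50) for _ in range(10_000)]
ys = [9000.0 - 100.0 * x + random.gauss(0, 500) for x in xs]

ols = fit_slope(xs, ys, alpha=0.0)
ridge = fit_slope(xs, ys, alpha=1.0)
r2_ols = r_squared(xs, ys, *ols)
r2_ridge = r_squared(xs, ys, *ridge)

print("OLS slope:  ", ols[0], "R^2:", r2_ols)
print("Ridge slope:", ridge[0], "R^2:", r2_ridge)
```

With thousands of rows, the sum of squared deviations of the feature dwarfs the penalty, so the two slopes and their R-squared scores agree to several decimal places. A much larger `alpha`, or a much smaller dataset, would be needed before the two models diverged visibly.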
References or Bibliography
Allwright, Stephen. 2022. “MSE vs MAE, Which Is the Better Regression Metric?” Stephen Allwright. July 7, 2022. https://stephenallwright.com/mse-vs-mae/.
Bathwal, Shubham. n.d. “Flight Price Prediction.” Www.kaggle.com. https://www.kaggle.com/datasets/shubhambathwal/flight-price-prediction.
Castillo, Dianne. 2021. “Machine Learning Regression Explained.” Seldon. October 29, 2021. https://www.seldon.io/machine-learning-regression-explained#:~:text=Machine%20Learning%20Regression%20is%20a.
ChatGPT. 2023. “Response to ‘What Does It Mean, When My Linear Regression and Ridge Regression Model Perform the Exact Same Way?’” July 8, 2023. https://chat.openai.com.
“Decision Tree.” n.d. CORP-MIDS1 (MDS). Accessed July 8, 2023. https://www.mastersindatascience.org/learning/machine-learning-algorithms/decision-tree/#:~:text=A%20decision%20tree%20is%20a.
Fernando, Jason. 2021. “R-Squared Definition.” Investopedia. September 12, 2021. https://www.investopedia.com/terms/r/r-squared.asp.
Geisler Mesevage, Tobias. 2021. “What Is Data Preprocessing & What Are the Steps Involved?” MonkeyLearn Blog. May 24, 2021. https://monkeylearn.com/blog/data-preprocessing/.
Hayward, Justin, Daniel Martínez Garbuno, and Pranjal Pande. 2020. “How Airline Ticket Pricing Works.” Simple Flying. October 22, 2020. https://simpleflying.com/how-airline-ticket-pricing-works/#future-of-airline-pricing.
“How Linear Regression Algorithm Works—ArcGIS pro | Documentation.” n.d. Pro.arcgis.com. Accessed July 8, 2023. https://pro.arcgis.com/en/pro-app/latest/tool-reference/geoai/how-linear-regression-works.htm.
IBM. n.d. “What Is Supervised Learning? | IBM.” Www.ibm.com. Accessed June 8, 2023. https://www.ibm.com/topics/supervised-learning.
“Linear Regression in Machine Learning - Javatpoint.” n.d. Www.javatpoint.com. Accessed July 8, 2023. https://www.javatpoint.com/linear-regression-in-machine-learning.
Mark, Lois Alter. 2021. “This Is the Best Time to Buy Flights.” Reader’s Digest. December 6, 2021. https://www.rd.com/article/when-to-buy-plane-tickets/.
Numpy. 2009. “NumPy.” Numpy.org. 2009. https://numpy.org/.
Nyuytiymbiy, Kizito. 2022. “Parameters and Hyperparameters in Machine Learning and Deep Learning.” Medium. January 15, 2022. https://towardsdatascience.com/parameters-and-hyperparameters-aa609601a9ac#:~:text=Hyperparameters%20are%20parameters%20whose%20values.
Pandas. 2018. “Python Data Analysis Library — Pandas: Python Data Analysis Library.” Pydata.org. 2018. https://pandas.pydata.org/.
scikit-learn. 2019. “Scikit-Learn: Machine Learning in Python.” Scikit-Learn.org. 2019. https://scikit-learn.org/stable/.
“Sklearn.svm.SVR — Scikit-Learn 0.23.1 Documentation.” n.d. Scikit-Learn.org. Accessed July 9, 2023. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html.
Willaert, Jorrit. 2021. “How to Calculate the Mean and Standard Deviation — Normalizing Datasets in Pytorch.” Medium. Towards Data Science. September 24, 2021. https://towardsdatascience.com/how-to-calculate-the-mean-and-standard-deviation-normalizing-datasets-in-pytorch-704bd7d05f4c#:~:text=The%20data%20can%20be%20normalized,channel%20is%20normalized%20this%20way.
Copyright (c) 2023 Jeremy Rohan Bollack; Joseph Anthony Vincent
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.