Detecting Fake News Using Machine Learning
DOI:
https://doi.org/10.47611/jsrhs.v12i1.3940
Keywords:
Fake News, Artificial Intelligence, Machine Learning, Linear SVC, Natural Language Processing
Abstract
Fake news has had a significant effect on society and politics. To help combat the spread of misinformation, we developed machine learning models that detect fake news from textual data. We vectorized the text with a count vectorizer and fed the result into Logistic Regression, Support Vector Machine (SVM), and Linear Support Vector Classifier (SVC) models. The highest accuracy achieved was 99.97%, with the Linear SVC. We discovered, however, a significant difference in how the real and fake news datasets were constructed that would not carry over to real-world use: the true news articles contained quotation marks, apostrophes, and dashes, while these characters were absent from the fake news articles. Because of this, we also developed a more broadly applicable Logistic Regression model trained with these characters removed from the dataset altogether; it achieved an accuracy of 98.4%.
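The character-removal step and the count vectorization described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names, the exact set of stripped characters (straight and typographic variants), and the simple word tokenizer are all assumptions made for illustration. Libraries such as scikit-learn provide a `CountVectorizer` class that performs the counting step in practice.

```python
import re
from collections import Counter

# Characters the abstract reports as appearing only in the true-news articles:
# quotation marks, apostrophes, and dashes. Including the typographic variants
# is an assumption; the source does not list exact code points.
LEAKY_CHARS = "\"'-\u201c\u201d\u2018\u2019\u2013\u2014"
_STRIP_TABLE = str.maketrans("", "", LEAKY_CHARS)


def strip_leaky_chars(text):
    """Delete every quotation mark, apostrophe, and dash from the text."""
    return text.translate(_STRIP_TABLE)


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical tokenizer)."""
    return re.findall(r"\w+", text.lower())


def build_vocabulary(docs):
    """Map each distinct token in the corpus to a column index."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def count_vectorize(doc, vocab):
    """Return the bag-of-words count vector of one document."""
    vector = [0] * len(vocab)
    for tok, n in Counter(tokenize(doc)).items():
        if tok in vocab:
            vector[vocab[tok]] = n
    return vector


# Two made-up sentences that differ only in the characters listed above.
docs = [
    'The senator said "we won\'t stop"',
    "the senator said we wont stop",
]

cleaned = [strip_leaky_chars(d) for d in docs]
vocab = build_vocabulary(cleaned)
vectors = [count_vectorize(d, vocab) for d in cleaned]
```

Before cleaning, the two sentences tokenize differently (`won't` splits into `won` and `t`), so a classifier could separate them on punctuation alone. After cleaning they map to identical count vectors, which is the kind of shortcut the second model in the abstract is meant to rule out.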
Copyright (c) 2023 Elsa Norman
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.