Comparing the Effectiveness of Support Vector Classifier and Stochastic Gradient Descent in Hate-Speech Detection
Keywords:
Support Vector Classifier, Stochastic Gradient Descent, Hate-Speech Detection
Abstract
The increased use of social media, now easily accessible to most people in the world, has given rise to a multitude of problems, with cyberbullying and online hate-speech standing out as significant issues. Because users can choose to remain anonymous and post things that would be considered uncivil in a one-to-one, real-life conversation, online hate-speech has spread widely, posing significant societal challenges and detrimental effects on individuals’ mental health. In this paper, we explore two simple classifiers, Support Vector Classifier (SVC) and Stochastic Gradient Descent (SGD), which are compared and analysed through their accuracy scores to determine their effectiveness in detecting hate-speech within the context of Twitter data. To train the models, we use a publicly available dataset by Analytics Vidhya, available on Kaggle.com, which contains 32k tweets, each labelled ‘1’ if it is sexist/racist or ‘0’ if it is not. The goal of this paper is to identify the differences in performance between the two classifiers in hate-speech detection.
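The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's actual pipeline: the use of scikit-learn, the TF-IDF features, the linear kernel, the hinge loss, the 75/25 split, and the tiny placeholder corpus are all assumptions made for illustration, and the placeholder texts are invented rather than drawn from the Kaggle dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder corpus (invented for illustration, NOT the Kaggle data).
# Label 1 = sexist/racist, label 0 = not, matching the scheme in the abstract.
tweets = [
    "abusive insult aimed at a group",
    "hateful insult targeting a minority",
    "abusive hateful remark about women",
    "insult and abuse aimed at a race",
    "hateful abusive slur about a group",
    "abuse targeting women and minorities",
    "lovely sunny day at the park",
    "enjoying coffee with good friends",
    "great match tonight, well played",
    "happy birthday to my lovely sister",
    "new book arrived, enjoying it",
    "good morning, sunny day ahead",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hold out a stratified test set so both classes appear in each split.
X_train_txt, X_test_txt, y_train, y_test = train_test_split(
    tweets, labels, test_size=0.25, random_state=0, stratify=labels
)

# Assumed feature extraction: TF-IDF fitted on the training tweets only.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_txt)
X_test = vectorizer.transform(X_test_txt)

# Assumed settings: a linear-kernel SVC and an SGD-trained linear model
# with hinge loss.
models = {
    "SVC": SVC(kernel="linear"),
    "SGD": SGDClassifier(loss="hinge", random_state=0),
}

# Fit each model and record its accuracy on the held-out tweets.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} accuracy: {scores[name]:.2f}")
```

On real data, the placeholder lists would be replaced by the tweet texts and their 0/1 labels; the scores from this toy corpus say nothing about either classifier's actual performance.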
License
Copyright (c) 2023 Dania Ali
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The copyright holder for this article has granted JSR.org a license to display the article in perpetuity.