Journal : Informatics, Electrical and Electronics Engineering

Comparative Analysis of Machine Learning Models for Identifying Cybercrimes in Social Media Comments
Fauzan, Abd. Charis; Arifin, Mochammad; Mafula, Veradella Yuelisa
Jurnal Teknik Elektro dan Informatika Vol 4 No 2 (2024): INFOTRON
Publisher : Universitas Islam Malang

DOI: 10.33474/infotron.v4i2.23069

Abstract

The rapid growth of social media has created opportunities for digital interaction but has also introduced challenges, particularly in addressing cybercrimes such as defamation, threats, and SARA-related content (content targeting ethnicity, religion, race, or inter-group relations). Cybercrime detection on social media is critical because it helps mitigate the spread of harmful behavior, safeguard users, and support law enforcement in addressing violations of Indonesia's Information and Electronic Transactions Law (UU ITE). This study conducts a comparative analysis of three machine learning algorithms, namely Naive Bayes, Support Vector Machines (SVM), and Random Forest, to identify cybercrimes in social media comments. The study uses a sentiment-labeled dataset obtained from Kaggle, consisting of Indonesian social media comments from Twitter (X), in which each comment is assigned to one of seven classes: Neutral Sentiment, Positive Sentiment, Negative Sentiment, Insulting Government, Insulting or Defaming Others, Threatening Others, and SARA-Based Content. The results show that Random Forest achieved the highest overall accuracy (91%) and performed best in detecting moderately represented classes such as Insulting Government. SVM demonstrated robust performance with 88% accuracy and was particularly strong on dominant classes such as Negative Sentiment. Naive Bayes, though computationally efficient, struggled with minority classes and achieved an accuracy of 73%. The dataset's class imbalance posed challenges for all three algorithms, particularly for underrepresented categories. This limitation underscores the need for more diverse and representative datasets to improve model performance and ensure broader applicability of the findings.
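
The abstract reports overall accuracy alongside weaker results on minority classes. The sketch below is an illustration of why those two observations can coexist on an imbalanced dataset; it is not the authors' evaluation code. The labels, class names, and counts are made up for the example, and the helper functions (`accuracy`, `per_class_recall`, `macro_recall`) are hypothetical names written with the Python standard library only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of comments whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per class: correct predictions divided by true members of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so each class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy data (an assumption for illustration, NOT taken from the study):
# 90 comments in a dominant class and 10 in a minority class.
y_true = ["negative"] * 90 + ["threat"] * 10

# Hypothetical classifier output: every "negative" comment is correct,
# but only 2 of the 10 "threat" comments are recognised.
y_pred = ["negative"] * 90 + ["threat"] * 2 + ["negative"] * 8

print("accuracy:", accuracy(y_true, y_pred))
print("per-class recall:", per_class_recall(y_true, y_pred))
print("macro recall:", macro_recall(y_true, y_pred))
```

On this toy data, 92 of 100 comments are classified correctly, yet only 2 of the 10 minority-class comments are found. Reporting per-class or macro-averaged recall next to accuracy makes that gap visible, which is one common way to read results on imbalanced data.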