Journal Sensi: Strategic of Education in Information System
Vol 11 No 2 (2025): Journal SENSI

Logistic Regression and TF-IDF Models for Early Detection of Indonesian-Language Fake News

Arribathi, Abdul Hamid (Unknown)
Jungjunan, Sunan Reihan (Unknown)
Dwiharyanto, Muhammad Adrian (Unknown)



Article Info

Publish Date
31 Aug 2025

Abstract

The spread of false information, particularly on social media platforms, has become a significant challenge because digital content circulates so rapidly. Most existing fake news detection systems are designed for English, which makes them less effective on Indonesian text. This study proposes a web-based hoax detection system that combines Term Frequency-Inverse Document Frequency (TF-IDF) features with a Logistic Regression classifier. The dataset, consisting of both real and fake news articles, was obtained from Kaggle and preprocessed through normalization, stopword removal, and stemming. The resulting TF-IDF vectors were then used to train a binary classification model. The system accepts user input as either raw text or a URL and returns classification results in real time. Evaluation of the system indicates a high level of accuracy and strong potential for improving public digital literacy. These findings demonstrate a lightweight yet effective approach to mitigating the spread of hoaxes in the Indonesian language.
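
The pipeline described in the abstract (text normalization, stopword removal, TF-IDF weighting, binary logistic regression) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, omits stemming, and trains with a hand-written stochastic gradient descent loop. The names STOPWORDS, tokenize, fit_tfidf, transform, train_logreg, and predict_proba are hypothetical, the stopword list is a tiny placeholder, and the four training sentences are invented toy examples rather than items from the Kaggle dataset. The IDF formula is the smoothed variant ln((1 + n) / (1 + df)) + 1, chosen here as an assumption.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption, not the authors' list).
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk", "dengan", "pada"}

def tokenize(text):
    """Normalize to lowercase alphabetic tokens and drop stopwords (no stemming here)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def fit_tfidf(docs):
    """Learn a vocabulary and smoothed IDF weights from tokenized documents."""
    df = Counter(term for doc in docs for term in set(doc))
    vocab = {term: i for i, term in enumerate(sorted(df))}
    n = len(docs)
    idf = {term: math.log((1 + n) / (1 + df[term])) + 1.0 for term in vocab}
    return vocab, idf

def transform(doc, vocab, idf):
    """Return an L2-normalized sparse TF-IDF vector as {feature_index: weight}."""
    counts = Counter(t for t in doc if t in vocab)
    vec = {vocab[t]: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {i: v / norm for i, v in vec.items()}

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(vectors, labels, dim, lr=0.5, epochs=200):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            err = sigmoid(b + sum(w[i] * v for i, v in x.items())) - y
            for i, v in x.items():
                w[i] -= lr * err * v
            b -= lr * err
    return w, b

def predict_proba(text, vocab, idf, w, b):
    """Estimated probability that the text belongs to class 1 (hoax)."""
    x = transform(tokenize(text), vocab, idf)
    return sigmoid(b + sum(w[i] * v for i, v in x.items()))

# Invented toy data for illustration only: 0 = real, 1 = hoax.
train_texts = [
    "Pemerintah mengumumkan jadwal vaksinasi resmi di puskesmas",
    "Bank Indonesia merilis data inflasi bulan ini",
    "Sebarkan segera! Minum air garam menyembuhkan semua penyakit",
    "Viral! Bagikan pesan ini agar mendapat pulsa gratis",
]
train_labels = [0, 0, 1, 1]

docs = [tokenize(t) for t in train_texts]
vocab, idf = fit_tfidf(docs)
vectors = [transform(d, vocab, idf) for d in docs]
w, b = train_logreg(vectors, train_labels, dim=len(vocab))

score = predict_proba("Bagikan segera pesan viral ini", vocab, idf, w, b)
print(f"P(hoax) = {score:.2f}")
```

A production system would more likely rely on an established machine learning library and a dedicated Indonesian stemmer. The sketch only shows how the stages fit together, with a sparse dictionary per document standing in for a TF-IDF matrix row.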

Copyrights © 2025






Journal Info

Abbrev

sensi

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

Soft Computing research focusing on Data Mining, Neural Network, Swarm Intelligence, Decision Tree, Data Clustering, Data Classification, Rough Set, Pattern Recognition, and Image Processing. Software Engineering focusing on Software Requirement and Specification, Software ...