The spread of false information has become a major challenge in Indonesian society, with 2,484 cases recorded in 2022, which underscores the need for a system that can effectively identify and filter out fake news. This research aims to develop a more accurate fake news detection model for imbalanced datasets by applying logistic regression, with hyperparameters tuned by grid search and random oversampling used to address the class imbalance. The dataset used is the Indonesian Fake News dataset, which consists of 4,231 entries in two categories: valid (3,465 entries) and hoax (766 entries). Preprocessing steps include stemming, stopword removal, and text normalization, followed by TF-IDF feature weighting. Random oversampling was then applied to balance the hoax and valid classes, and grid search was used to select the model parameters. The results show that the optimized logistic regression achieved the highest accuracy, 93%, surpassing Naive Bayes, which achieved 86%. These findings suggest that the developed fake news detection model can be used to improve social media news monitoring systems and to increase digital literacy among Indonesians.
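The random oversampling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_oversample`, the fixed seed, and the placeholder documents are assumptions made for illustration, and only the class counts (3,465 valid, 766 hoax) come from the abstract.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the documents by class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest one.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)

    # Shuffle so that the classes are not stored in contiguous blocks.
    order = list(range(len(out_texts)))
    rng.shuffle(order)
    return [out_texts[i] for i in order], [out_labels[i] for i in order]


# Placeholder documents that only reproduce the class counts in the abstract.
texts = [f"valid-{i}" for i in range(3465)] + [f"hoax-{i}" for i in range(766)]
labels = ["valid"] * 3465 + ["hoax"] * 766

balanced_texts, balanced_labels = random_oversample(texts, labels)
counts = Counter(balanced_labels)
print(counts["valid"], counts["hoax"])
```

After the call, both classes contain 3,465 entries, so the balanced set has 6,930 documents. A common precaution, not stated in the abstract, is to oversample only the training split so that duplicated hoax entries cannot appear in both training and test data.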
Copyright © 2025