The spread of hoax news on social media causes social unrest and economic losses. This study builds a classification model for Indonesian-language hoax news using Term Frequency-Inverse Document Frequency (TF-IDF) features and a Support Vector Machine (SVM). The dataset consists of 970 news items from TurnBackHoax.id labeled with the FALSE and FRAUD categories. The workflow comprises text preprocessing, TF-IDF feature extraction over unigrams and bigrams, and classification with a linear-kernel SVM. The data were split 80:20 using stratified sampling, and hyperparameters were tuned with Grid Search and 5-fold cross-validation. Evaluation shows that the model classifies hoax news well in terms of accuracy, precision, recall, and F1-score. The confusion matrix indicates that most items were classified correctly, with errors occurring in news items whose linguistic patterns overlap across categories. The results indicate that combining TF-IDF and SVM is effective for Indonesian hoax detection while keeping computational requirements low. Future work should use larger datasets and compare against deep learning methods.
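The workflow described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the study's actual code: the toy texts are hypothetical placeholders rather than TurnBackHoax.id data, the preprocessing step is omitted, and the `C` grid, the `f1_macro` scoring choice, and the random seeds are assumptions that the abstract does not specify.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical placeholder texts, NOT the TurnBackHoax.id dataset.
false_docs = [f"kabar palsu tentang vaksin nomor {i}" for i in range(20)]
fraud_docs = [f"penipuan hadiah undian pulsa nomor {i}" for i in range(20)]
texts = false_docs + fraud_docs
labels = ["FALSE"] * 20 + ["FRAUD"] * 20

# 80:20 split, stratified so both classes keep their proportions.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# TF-IDF over unigrams and bigrams, followed by a linear-kernel SVM.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("svm", SVC(kernel="linear")),
])

# Assumed search grid; the abstract does not list the tuned values.
param_grid = {"svm__C": [0.1, 1, 10]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="f1_macro")
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out 20%.
y_pred = search.predict(X_test)
print(search.best_params_)
print(confusion_matrix(y_test, y_pred, labels=["FALSE", "FRAUD"]))
print(classification_report(y_test, y_pred, zero_division=0))
```

Placing the vectorizer inside the pipeline means TF-IDF statistics are refit on each training fold during cross-validation, so no vocabulary or document-frequency information leaks from the validation folds into the tuning step.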
Copyrights © 2026