Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

Journals this author has published in:

Jurnal Komputer Teknologi Informasi Sistem Komputer (JUKTISI)

Haris Setyo Pratomo

Universitas Bina Sarana Informatika

Author-ID : 10290665

Subjects: Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management; Education; Engineering

Published: 1 document

Articles

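Perbandingan Algoritma Machine Learning untuk Klasifikasi Hoaks Berbahasa Indonesia pada Dataset Komdigi (Comparison of Machine Learning Algorithms for Indonesian-Language Hoax Classification on the Komdigi Dataset)
Haris Setyo Pratomo; Panny Agustia Rahayuningsih; Muhammad Rezki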
Jurnal Komputer Teknologi Informasi Sistem Komputer (JUKTISI) Vol. 5 No. 1 (2026): Juni 2026
Publisher : LKP KARYA PRIMA KURSUS

DOI: 10.62712/juktisi.v5i1.1255

Indonesian-language hoaxes continue to spread as digital platforms grow, which calls for an automatic classification system that can categorize hoax types accurately and efficiently. This study compares five machine learning algorithms, namely Support Vector Machine (SVM), Random Forest, Logistic Regression, Decision Tree, and Naive Bayes, on the task of classifying Indonesian hoax categories, using the Komdigi dataset of 16,308 articles across six categories. Features were represented with TF-IDF over unigrams and bigrams (n-gram range (1, 2)), enriched with statistical text features. The extreme class imbalance was handled with SMOTE applied inside each fold of the Stratified K-Fold Cross-Validation pipeline, so that synthetic samples are generated from training data only and no information leaks into the validation folds. SVM (LinearSVC) achieved the highest accuracy, 95.9%, and the highest cross-validation score, 0.960. Logistic Regression led on macro AUC (0.952) and macro F1-score (0.460), reflecting the most balanced recognition across all categories. Decision Tree performed worst, with a macro AUC of 0.635. These findings confirm that the best algorithm depends on which evaluation metric is prioritized for the application at hand. The study contributes a recommendation of effective algorithms for Indonesian hoax classification and a methodological framework that is free of data leakage.
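The gap the abstract reports between accuracy (95.9%) and macro F1-score (0.460) is typical of imbalanced data. The following minimal sketch illustrates the effect in plain Python. The toy labels, class names, and the helper functions `accuracy` and `macro_f1` are assumptions made for illustration; they are not the authors' code or data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, over the classes present in y_true."""
    classes = sorted(set(y_true))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 95 articles in a majority class, 5 in a minority class.
y_true = ["majority"] * 95 + ["minority"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["majority"] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}  macro_f1={f1:.3f}")
# prints: accuracy=0.950  macro_f1=0.487
```

Because macro F1 weights every class equally, a model that ignores a rare class is penalized heavily even when its overall accuracy looks strong, which is why the two metrics can rank algorithms differently.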

Co-Authors: Muhammad Rezki; Panny Agustia Rahayuningsih
