Bulletin of Social Informatics Theory and Application
Vol. 8 No. 2 (2024)

Comparison of K-Nearest Neighbor and Naïve Bayes algorithms for hoax classification in Indonesian health news

Pratomo, Awang Hendrianto
Rachmad, Faiz
Kodong, Frans Richard



Article Info

Publish Date
09 Dec 2024

Abstract

Classifying health-related news is essential for determining whether a report is factual or a hoax. This paper compares the accuracy of two algorithms, K-Nearest Neighbor (KNN) and the Naïve Bayes classifier, for classifying hoaxes in health news. Text mining was applied, with features extracted from the news headlines using the TF-IDF method as input to the classifiers. The system was developed using a prototype model. The models were assessed with confusion matrices and k-fold cross-validation. The KNN model with K=3 attained an average accuracy of 82.91%, precision of 85.3%, and recall of 79.38% with no predictors included. The best performance was recorded for the Naïve Bayes model, which, compared against the K=3 KNN model, reached an average accuracy of 86.42%, precision of 88.10%, and recall of 84.05%. These findings suggest that Naïve Bayes outperforms KNN in classifying health news as hoax or fact, as evaluated with the confusion matrix. Although related studies have been conducted in the past, this study differs in its preprocessing methods, dataset size, and outcomes. The dataset consists of 1,219 news headlines labelled as hoaxes and 1,227 labelled as facts.
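
To make the pipeline described in the abstract concrete (TF-IDF features from headlines, KNN with K=3, Naïve Bayes, k-fold cross-validation, and confusion-matrix metrics), here is a minimal Python sketch using only the standard library. It is not the authors' implementation. The whitespace tokenizer, cosine similarity as the KNN measure, multinomial Naïve Bayes on raw term counts with Laplace smoothing, the unshuffled fold split, all function names, and the English toy headlines are assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real preprocessing)."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency learned from the training headlines."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def knn_fit_predict(train_docs, train_labels, test_docs, k=3):
    """Majority vote among the k most similar training headlines."""
    idf = fit_idf(train_docs)
    train_vecs = [tfidf(d, idf) for d in train_docs]
    preds = []
    for doc in test_docs:
        x = tfidf(doc, idf)
        ranked = sorted(zip(train_vecs, train_labels),
                        key=lambda pair: cosine(x, pair[0]), reverse=True)
        votes = Counter(label for _, label in ranked[:k])
        preds.append(votes.most_common(1)[0][0])
    return preds

def nb_fit_predict(train_docs, train_labels, test_docs):
    """Multinomial Naive Bayes on raw term counts with Laplace smoothing."""
    classes = sorted(set(train_labels))
    vocab = {t for d in train_docs for t in tokenize(d)}
    counts = {c: Counter() for c in classes}
    for doc, label in zip(train_docs, train_labels):
        counts[label].update(tokenize(doc))
    prior = {c: math.log(train_labels.count(c) / len(train_labels)) for c in classes}
    denom = {c: sum(counts[c].values()) + len(vocab) for c in classes}
    preds = []
    for doc in test_docs:
        tokens = [t for t in tokenize(doc) if t in vocab]  # ignore unseen terms
        scores = {c: prior[c] + sum(math.log((counts[c][t] + 1) / denom[c]) for t in tokens)
                  for c in classes}
        preds.append(max(scores, key=scores.get))
    return preds

def evaluate(y_true, y_pred, positive="hoax"):
    """Accuracy, precision, and recall from the 2x2 confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def cross_validate(docs, labels, fit_predict, folds=3):
    """Average the metrics over `folds` held-out splits (no shuffling here)."""
    totals = {"accuracy": 0.0, "precision": 0.0, "recall": 0.0}
    for f in range(folds):
        test_idx = set(range(f, len(docs), folds))
        train = [(d, y) for i, (d, y) in enumerate(zip(docs, labels)) if i not in test_idx]
        test = [(d, y) for i, (d, y) in enumerate(zip(docs, labels)) if i in test_idx]
        preds = fit_predict([d for d, _ in train], [y for _, y in train], [d for d, _ in test])
        for name, value in evaluate([y for _, y in test], preds).items():
            totals[name] += value
    return {name: value / folds for name, value in totals.items()}

# Invented toy headlines, for illustration only (not from the study's dataset).
docs = [
    "miracle herb cures cancer overnight",
    "garlic water cures diabetes instantly",
    "miracle drink cures covid instantly",
    "ministry reports new vaccine schedule",
    "hospital reports dengue cases rising",
    "ministry opens new hospital clinic",
]
labels = ["hoax"] * 3 + ["fact"] * 3
queries = ["miracle garlic cures flu", "ministry reports hospital schedule"]

print(knn_fit_predict(docs, labels, queries))  # ['hoax', 'fact']
print(nb_fit_predict(docs, labels, queries))   # ['hoax', 'fact']
print(cross_validate(docs, labels, knn_fit_predict))
print(cross_validate(docs, labels, nb_fit_predict))
```

On six toy headlines the cross-validated scores carry no meaning. The sketch only shows where each reported quantity (average accuracy, precision, recall) comes from. A real experiment would shuffle or stratify the folds and apply language-specific preprocessing to the Indonesian headlines.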

Copyrights © 2024






Journal Info

Abbrev

businta

Subject

Computer Science & IT; Social Sciences

Description

Bulletin of Social Informatics Theory and Application (ISSN 2614-0047) is an interdisciplinary scientific journal for researchers from Computer Science, Informatics, Social Sciences, and Management Sciences to share ideas and opinions, and present original research work on studying the interplay ...