Jurnal Ilmiah Teknik Elektro Komputer dan Informatika (JITEKI)
Vol. 10 No. 3 (2024): September

From Text to Truth: Leveraging IndoBERT and Machine Learning Models for Hoax Detection in Indonesian News

Muhammad Yusuf Ridho (Universitas Indonesia)
Evi Yulianti (Universitas Indonesia)



Article Info

Publish Date
10 Sep 2024

Abstract

In the era of technology and information exchange online content being deceitful poses a serious threat to public trust and social harmony on a global scale. Detective mechanisms to identify content are essential for safeguard the populace effectively. This study is dedicated to creating a machine learning system that can automatically spot deceptive content in Indonesian language by utilizing IndoBERT. A model specifically tailored for the intricacies of the Indonesian language. IndoBERT was selected due to its capacity to grasp the linguistic nuances present, in Indonesian text which are often challenging for other models built upon the BERT framework. The key focus of this study lies in conducting an assessment of the IndoBERT model in relation to other approaches used in past research for identifying fake news like CNN LSTM and various classification models such as Logistic Regression and Naïve Bayes among others. To address the issue of imbalanced data between valid labels in fake news detection tasks we employed the SMOTE oversampling technique, for data augmentation and balancing purposes. The dataset employed consists of Indonesian language news articles publicly available and categorized as either hoax or valid following assessment by three judges voting system. IndoBERT Large demonstrated performance by achieving an accuracy rate of 98% outperform the original datasets 92% when tested on the oversampled dataset. Utilizing the SMOTE oversampling technique aided in data balance and enhancing the models performance. These outcomes highlight IndoBERTs capabilities in detecting fake news and pave the way for its potential integration, into real world scenarios.

Copyrights © 2024






Journal Info

Abbrev

JITEKI

Publisher

Subject

Computer Science & IT Electrical & Electronics Engineering

Description

JITEKI (Jurnal Ilmiah Teknik Elektro Komputer dan Informatika) is a peer-reviewed, scientific journal published by Universitas Ahmad Dahlan (UAD) in collaboration with Institute of Advanced Engineering and Science (IAES). The aim of this journal scope is 1) Control and Automation, 2) Electrical ...