Jurnal Informatika: Jurnal Pengembangan IT
Vol 10, No 4 (2025)

Comparison of IndoBERT and Bi-LSTM Models for Indonesian Law Violation Text Classification

Pramana, Made Wahyu Adwitya
Putri, Desy Purnami Singgih
Purnawan, I Ketut Adi



Article Info

Publish Date
15 Sep 2025

Abstract

Legal violations in Indonesia, particularly those covered by the Criminal Code (KUHP) and the Information and Electronic Transactions Law (UU ITE), are often difficult for the general public to interpret because of the complexity of legal language and article structures. This research aims to build a multi-label classification model that automatically identifies the relevant legal articles from user-provided case descriptions. Two models were developed and compared: Bidirectional Long Short-Term Memory (Bi-LSTM) and IndoBERT. Using a manually labeled dataset, both models were evaluated with accuracy, F1-score, and Hamming Loss under 5-fold cross-validation. The results showed that IndoBERT outperformed Bi-LSTM, with an average accuracy of 97% and a Hamming Loss of 0.027. However, a t-test revealed no statistically significant difference in F1-scores, indicating that the two models are comparably effective at capturing multiple labels. A confusion matrix analysis further identified patterns of misclassification among semantically similar articles. This study demonstrates the potential of NLP and deep learning to support legal awareness and to give the public easier access to legal information.
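
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the label matrices and per-fold F1 values below are made-up toy numbers, and the choices of subset (exact-match) accuracy, micro-averaged F1, and a paired t-test over folds are assumptions, since the abstract does not say which variants were used.

```python
from math import sqrt
from statistics import mean, stdev


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that were predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose whole label vector matches exactly (assumed variant)."""
    return sum(rt == rp for rt, rp in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from counts pooled over all labels (assumed averaging)."""
    tp = fp = fn = 0
    for rt, rp in zip(y_true, y_pred):
        for t, p in zip(rt, rp):
            tp += (t == 1 and p == 1)
            fp += (t == 0 and p == 1)
            fn += (t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def paired_t_statistic(scores_a, scores_b):
    """t statistic for per-fold score differences (paired test is an assumption)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Toy data: 4 case descriptions, 3 hypothetical legal articles (1 = article applies).
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 1, 1]]

# Hypothetical per-fold F1 scores for two models over 5 folds (not from the study).
f1_model_a = [0.90, 0.92, 0.91, 0.93, 0.89]
f1_model_b = [0.89, 0.90, 0.92, 0.91, 0.90]

print("Hamming loss:", hamming_loss(y_true, y_pred))        # 2 of 12 slots wrong
print("Subset accuracy:", subset_accuracy(y_true, y_pred))  # 2 of 4 rows exact
print("Micro F1:", micro_f1(y_true, y_pred))
print("Paired t statistic:", paired_t_statistic(f1_model_a, f1_model_b))
```

The toy numbers show why the metrics can disagree: a single wrong label makes the whole sample count as an error under subset accuracy, while Hamming loss charges only that one slot. A model can therefore lead on accuracy and Hamming Loss while the F1 difference across folds remains too small to be statistically significant.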

Copyright © 2025






Journal Info

Abbrev

informatika

Subject

Computer Science & IT

Description

The scope encompasses Informatics Engineering, Computer Engineering, and Information Systems, including, but not limited to, the following: 1. Information Systems: information management, e-Government, e-business and e-commerce, spatial information systems, geographical information systems, IT governance ...