Legal violations in Indonesia, particularly those under the Criminal Code (KUHP) and the Information and Electronic Transactions Law (UU ITE), are often difficult for the general public to interpret because of the complexity of legal language and article structures. This research builds a multi-label classification model that automatically identifies the relevant legal articles from user-provided case descriptions. Two models were developed and compared: Bidirectional Long Short-Term Memory (Bi-LSTM) and IndoBERT. Using a manually labeled dataset, both models were evaluated with accuracy, F1-score, and Hamming Loss under 5-fold cross-validation. IndoBERT outperformed Bi-LSTM, reaching an average accuracy of 97% and a Hamming Loss of 0.027. However, a t-test revealed no statistically significant difference in F1-scores, which suggests that the two models are comparably effective at capturing multiple labels. Confusion matrix analysis further identified patterns of misclassification among semantically similar articles. This study demonstrates the potential of NLP and deep learning to support legal awareness and give the public easier access to legal information.
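To make the evaluation metrics concrete, the following is a minimal sketch of how Hamming Loss, exact-match accuracy, and micro-averaged F1 can be computed for multi-label predictions. The label names and the toy predictions are hypothetical and are not taken from the study's dataset; treating "accuracy" as exact-match (subset) accuracy and F1 as micro-averaged is also an assumption, since the abstract does not state which variants were used.

```python
# Toy multi-label evaluation. Each row is one case description,
# each column is one legal article (1 = article applies).
# LABELS and the data below are hypothetical illustrations only.
LABELS = ["article_A", "article_B", "article_C", "article_D"]

y_true = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
y_pred = [
    [1, 0, 0, 0],  # misses article_C
    [0, 1, 0, 0],  # exact match
    [1, 1, 0, 1],  # adds a spurious article_D
]


def hamming_loss(true_rows, pred_rows):
    """Fraction of individual label decisions that are wrong."""
    wrong = sum(
        t != p
        for t_row, p_row in zip(true_rows, pred_rows)
        for t, p in zip(t_row, p_row)
    )
    total = len(true_rows) * len(true_rows[0])
    return wrong / total


def subset_accuracy(true_rows, pred_rows):
    """Fraction of samples whose full label set is predicted exactly."""
    exact = sum(t_row == p_row for t_row, p_row in zip(true_rows, pred_rows))
    return exact / len(true_rows)


def micro_f1(true_rows, pred_rows):
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for t_row, p_row in zip(true_rows, pred_rows):
        for t, p in zip(t_row, p_row):
            tp += t == 1 and p == 1
            fp += t == 0 and p == 1
            fn += t == 1 and p == 0
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


hl = hamming_loss(y_true, y_pred)
acc = subset_accuracy(y_true, y_pred)
f1 = micro_f1(y_true, y_pred)
print(f"Hamming Loss: {hl:.3f}, subset accuracy: {acc:.3f}, micro F1: {f1:.3f}")
```

In this toy case two of the twelve label decisions are wrong, so the Hamming Loss is low even though only one of the three samples is matched exactly. This illustrates why a model can score well on Hamming Loss while differences in F1 between models remain small.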
Copyright © 2025