Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI)
Vol. 14 No. 2 (2025)

Document Matching for Contradiction Detection in Low-Resource Legislative Texts With Self-Training and Augmentation Using Transformer Model

Navastara, Dini Adni
Abdillah, Surya
Benito, Davian
Adillion, Ilham Gurat
Purwitasari, Diana



Article Info

Publish Date
14 Jul 2025

Abstract

Detecting contradictions in low-resource legislative texts is difficult because labeled data are limited, legal language is complex, and legal documents contain a very large number of verses. Left unaddressed, these contradictions can lead to legal ambiguity and disputes. To tackle this problem, this study proposes a system that combines document matching with contradiction detection. Legal documents are first clustered by contextual similarity, which enables a more targeted analysis of potentially contradictory verses. Among the clustering approaches tested, keyword similarity-based clustering using KeyBERT produced the highest MatchingScore, 0.6111. To overcome the scarcity of labeled data, we employed a multi-step strategy of manual annotation, generative AI-based data augmentation, and self-training. The contradiction detection model was built on the XLM-RoBERTa architecture and trained on a TPU v2 with a batch size of 64. It achieved a recall of 0.978, a precision of 0.9356, an accuracy of 0.982, and an F1-score of 0.9566, with each epoch completing in 82 seconds. This integrated approach reduces the complexity of contradiction detection in legislative documents while maintaining high accuracy and robustness.
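The document-matching step described in the abstract can be illustrated with a minimal sketch. It assumes that keywords have already been extracted for each document (for example with KeyBERT) and uses Jaccard overlap with an arbitrary threshold to decide which documents are worth comparing. The function names, the threshold, and the sample keywords are hypothetical; the study's actual similarity measure and MatchingScore computation are not specified here.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_documents(
    keywords: dict[str, set[str]], threshold: float = 0.3
) -> list[tuple[str, str, float]]:
    """Return document pairs whose keyword overlap reaches the threshold.

    `keywords` maps a document id to a keyword set that is assumed to have
    been extracted beforehand; the threshold value is illustrative only.
    """
    pairs = []
    for id_a, id_b in combinations(sorted(keywords), 2):
        score = jaccard(keywords[id_a], keywords[id_b])
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    # Highest-overlap pairs first, so the most related documents come first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical keyword sets for three legal documents.
docs = {
    "law_a": {"tax", "income", "rate", "exemption"},
    "law_b": {"tax", "income", "deduction", "rate"},
    "law_c": {"forest", "permit", "land"},
}
matched = match_documents(docs)
```

Only the pairs returned by `match_documents` would then be passed to the contradiction classifier, which is how matching keeps the number of verse comparisons manageable.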

Copyright © 2025






Journal Info

Abbrev

janapati

Subject

Computer Science & IT, Education, Engineering

Description

Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI) is a collection of scientific articles in the broad field of Informatics / ICT Education and in the field of Information Technology, published and managed by Jurusan Pendidikan Teknik Informatika, Fakultas Teknik dan Kejuruan, Universitas ...