Benito, Davian
Unknown Affiliation

Document Matching for Contradiction Detection in Low-Resource Legislative Texts With Self-Training and Augmentation Using Transformer Model
Navastara, Dini Adni; Abdillah, Surya; Benito, Davian; Adillion, Ilham Gurat; Purwitasari, Diana
Jurnal Nasional Pendidikan Teknik Informatika: JANAPATI Vol. 14 No. 2 (2025)
Publisher : Prodi Pendidikan Teknik Informatika Universitas Pendidikan Ganesha

DOI: 10.23887/janapati.v14i2.95954

Abstract

Detecting contradictions within low-resource legislative texts presents significant challenges due to limited labeled data, complex legal language, and the vast number of verses contained within legal documents. These contradictions can lead to legal ambiguities and disputes if not addressed effectively. To tackle this problem, this study proposes a comprehensive system that combines document matching with contradiction detection. Legal documents are first clustered based on contextual similarity, enabling a more targeted analysis of potentially contradictory verses. Among several clustering approaches tested, keyword similarity-based clustering using KeyBERT produced the highest MatchingScore of 0.6111. To overcome the scarcity of labeled data, we employed a multi-step strategy involving manual annotation, generative AI-based data augmentation, and self-training techniques. The contradiction detection model was developed using the XLM-RoBERTa architecture, trained on TPU V2 with a batch size of 64. The model achieved strong performance, with 0.978 recall, 0.9356 precision, 0.982 accuracy, and a 0.9566 F1-score, completing each epoch in 82 seconds. This integrated approach significantly reduces the complexity of contradiction detection in legislative documents while ensuring high accuracy and robustness.
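
The abstract's central idea, matching documents first so that only verses from related documents are compared, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes keywords have already been extracted for each document (the paper uses KeyBERT for that step) and swaps in plain Jaccard overlap of keyword sets as a stand-in similarity measure. The document names, keywords, verses, the `threshold` value, and all function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two keyword sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_documents(keywords, threshold=0.3):
    """Group documents whose keyword sets overlap at or above the threshold.

    Uses union-find, so groups are connected components (single-link).
    The threshold is an illustrative assumption, not a value from the paper.
    """
    parent = {doc: doc for doc in keywords}

    def find(doc):
        # Follow parent links to the group root, compressing the path.
        while parent[doc] != doc:
            parent[doc] = parent[parent[doc]]
            doc = parent[doc]
        return doc

    for a, b in combinations(keywords, 2):
        if jaccard(keywords[a], keywords[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for doc in keywords:
        groups.setdefault(find(doc), []).append(doc)
    return sorted(sorted(group) for group in groups.values())


def candidate_pairs(groups, verses):
    """List cross-document verse pairs, restricted to documents in the same group."""
    pairs = []
    for group in groups:
        for doc_a, doc_b in combinations(group, 2):
            for verse_a in verses[doc_a]:
                for verse_b in verses[doc_b]:
                    pairs.append((verse_a, verse_b))
    return pairs


# Hypothetical toy data: keyword sets and verse identifiers per document.
keywords = {
    "law_a": {"tax", "income", "rate"},
    "law_b": {"tax", "income", "exemption"},
    "law_c": {"forest", "permit", "land"},
}
verses = {
    "law_a": ["a1", "a2"],
    "law_b": ["b1"],
    "law_c": ["c1", "c2"],
}

groups = match_documents(keywords)
pairs = candidate_pairs(groups, verses)

# Baseline: every cross-document verse pair, with no matching step.
all_pairs = candidate_pairs([sorted(keywords)], verses)
```

In this toy setup only `law_a` and `law_b` share enough keywords to be grouped, so the contradiction detector would see 2 candidate verse pairs instead of the 8 produced by comparing every document with every other. A real pipeline would pass each surviving pair to the trained classifier.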