Tandi, Teuku Yusransyah
Unknown Affiliation

Published: 1 Document
Articles

Found 1 Document

Incorporation of IndoBERT and Machine Learning Features to Improve the Performance of Indonesian Textual Entailment Recognition
Tandi, Teuku Yusransyah; Abidin, Taufik Fuadi; Riza, Hammam
Journal of Information Systems Engineering and Business Intelligence Vol. 11 No. 2 (2025): June
Publisher : Universitas Airlangga

DOI: 10.20473/jisebi.11.2.173-186

Abstract

Background: Recognizing Textual Entailment (RTE) is a Natural Language Processing (NLP) task used in question answering, information retrieval, and fact-checking. A central problem in Indonesian NLP is how to build an RTE model that is both effective and computationally efficient. Deep learning models such as IndoBERT-large-p1 can achieve high F1-scores but require large GPU memory and very long training times, making them difficult to apply in environments with limited computing resources. Machine learning methods, on the other hand, require less computing power but deliver lower performance. The scarcity of good Indonesian datasets is a further obstacle to RTE research.

Objective: This study aimed to develop an Indonesian RTE model, called Hybrid-IndoBERT-RTE, that improves the F1-score while significantly increasing computational efficiency.

Methods: The study used the Wiki Revisions Edits Textual Entailment (WRETE) dataset of 450 sentence pairs: 300 for training, 50 for validation, and 100 for testing. The output vector generated by IndoBERT-large-p1 was combined with a feature-rich classifier, allowing the model to capture additional important features and enrich the information available to it. The classification head consisted of one input layer, three hidden layers, and one output layer.

Results: Hybrid-IndoBERT-RTE achieved an F1-score of 85% while consuming 4.2 times less GPU VRAM, and its training time was up to 44.44 times shorter than that of IndoBERT-large-p1, showing a clear gain in efficiency.

Conclusion: Hybrid-IndoBERT-RTE improved both the F1-score and the computational efficiency of the Indonesian RTE task, achieving the aims of the study. Future work is expected to focus on adding datasets and increasing their variety.

Keywords: Textual Entailment, IndoBERT-large-p1, Feature-rich classifiers, Hybrid-IndoBERT-RTE, Deep learning, Model efficiency
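The hybrid design described in the Methods section can be sketched as follows. This is a minimal illustrative forward pass only, assuming the 1024-dimensional [CLS] vector of IndoBERT-large-p1 is concatenated with a small handcrafted feature vector and fed through an MLP with one input, three hidden, and one output layer; the feature set, hidden sizes, and random weights here are assumptions, not the paper's trained configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

CLS_DIM = 1024          # hidden size of IndoBERT-large-p1 (BERT-large)
FEAT_DIM = 8            # handcrafted features, e.g. lexical overlap (assumed)
HIDDEN = [256, 64, 16]  # three hidden layers; sizes are illustrative

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Randomly initialised weights stand in for trained parameters.
dims = [CLS_DIM + FEAT_DIM] + HIDDEN + [1]
weights = [rng.normal(0.0, 0.05, (a, b)) for a, b in zip(dims[:-1], dims[1:])]
biases = [np.zeros(b) for b in dims[1:]]

def hybrid_forward(cls_vec, feats):
    """Concatenate the [CLS] embedding with extra features, then run the MLP."""
    h = np.concatenate([cls_vec, feats])
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return sigmoid(h @ weights[-1] + biases[-1])[0]  # probability-like score

score = hybrid_forward(rng.normal(size=CLS_DIM), rng.normal(size=FEAT_DIM))
print(0.0 <= score <= 1.0)
```

In this sketch the extra features enter the model only at the classifier, so the (frozen or lightly tuned) encoder does not need the full fine-tuning pass that drives up GPU memory and training time, which is the efficiency trade-off the abstract highlights.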