Jurnal Riset Informatika
Vol. 7 No. 4 (2025): September 2025

Active Learning Query by Committee Labeling Method to Increase Accuracy and Efficiency of Sentiment Analysis Classification

Dipa Anasta Iskandar
R. Mohamad Atok



Article Info

Publish Date
12 Sep 2025

Abstract

This study proposes the Query by Committee (QBC) labeling method to improve the accuracy of classification models, specifically XLM-RoBERTa, and to increase labeling efficiency compared with manual, supervised labeling, which generally requires more time and resources. The dataset consists of unannotated reviews of healthcare-industry applications scraped from Google Play. Six labeling strategies were evaluated, each supplying the training labels for an XLM-RoBERTa model fine-tuned under identical hyperparameter settings: Rating-based labeling, Lexicon-based labeling, QBC for Rating-VADER labeling, QBC for Rating-Pseudo labeling, QBC for VADER-Pseudo labeling, and triplet QBC for Rating-Pseudo-VADER labeling. Each labeled dataset was split using stratified random sampling, and class weights were set to “auto” during training to address label imbalance. All models were subsequently tested on the IndoNLU SmSA test dataset, and their performance was compared in terms of accuracy, precision, recall, and F1-score. Results indicate that the triplet QBC approach (combining Rating, VADER, and Pseudo labeling) outperformed all other methods, achieving an accuracy of 91.4%, a precision of 91.28%, a recall of 91.4%, and an F1-score of 91.21%. These findings demonstrate that the QBC labeling method can serve as an effective and efficient alternative to manual annotation for similar classification tasks.
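To illustrate the committee idea behind the QBC labeling strategies, the following is a minimal sketch and not the authors' implementation. The abstract does not state how committee votes are combined, so the strict-majority rule, the star-rating thresholds in `rating_to_label`, the function names, and the sample reviews are all assumptions made for illustration only.

```python
from collections import Counter
from typing import Optional, Sequence


def rating_to_label(stars: int) -> str:
    """Hypothetical rating rule (an assumption): 1-2 negative, 3 neutral, 4-5 positive."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def committee_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label backed by a strict majority of the committee.

    Returns None when no label has a strict majority, so the review can be
    dropped or routed to manual annotation (an assumed policy).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Made-up example reviews: a star rating plus labels from two other labelers.
reviews = [
    {"stars": 5, "vader": "positive", "pseudo": "positive"},
    {"stars": 1, "vader": "neutral", "pseudo": "negative"},
    {"stars": 3, "vader": "positive", "pseudo": "negative"},
]

# Triplet committee: Rating, VADER, and Pseudo labels vote on each review.
labels = [
    committee_label([rating_to_label(r["stars"]), r["vader"], r["pseudo"]])
    for r in reviews
]
print(labels)
```

With a two-member committee (for example Rating-VADER), the same rule keeps a label only when both members agree, since one vote out of two is not a strict majority.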

Copyrights © 2025






Journal Info

Abbrev

jri

Publisher

Subject

Computer Science & IT

Description

Jurnal Riset Informatika is a journal published by Kresnamedia Publisher. It was originally intended to host scientific papers written by researchers and lecturers from the study programs of Information Systems and Teknik ...