Garuda - Garba Rujukan Digital

p-Index (2021-2026): 0.23

This author has published in these journals:

Jurnal Teknik Informatika (JUTIF)

Purwanti, Wahyu Noviani

Unknown Affiliation

Author-ID : 9966159

Computer Science & IT

Published: 1 document

Articles

Evaluating SMOTE Performance for Imbalanced Multi-Label Sentiment Classification in MLSE Usability Testing of Mobile App Reviews
Authors: Basri, Hasan; Purwanti, Wahyu Noviani; Alparisi, Ihsan
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 2 (2026): JUTIF Volume 7, Number 2, April 2026
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.2.5351

Imbalanced data poses a significant challenge in multi-label classification, especially when sentiment analysis is combined with usability testing of mobile application reviews. This study investigates how effectively the Synthetic Minority Over-sampling Technique (SMOTE) improves classification performance on a multi-label dataset of 10,000 Indonesian-language user reviews from the Google Play Store. The classification labels combine usability criteria with sentiment polarity, and several classes are strongly imbalanced. Three machine learning algorithms (SVM, Decision Tree, and Random Forest) were evaluated on datasets of increasing size (1,000 to 10,000 entries), each under both original and SMOTE-balanced conditions, using stratified 10-fold cross-validation with accuracy and F1-score as the primary metrics. The experimental results show that SMOTE significantly improves Decision Tree performance, mainly on smaller datasets, with inconsistent gains as the dataset grows; provides modest, stable improvements for Random Forest; and degrades SVM, which performs consistently better without SMOTE. The study concludes that SMOTE is not a universally effective solution and must be applied selectively based on model characteristics. These findings contribute to the Machine Learning for Software Engineering (ML4SE) domain and the field of informatics by highlighting the importance of aligning resampling techniques with algorithmic behaviour when dealing with highly imbalanced multi-label text classification tasks.
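For readers unfamiliar with the resampling step named in the abstract, the core idea of SMOTE can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the authors' implementation: the abstract does not say which library or settings were used, and the function names `smote` and `balance`, the neighbour count `k=5`, and the toy two-dimensional data are all assumptions made here. It also assumes each review has already been turned into a numeric feature vector and that each usability/sentiment combination is treated as a single class label.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every smaller class up to the size of the largest class."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < target:
            extra = smote(rows, target - len(rows), k=k, seed=seed)
            X_out.extend(extra)
            y_out.extend([label] * len(extra))
    return X_out, y_out


# Toy data (hypothetical): six "neg" points and two "pos" points.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.1, 0.1], [0.2, 0.3], [5.0, 5.0], [5.2, 5.1]]
y = ["neg"] * 6 + ["pos"] * 2

X_bal, y_bal = balance(X, y)
print(len(X_bal), y_bal.count("neg"), y_bal.count("pos"))
```

When oversampling is combined with cross-validation, a common precaution is to balance only the training portion of each fold so that synthetic points never leak into the held-out data. The abstract does not state how the study handled this.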

Co-Authors: Alparisi, Ihsan; Basri, Hasan
