The exponential growth of e-commerce platforms has generated massive volumes of unstructured user reviews, necessitating advanced automated analysis methodologies to extract actionable insights for strategic decision-making. This study addresses multi-class text classification challenges by integrating BERTopic-based topic modeling with ensemble learning algorithms to analyze Indonesian e-commerce reviews. A dataset comprising 24,000 customer reviews from the Google Play Store underwent systematic preprocessing and topic extraction using BERTopic, yielding eight distinct thematic clusters reflecting application performance, product quality, pricing, delivery logistics, and service reliability. The dataset exhibited severe class imbalance with an imbalance ratio of 65:1, where the dominant class represented 76.02% of instances while minority classes each constituted less than 2.12%. Hybrid resampling combining undersampling and oversampling reduced the imbalance ratio to 1.4:1. TF-IDF vectorization transformed the preprocessed text into numerical features, followed by supervised classification using CatBoost and Extra Trees classifiers optimized through randomized hyperparameter search with stratified k-fold cross-validation. CatBoost demonstrated superior performance, achieving a balanced accuracy of 0.829, recall of 0.829, and AUC of 0.965, attributed to its ordered boosting mechanism and capacity for handling categorical and imbalanced data. Independent validation on 2025 data confirmed robust generalization with prediction confidence exceeding 0.90 and revealed significant temporal evolution: product-related topics emerged as dominant at 70.35%, pricing concerns increased from 6.58% to 16.57%, and application issues decreased from 76.02% to 2.51%. 
This research establishes a methodologically rigorous framework integrating unsupervised topic discovery with supervised ensemble classification, demonstrating computational efficiency while providing scalable solutions for automated review categorization.
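The imbalance ratio and the hybrid resampling step described above can be illustrated with a minimal sketch. It assumes plain random undersampling and random oversampling by duplication, which may differ from the resampling techniques actually used in the study. The function names, the `floor` and `cap` parameters, and the toy class counts are hypothetical and chosen only so that the ratio moves from 65:1 to 1.4:1.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def hybrid_resample(texts, labels, floor, cap, seed=0):
    """Naive hybrid resampling (an assumption, not the study's exact method).

    Classes larger than `cap` are randomly undersampled to `cap`;
    classes smaller than `floor` are randomly oversampled (with
    duplication) up to `floor`; all other classes are left unchanged.
    """
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) > cap:
            chosen = rng.sample(items, cap)  # undersample without replacement
        elif len(items) < floor:
            extra = rng.choices(items, k=floor - len(items))  # duplicate at random
            chosen = items + extra
        else:
            chosen = list(items)
        out_texts.extend(chosen)
        out_labels.extend([label] * len(chosen))
    return out_texts, out_labels


# Toy data (hypothetical counts, not the study's dataset).
toy_counts = {"application": 650, "delivery": 40, "pricing": 10}
labels = [name for name, n in toy_counts.items() for _ in range(n)]
texts = [f"review {i}" for i in range(len(labels))]

ratio_before = imbalance_ratio(labels)  # 650 / 10 = 65.0
new_texts, new_labels = hybrid_resample(texts, labels, floor=100, cap=140)
ratio_after = imbalance_ratio(new_labels)  # 140 / 100 = 1.4
```

In a real pipeline, resampling of this kind would normally be applied to the training folds only, so that duplicated minority reviews do not leak into the validation data.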
Copyrights © 2026