TIERS Information Technology Journal
Vol. 6 No. 2 (2025)

Optimization CatBoost using GridSearchCV for Sentiment Analysis Customer Reviews in Digital Transportation Industry

Ifriza, Yahya Nur
Sanusi, Ratna Nur Mustika
Febriyanto, Hendra
Kamaruddin, Azlina



Article Info

Publish Date
30 Dec 2025

Abstract

The rapid expansion of ride-hailing services has generated a massive volume of user feedback, making automated sentiment analysis essential for understanding customer satisfaction. This study classifies public sentiment towards the Uber application into positive, neutral, and negative categories using CatBoost, a gradient boosting algorithm chosen for its Ordered Boosting mechanism, which helps prevent overfitting and improves generalization. Although the text is represented numerically with TF-IDF, CatBoost was selected for its strong performance on heterogeneous datasets compared with other boosting frameworks such as XGBoost and LightGBM. The dataset comprises 12,000 customer reviews collected from the Google Play Store between January and March 2024 using web scraping and uploaded to Kaggle. The data underwent preprocessing, including lemmatization and TF-IDF vectorization, to structure the textual features. To maximize model performance, hyperparameter optimization was conducted using GridSearchCV. The experimental results show that optimization improved the model's generalization, raising accuracy from 0.907 to 0.910 and the F1-score from 0.893 to 0.897. Most notably, the AUC increased from 0.949 to 0.957, indicating a better ability to distinguish between sentiment classes. However, while the model exhibited high precision in identifying positive and negative polarities, analysis of the confusion matrix revealed limitations in correctly predicting the neutral class, suggesting challenges related to class imbalance. These findings indicate that an optimized CatBoost model is a robust tool for sentiment classification, though future work is recommended to address minority class detection.
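The abstract's point about the neutral class can be illustrated with a small sketch. It shows how overall accuracy can stay high while recall on a minority class remains low. The confusion-matrix counts, the label order, and the function names below are hypothetical, chosen only for illustration. They are not the study's results or its implementation.

```python
# Illustrative sketch only: the counts below are invented for demonstration
# and are NOT taken from the study.
LABELS = ["negative", "neutral", "positive"]

# Rows are true classes, columns are predicted classes (hypothetical counts).
CONFUSION = [
    [450, 10, 40],    # true negative
    [60, 20, 70],     # true neutral (minority class)
    [30, 10, 1310],   # true positive
]


def per_class_metrics(matrix):
    """Return {class_index: (precision, recall, f1)} for a square confusion matrix."""
    n = len(matrix)
    metrics = {}
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[k] = (precision, recall, f1)
    return metrics


def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def macro_f1(matrix):
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = per_class_metrics(matrix)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


if __name__ == "__main__":
    print(f"accuracy = {accuracy(CONFUSION):.3f}")
    print(f"macro F1 = {macro_f1(CONFUSION):.3f}")
    for k, (p, r, f1) in per_class_metrics(CONFUSION).items():
        print(f"{LABELS[k]:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With counts like these, accuracy is dominated by the majority class, whereas macro-averaged F1 and per-class recall expose the weak neutral predictions. This is one reason per-class reporting is useful on imbalanced review data.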

Copyrights © 2025






Journal Info

Abbrev

tiers

Subject

Computer Science & IT

Description

TIERS Information Technology Journal publishes research articles and literature reviews in Information Technology, covering Information Systems, Artificial Intelligence, Internet of Things, Big Data, e-commerce, Financial Technology, Business ...