Journal : TIERS Information Technology Journal

Optimization CatBoost using GridSearchCV for Sentiment Analysis Customer Reviews in Digital Transportation Industry
Ifriza, Yahya Nur; Sanusi, Ratna Nur Mustika; Febriyanto, Hendra; Kamaruddin, Azlina
TIERS Information Technology Journal Vol. 6 No. 2 (2025)
Publisher : Universitas Pendidikan Nasional

DOI: 10.38043/tiers.v6i2.7201

Abstract

The rapid expansion of ride-hailing services has generated a massive volume of user feedback, making automated sentiment analysis essential for understanding customer satisfaction. This study classifies public sentiment towards the Uber application into positive, neutral, and negative categories using the CatBoost algorithm, a gradient boosting method chosen for its Ordered Boosting mechanism, which helps prevent overfitting and improves the model's generalization. Although the text is represented numerically with TF-IDF, CatBoost is selected for its strong performance on heterogeneous datasets compared with other boosting frameworks such as XGBoost and LightGBM. The dataset comprises 12,000 customer reviews collected from the Google Play Store between January and March 2024 using web scraping and uploaded to Kaggle. The data underwent rigorous preprocessing, including lemmatization and TF-IDF vectorization, to structure the textual features. To maximize model performance, hyperparameter optimization was conducted using GridSearchCV. The experimental results show that the optimization improved the model's generalization, raising Accuracy from 0.907 to 0.910 and the F1-Score from 0.893 to 0.897. The largest gain was in the AUC score, which increased from 0.949 to 0.957, indicating a better ability to distinguish between sentiment classes. However, while the model exhibited high precision in identifying positive and negative polarities, analysis of the confusion matrix revealed limitations in correctly predicting the neutral class, suggesting challenges related to class imbalance. These findings indicate that an optimized CatBoost model is a robust tool for sentiment classification, though future work is recommended to address minority class detection.
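
The abstract's point that overall accuracy can stay high while the neutral class is predicted poorly can be illustrated with per-class metrics derived from a confusion matrix. The following is a minimal sketch in plain Python. The matrix counts, the label order, and the function names are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted).
# These counts are illustrative only and are NOT taken from the study.
LABELS = ["negative", "neutral", "positive"]
CONFUSION = [
    [450, 5, 45],   # true negative
    [30, 10, 60],   # true neutral (minority class)
    [20, 5, 875],   # true positive
]


def accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def per_class_metrics(matrix):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


def macro_f1(matrix):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = [f1 for _, _, f1 in per_class_metrics(matrix)]
    return sum(scores) / len(scores)


print(f"accuracy = {accuracy(CONFUSION):.3f}")
for label, (p, r, f1) in zip(LABELS, per_class_metrics(CONFUSION)):
    print(f"{label:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"macro F1 = {macro_f1(CONFUSION):.3f}")
```

With these illustrative counts, accuracy is 0.89 while neutral recall is only 0.10, which is the kind of pattern that class imbalance tends to produce. Macro-averaged F1 weights each class equally and therefore exposes the weak minority class more clearly than accuracy does.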