Journal of Digital Market and Digital Currency
Vol. 1 No. 1 (2024): Regular Issue June 2024

Comparative Analysis of Sentiment Classification Techniques on Flipkart Product Reviews: A Study Using Logistic Regression, SVC, Random Forest, and Gradient Boosting

Henderi
Siddique, Quba



Article Info

Publish Date
26 May 2024

Abstract

Sentiment analysis plays a crucial role in e-commerce, providing valuable insights from customer reviews on platforms like Flipkart. This study compares the effectiveness of four sentiment classification techniques: Logistic Regression, Support Vector Classifier (SVC), Random Forest, and Gradient Boosting. The dataset, collected from Flipkart, consists of 205,052 product reviews spanning various categories. Preprocessing included handling missing values, removing duplicates, normalizing text, and applying TF-IDF vectorization for feature extraction. We implemented each algorithm and tuned its hyperparameters using grid search and randomized search. The data were split 80-20 into training and testing sets, and cross-validation was used to make the model evaluation more robust. Each model was assessed on accuracy, precision, recall, F1-score, and ROC-AUC. Logistic Regression achieved an accuracy of 0.8995, precision of 0.8773, recall of 0.8995, an F1-score of 0.8736, and a ROC-AUC of 0.9105. The SVC model showed slightly higher accuracy at 0.8997, with precision of 0.8619, recall of 0.8997, and an F1-score of 0.8738. The Random Forest model, although generally a robust method, scored lower on accuracy (0.7953), precision (0.6326), recall (0.7953), and F1-score (0.7047), yet still achieved a ROC-AUC of 0.9037. Gradient Boosting performed comparably to Logistic Regression, with an accuracy of 0.8993, precision of 0.8512, recall of 0.8993, an F1-score of 0.8735, and a ROC-AUC of 0.9098. The comparative analysis identified SVC and Logistic Regression as the top performers, balancing accuracy and computational efficiency. These findings suggest that the two models are practical choices for sentiment analysis in e-commerce, supporting better customer insights and business strategies. Future research should explore advanced deep learning techniques and address class imbalance to further refine sentiment analysis capabilities.
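The evaluation metrics named in the abstract can be made concrete with a small sketch. The abstract does not say how precision, recall, and F1-score were averaged across sentiment classes; the code below assumes support-weighted averaging, an averaging scheme under which recall necessarily equals accuracy, as it does for every model reported above. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights equal to each
    class's share of the true labels (support-weighted averaging).
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        # Precision is taken as 0 when the class is never predicted.
        prec_k = tp / predicted if predicted else 0.0
        rec_k = tp / count
        denom = prec_k + rec_k
        f1_k = 2 * prec_k * rec_k / denom if denom else 0.0
        weight = count / n
        precision += weight * prec_k
        recall += weight * rec_k
        f1 += weight * f1_k

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, made-up labels: 8 positive and 2 negative reviews, with a
# classifier that predicts "pos" for every review.
toy_true = ["pos"] * 8 + ["neg"] * 2
toy_pred = ["pos"] * 10
print(weighted_metrics(toy_true, toy_pred))
```

On the toy labels, weighted recall matches accuracy while weighted precision and F1-score fall below it, which shows how an imbalanced label distribution can separate these metrics even when accuracy looks acceptable.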

Copyrights © 2024






Journal Info

Abbrev

JDMDC

Publisher

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management; Economics, Econometrics & Finance

Description

Journal of Digital Market and Digital Currency publishes high-quality research on digital marketing, digital currencies, cryptocurrency trends, blockchain applications, and fintech innovations. Our goal is to provide a platform for researchers, practitioners, and policymakers to share innovative findings, ...