This research investigates the efficacy of the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework for analyzing sentiment classification models. The study evaluates Decision Tree (DT) and Support Vector Machine (SVM) models, each combined with the Synthetic Minority Over-sampling Technique (SMOTE), on accuracy, precision, recall, F-measure, and Area Under the Curve (AUC). Following CRISP-DM ensures a systematic approach to data preprocessing, modeling, and evaluation. The findings show that both models achieve high accuracy with SMOTE: DT reaches 98.37% ± 0.48% and SVM reaches 98.91% ± 0.59%. The precision, recall, and F-measure scores indicate that the models distinguish effectively between positive and negative sentiments, and the AUC scores underscore their robustness in sentiment analysis tasks. These results highlight the potential of CRISP-DM as a structured methodology for sentiment classification research and provide insight into how different machine learning algorithms handle imbalanced datasets. Future studies should further explore the application of CRISP-DM to sentiment analysis and investigate how DT and SVM models with SMOTE scale to larger datasets.
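To illustrate the oversampling step and the metrics named in this abstract, the following is a minimal sketch of SMOTE-style interpolation and of precision, recall, and F-measure in plain Python. The helper names `smote_oversample` and `precision_recall_f1`, their parameters, and the toy data are assumptions made for illustration; the abstract does not describe the study's implementation, and nothing here reproduces its results.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points between minority samples and their neighbors.

    Hypothetical helper for illustration only; not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Place the new point at a random position on the base-neighbor segment.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F-measure for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy minority class (assumed data): four points on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_oversample(minority, n_new=6)

# Toy labels and predictions (assumed data) for the metric helper.
scores = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would typically replace the hand-written helper, and oversampling is usually applied to the training folds only so that synthetic points do not leak into the evaluation data.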
Copyrights © 2024