Garuda - Garba Rujukan Digital

KLIK: Kajian Ilmiah Informatika dan Komputer

Vol. 4 No. 5 (2024): April 2024

Singgalen, Yerik Afrianto (Unknown)

Publish Date
30 Apr 2024

This research investigates the efficacy of employing the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework to analyze sentiment classification models. The study focuses on evaluating the performance of Decision Trees (DT) and Support Vector Machine (SVM) models integrated with the Synthetic Minority Over-sampling Technique (SMOTE) across various performance metrics, including accuracy, precision, recall, f-measure, and Area Under the Curve (AUC). Using CRISP-DM, the research ensures a systematic data preprocessing, modeling, and evaluation approach. The findings reveal that both DT and SVM models with SMOTE achieve high accuracy rates, with DT yielding an accuracy of 98.37% +/- 0.48% and SVM achieving 98.91% +/- 0.59%. These models effectively distinguish between positive and negative sentiments, as precision, recall, and f-measure scores indicate. Additionally, the AUC scores underscore the robustness of the models in sentiment analysis tasks. These results highlight the potential of CRISP-DM as a structured methodology for sentiment classification research, providing insights into the performance of different machine learning algorithms in handling imbalanced datasets. Based on these findings, it is recommended that future studies further explore the application of CRISP-DM in sentiment analysis tasks and investigate the scalability of DT and SVM models with SMOTE in larger datasets.

Citation Download

EndNote, Reference Manager, ProCite

Latex, Jabref

Check in Google Scholar

Journal Info

KLIK: Kajian Ilmiah Informatika dan Komputer

Website

Abbrev

klik

Publisher

Sekolah Tinggi Manajemen Informatika dan Komputer Budi Darma

Subject

Computer Science & IT

Description

Topik utama yang diterbitkan mencakup: 1. Teknik Informatika 2. Sistem Informasi 3. Sistem Pendukung Keputusan 4. Sistem Pakar 5. Kecerdasan Buatan 6. Manajemen Informasi 7. Data Mining 8. Big Data 9. Jaringan Komputer 10. Dan lain-lain (topik lainnya yang berhubungan dengan Teknologi Informati dan ...

Article Info

Abstract

Comparative Analysis of DT and SVM Model Performance with SMOTE in Sentiment Classification

Article Info

Abstract