Gunardi Gunardi
Universitas Dinamika Bangsa Indonesia

Published: 2 documents


Improving Bioethanol Sentiment Analysis Performance using SMOTE in Machine Learning Model Comparison
Rajhu Ilham Pradana; Jasmir Jasmir; Gunardi Gunardi
Sistemasi: Jurnal Sistem Informasi, Vol 15, No 5 (2026)
Publisher: Program Studi Sistem Informasi Fakultas Teknik dan Ilmu Komputer

DOI: 10.32520/stmsi.v15i5.6300

Abstract

Sentiment analysis of public policies on social media is crucial for government evaluation; however, it is often challenged by highly imbalanced datasets. This study aims to address this issue through a case study on public sentiment toward bioethanol fuel policies on YouTube, where the cleaned dataset after preprocessing consisted of 2,409 comments dominated by negative sentiment (1,430 comments), followed by neutral sentiment (734 comments), and only a small number of positive sentiments (245 comments). The performance of classical Machine Learning (ML) models was severely degraded due to this imbalance, particularly in detecting the minority class. This study applied TF-IDF weighting for feature extraction, followed by the Synthetic Minority Oversampling Technique (SMOTE) to balance the training data (1,927 samples) before comparing the performance of three ML algorithms: Logistic Regression, Support Vector Machine (SVM), and LightGBM. The evaluation results on the testing dataset (482 samples) demonstrate that the implementation of SMOTE significantly improved the models’ ability to recognize the “Positive” class. The LightGBM model combined with SMOTE achieved the best performance, with an accuracy of 64.11%. In particular, the application of SMOTE successfully increased the minority-class F1-score from a baseline of 18.18% to 35.29%. These findings confirm that handling imbalanced data is a critical step in producing valid and reliable sentiment analysis results.
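The core idea behind SMOTE, as named in the abstract, is to create synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that interpolation step only, not the authors' implementation: the function name `smote_sample`, the toy two-dimensional points, and the parameters `k`, `n_new`, and `seed` are all illustrative assumptions. In the study the inputs would be TF-IDF vectors of training comments rather than toy points.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Return n_new synthetic points built by SMOTE-style interpolation.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    (Hypothetical helper for illustration; names and defaults are assumptions.)
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy minority-class feature vectors (made up for the example).
minority = [[0.0, 1.0], [0.2, 0.8], [0.1, 0.9], [0.9, 0.1]]
new_points = smote_sample(minority)
print(new_points)
```

Oversampling of this kind is normally applied to the training split only, as the abstract describes, so that the test split still reflects the original class distribution.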
Comparison of Machine Learning Algorithms for Credit Score-based Banking Customer Churn Prediction
Suryadillah Hendrawinata; Jasmir Jasmir; Gunardi Gunardi
Sistemasi: Jurnal Sistem Informasi, Vol 15, No 5 (2026)
Publisher: Program Studi Sistem Informasi Fakultas Teknik dan Ilmu Komputer

DOI: 10.32520/stmsi.v15i5.6148

Abstract

A high customer churn rate represents a significant challenge for the banking industry, leading to substantial financial losses and higher acquisition costs for new customers. Proactively identifying customers who are likely to churn is essential for implementing effective retention strategies. This study aims to address this issue by implementing and comparing three machine learning classification algorithms: Logistic Regression, Random Forest, and XGBoost. The study utilized a secondary dataset of 10,000 bank customer profiles with various characteristics, including credit scores, account balances, and transaction activities. The research methodology followed the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework. The models were evaluated using several metrics, including Accuracy, Precision, Recall, F1-Score, and ROC-AUC. The findings indicate that the ensemble models significantly outperformed the linear model (Logistic Regression), which achieved an F1-Score of only 0.286. Random Forest emerged as the best-performing model in this study, achieving the highest Accuracy (0.864), F1-Score (0.590), and ROC-AUC (0.852). In comparison, XGBoost demonstrated competitive performance with an F1-Score of 0.579 and a ROC-AUC of 0.832. The study concludes that Random Forest provides the best overall performance, offering the strongest capability for identifying at-risk customers within the dataset.
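Accuracy, Precision, Recall, and F1-Score can all be derived from the four cells of a binary confusion matrix, which also shows why a churn model can have high accuracy but a modest F1-Score when churners are the minority. The sketch below is illustrative only: the function name `classification_metrics` and the confusion-matrix counts are hypothetical and are not taken from the study. ROC-AUC is omitted because it requires predicted scores rather than counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "churn" treated as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for an imbalanced test set (200 churners out of 1,000).
m = classification_metrics(tp=50, fp=25, fn=150, tn=775)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is 0.825 while the F1-Score is only about 0.364, because three quarters of the churners are missed. This is the same kind of gap the abstract reports between accuracy and F1-Score.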