Journal : Jurnal Teknik Informatika (JUTIF)

Improving Term Deposit Customer Prediction Using Support Vector Machine with SMOTE and Hyperparameter Tuning in Bank Marketing Campaigns
Abidin, Dodo Zaenal; Rosario, Maria; Sadikin, Ali; Nurhadi, Nurhadi; Jasmir, Jasmir
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 3 (2025): JUTIF Volume 6, Number 3, Juni 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.3.4585

Abstract

Identifying potential customers for term deposit products remains a challenge in the banking industry due to class imbalance in marketing datasets. This study proposes an integrated approach that combines Support Vector Machine (SVM) with the Synthetic Minority Oversampling Technique (SMOTE) and hyperparameter tuning via GridSearchCV to enhance prediction performance. The dataset comprises 45,211 records containing demographic and campaign-related features. Preprocessing steps include categorical encoding, feature scaling, and SMOTE-based resampling. The optimized SVM model achieves an accuracy of 91% and an AUC of 0.96, outperforming the baseline model and demonstrating strong discriminatory ability, particularly for the minority class. This method improves the balance between precision and recall while reducing bias toward the majority class. The findings confirm the effectiveness of combining SMOTE and SVM for imbalanced classification tasks in the financial domain. These results contribute to the advancement of applied machine learning in informatics, particularly in developing robust decision support systems for data-driven banking strategies. Future work may extend this approach to diverse datasets and explore advanced resampling or ensemble techniques to improve model generalization.
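The SMOTE resampling step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote_oversample`, the toy points, and the choice to pick base samples at random are assumptions made for illustration. It shows only the core idea of creating a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbors.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbors.

    Hypothetical helper for illustration; `minority` is a list of
    equal-length numeric tuples.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base point (excluding itself)
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbor = minority[rng.choice(neighbors)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy minority-class points (assumed data, not from the study)
minority_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_oversample(minority_points, n_synthetic=6, k=2, seed=42)
```

In a full pipeline, resampling of this kind is normally applied to the training split only, so that synthetic points never leak into the data used for evaluation.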
A Comprehensive Benchmarking Pipeline for Transformer-Based Sentiment Analysis using Cross-Validated Metrics
Abidin, Dodo Zaenal; Afuan, Lasmedi; Toscany, Afrizal Nehemia; Nurhadi, Nurhadi
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 4 (2025): JUTIF Volume 6, Number 4, Agustus 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.4.4894

Abstract

Transformer-based models have significantly advanced sentiment analysis in natural language processing. However, many existing studies still lack robust, cross-validated evaluations and comprehensive performance reporting. This study proposes an integrated benchmarking pipeline for sentiment classification on the IMDb dataset using BERT, RoBERTa, and DistilBERT. The methodology includes systematic preprocessing, stratified 5-fold cross-validation, and aggregate evaluation through confusion matrices, ROC and precision-recall (PR) curves, and multi-metric classification reports. Experimental results demonstrate that all models achieve high accuracy, precision, recall, and F1-score, with RoBERTa leading overall (94.1% mean accuracy and F1), followed by BERT (92.8%) and DistilBERT (92.1%). All models exceed 0.97 in ROC-AUC and PR-AUC, confirming strong discriminative capability. Compared to prior approaches, this pipeline enhances result robustness, interpretability, and reproducibility. The provided results and open-source code offer a reliable reference for future research and practical deployment. This study is limited to the IMDb dataset in English, suggesting future work on multilingual, cross-domain, and explainable AI integration.
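Stratified k-fold cross-validation, as named in this abstract, keeps the class proportions roughly equal in every fold. The sketch below is one simple way to build such folds and is not taken from the paper's code: the function name `stratified_kfold` and the toy labels are assumptions.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, n_splits=5, seed=0):
    """Return n_splits lists of test indices with per-class balance.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold
        # receives a similar share of that class.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# Toy balanced label set (assumed data)
labels = ["pos"] * 50 + ["neg"] * 50
folds = stratified_kfold(labels, n_splits=5, seed=7)
```

Each fold serves once as the test set while the remaining folds form the training set, and the per-fold metrics are then averaged.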
Enhancing Fake News Detection on Imbalanced Data Using Resampling Techniques and Classical Machine Learning Models
Abidin, Dodo Zaenal; Siswanto, Agus; Saputra, Chindra; Betantiyo, Betantiyo; Nehemia Toscany, Afrizal
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 5 (2025): JUTIF Volume 6, Number 5, Oktober 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.5.5177

Abstract

Class imbalance remains a critical challenge in fake news detection, particularly in domains such as entertainment media where class distributions are highly skewed. This study evaluates seven resampling configurations (Random Oversampling, SMOTE, ADASYN, Random Undersampling, Tomek Links, NearMiss, and a no-resampling baseline) applied to three classical machine learning models: Logistic Regression, Support Vector Machine (SVM), and Random Forest. Using the imbalanced GossipCop dataset comprising 24,102 news headlines, the proposed pipeline integrates TF-IDF vectorization, stratified 3-fold cross-validation, and five evaluation metrics: F1-score, precision, recall, ROC AUC, and PR AUC. Experimental results show that oversampling methods, particularly SMOTE and Random Oversampling, substantially improve minority class (fake news) detection. Among all model–resampling combinations, SVM with SMOTE achieved the highest performance (F1-score = 0.67, PR AUC = 0.74), demonstrating its robustness in handling imbalanced short-text classification. Conversely, undersampling methods frequently reduced recall, especially with ensemble models like Random Forest. This approach enhances model robustness in fake news detection on skewed datasets and contributes a reproducible, domain-specific framework for developing more reliable misinformation classifiers.
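Precision, recall, and F1-score for the minority class are central to the comparison described in this abstract. The sketch below shows how they follow from true-positive, false-positive, and false-negative counts. The function name `minority_class_metrics` and the toy predictions are assumptions, and the numbers are unrelated to the study's results.

```python
def minority_class_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class labelled `positive`.

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy imbalanced example: 3 minority (1) and 7 majority (0) items
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
precision, recall, f1 = minority_class_metrics(y_true, y_pred)
```

In this toy case eight of ten predictions are correct, yet the minority-class F1 is only about 0.67, which is why accuracy alone can be misleading on skewed data.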
K-Means Clustering with Elbow Method and Validity Indices for Classifying Student Academic Achievement Based on Knowledge Scores at SDN 48 Kota Jambi
Azmi, M. Fikri; Abidin, Dodo Zaenal; Jasmir, Jasmir
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 1 (2026): JUTIF Volume 7, Number 1, February 2026
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.1.5349

Abstract

Student performance evaluation at SDN 48 Kota Jambi has been traditionally conducted manually, which is inefficient and often subjective. This study aims to provide an objective classification of students’ academic achievement using data-driven methods. The research applies the Knowledge Discovery in Databases (KDD) framework, which involves data selection, preprocessing, clustering, and evaluation. The dataset consists of knowledge scores from 152 elementary students across seven subjects, obtained from the Merdeka Curriculum report cards. Data preprocessing included cleaning and normalization to ensure consistency. K-Means clustering was implemented using RapidMiner, with the optimal number of clusters determined through the Elbow Method. Cluster validity was assessed using the Davies–Bouldin Index (1.226) and the Silhouette Coefficient (0.245). The results produced three clusters: high achievers (30.9%), medium achievers (27.0%), and low achievers (42.1%). Centroid analysis indicated that Mathematics and Physical Education were the most discriminative subjects across groups. These findings highlight a substantial proportion of students requiring remedial intervention and support differentiated learning strategies. The contribution of this research lies in applying educational data mining techniques to an elementary school context in Jambi, integrating both quantitative indices and qualitative validation with teachers. The study demonstrates that clustering methods can enhance educational decision-making, providing a basis for adaptive teaching, targeted interventions, and resource allocation in elementary education.
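The Silhouette Coefficient cited in this abstract compares, for each point, the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b), giving (b - a) / max(a, b). The sketch below computes the mean over all points. It is an illustration under assumptions (Euclidean distance, a score of 0 for singleton clusters, the hypothetical name `silhouette_score`, toy data) and is not the RapidMiner procedure the study used.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance.

    Hypothetical helper for illustration only.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy one-dimensional data with two well-separated groups (assumed data)
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good_score = silhouette_score(toy_points, [0, 0, 1, 1])
bad_score = silhouette_score(toy_points, [0, 1, 0, 1])
```

Values close to 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster.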
Optimized RoBERTa–DeBERTa Ensemble for Multi-Class Sentiment Analysis on Highly Imbalanced Data
Sika, Xaverius; Kisbianty, Desi; Istoningtyas, Marrylinteri; Abidin, Dodo Zaenal; Toscany, Afrizal Nehemia
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 2 (2026): JUTIF Volume 7, Number 2, April 2026
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.2.5350

Abstract

Multi-class sentiment analysis on highly imbalanced datasets poses substantial challenges for achieving accurate and equitable classification, particularly when neutral sentiments are considerably underrepresented. This study evaluates four fine-tuned transformer models—Bidirectional Encoder Representations from Transformers (BERT), DistilBERT, RoBERTa, and DeBERTa—using a real-world Amazon review dataset comprising over 20,000 user-generated texts. Sentiment labels were derived from star ratings through a standardized mapping scheme. Experimental results show that while BERT achieved the highest overall accuracy (93%), its performance on the minority Neutral class remained limited (F1-score: 0.36). DeBERTa improved Neutral recall to 0.59 but with a slightly lower overall accuracy of 91%. To address this imbalance, two ensemble strategies were explored: a fixed-weight soft voting scheme and an optimized-weight ensemble combining RoBERTa and DeBERTa. The optimized RoBERTa–DeBERTa ensemble yielded the most balanced performance, achieving a Neutral-class F1-score of 0.57 while maintaining 91% overall accuracy. ROC and PR curve analyses further indicate superior sensitivity–precision balance for this optimized ensemble. The findings indicate that adaptive ensemble weighting can substantially enhance minority-class detection under severe imbalance. This study provides a clear methodological contribution by demonstrating the effectiveness of targeted ensemble optimization and offers practical guidance for developing more balanced and reliable sentiment classification systems.
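An optimized-weight soft-voting ensemble of two models, as described in this abstract, can be sketched as a weighted average of their class probabilities, with the weight chosen to maximize the F1-score of the minority class. The abstract does not state how the weights were optimized, so the grid search below is an assumption, as are the function names and the toy probabilities (classes 0 = Negative, 1 = Neutral, 2 = Positive).

```python
def soft_vote(probs_a, probs_b, weight_a):
    """Blend two models' class-probability rows; return predicted class indices."""
    preds = []
    for row_a, row_b in zip(probs_a, probs_b):
        blended = [
            weight_a * pa + (1 - weight_a) * pb for pa, pb in zip(row_a, row_b)
        ]
        preds.append(max(range(len(blended)), key=blended.__getitem__))
    return preds


def class_f1(y_true, y_pred, target):
    """F1-score of a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def best_weight(probs_a, probs_b, y_true, target, steps=10):
    """Grid-search the weight of model A that maximizes F1 on `target`."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda w: class_f1(y_true, soft_vote(probs_a, probs_b, w), target),
    )


# Toy validation data (assumed): model B is better on Neutral, model A on Positive
y_true = [1, 1, 2, 2, 0]
probs_a = [[0.1, 0.3, 0.6], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7],
           [0.1, 0.1, 0.8], [0.7, 0.2, 0.1]]
probs_b = [[0.1, 0.7, 0.2], [0.1, 0.6, 0.3], [0.1, 0.5, 0.4],
           [0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]
w_star = best_weight(probs_a, probs_b, y_true, target=1)
neutral_f1 = class_f1(y_true, soft_vote(probs_a, probs_b, w_star), 1)
```

The weight should be selected on a held-out validation split rather than on the test data, since tuning and evaluating on the same examples gives an optimistic estimate.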