Jurnal Sisfokom (Sistem Informasi dan Komputer)
Vol. 15 No. 02 (2026): MAY

Dropout Prediction Using KNN, Decision Tree, Naive Bayes, and Ensemble Learning: A Comparative Performance Analysis with Synthetic Data Validation

Norma Puspitasari (Department of Software Engineering Technology, Indonusa Polytechnic)
Mochammad Agung Wibowo (Department of Software Engineering Technology, Indonusa Polytechnic)
Budi Warsito (Department of Software Engineering Technology, Indonusa Polytechnic)

Article Info

Publish Date
01 Apr 2026

Abstract

Student dropout is a critical issue in higher education because it affects institutional performance, resource allocation, and student success. Identifying students at high risk of dropping out early enables institutions to design timely academic and non-academic interventions. However, predicting dropout is challenging because the influencing factors are complex and educational data are class-imbalanced. This study presents a comparative performance analysis of four machine learning algorithms, namely K-Nearest Neighbor (KNN), Decision Tree (DT), Naive Bayes (NB), and an Ensemble Weighted Voting classifier, to support the development of an effective dropout prediction model. Because access to complete non-dropout student records was restricted, the study uses real institutional withdrawal data from 2023–2024 to calibrate dropout characteristics and employs a transparently generated synthetic dataset for methodological validation. The dataset consists of 300 instances, and class imbalance is addressed with SMOTE (Synthetic Minority Over-sampling Technique). Model performance is evaluated using accuracy, precision, recall, F1-score, and AUC. On the synthetic validation data, the ensemble model outperforms the individual classifiers, achieving an accuracy of 0.97, precision of 1.00, recall of 0.86, F1-score of 0.92, and AUC of 0.93. These findings highlight the potential of ensemble learning as a robust approach for early-warning systems in higher education while providing a transparent framework for predictive modeling under data-access constraints.
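
As a rough illustration of the weighted voting idea named in the abstract, the sketch below averages the dropout probabilities of three base classifiers using per-model weights. It assumes soft (probability-based) voting; the abstract does not say whether the study uses soft or hard voting. The function names, probabilities, and weights are hypothetical and are not taken from the study. The last line only checks that the reported F1-score is consistent with the reported precision and recall.

```python
# Hypothetical sketch of weighted soft voting; not the authors' implementation.
# All probabilities and weights below are made-up illustrative values.

def weighted_vote(probas, weights):
    """Return the weighted average of per-model dropout probabilities."""
    total = sum(weights)
    return sum(p * w for p, w in zip(probas, weights)) / total

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Dropout probabilities for one student from KNN, DT, and NB (assumed values).
probas = [0.60, 0.80, 0.40]
# Assumed weights, e.g. proportional to each model's validation performance.
weights = [1.0, 2.0, 1.0]

p_dropout = weighted_vote(probas, weights)
label = "dropout" if p_dropout >= 0.5 else "non-dropout"

# Consistency check using the precision (1.00) and recall (0.86) from the abstract.
reported_f1 = round(f1_score(1.00, 0.86), 2)
```

With these assumed inputs the Decision Tree carries twice the weight of the other two models, so its higher probability pulls the combined score above the 0.5 threshold. The F1 check reproduces the 0.92 stated in the abstract.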

Copyright © 2026

Journal Info

Abbrev

sisfokom

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

Jurnal Sisfokom is short for Jurnal Sistem Informasi dan Komputer (Journal of Information Systems and Computers). The journal is a collaboration between the academic community of STMIK Atma Luhur and other colleges and universities in Indonesia. It publishes scientific articles by researchers, academics, and IT observers. The journal ...