The distribution of funds under the Indonesia Smart Program (Program Indonesia Pintar, or PIP), a national education assistance program, is vulnerable to fraud that causes losses to the state and undermines the goal of equitable access to education. This study develops a machine learning-based predictive model to detect potential fraud in the distribution of PIP funds and compares two algorithms, Naive Bayes and Support Vector Machine (SVM). The dataset integrates PIP and DAPODIK data from 2023, supplemented with engineered features derived from patterns in audit findings. All data were preprocessed and normalized, and class imbalance was addressed with SMOTE. The models were evaluated using accuracy, precision, recall, and F1-score on both internal test data and external test data from Banten Province. The SVM with an RBF kernel and tuned parameters performed best, reaching an accuracy of up to 98.5% on test data, whereas Naive Bayes was more sensitive to shifts in data distribution when applied to new data. Features such as recipient differences, budget checks, and stakeholder proposals proved to be the leading indicators of fraud. The study emphasizes the importance of external validation and regular model updates so that fraud detection systems remain adaptive to changing data in the field. The resulting model can serve as a tool for supervision and decision-making to prevent fraud in the distribution of education funds.
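The four evaluation metrics named above can be illustrated with a minimal sketch. It is not the study's code: the function name `classification_metrics` and the toy labels are hypothetical, chosen only to show why precision, recall, and F1-score are reported alongside accuracy when fraud cases are rare.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: 5 fraud cases (label 1) among 100 records.
y_true = [1] * 5 + [0] * 95

# A model that never flags fraud still scores high accuracy but zero recall.
never_flags = classification_metrics(y_true, [0] * 100)

# A model that catches 4 of 5 fraud cases and raises 2 false alarms.
y_pred = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93
detector = classification_metrics(y_true, y_pred)
```

On this toy data the never-flagging model reaches 95% accuracy while detecting no fraud at all, which is why accuracy alone says little on imbalanced data.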
Copyrights © 2025