This research aims to predict taxpayer compliance (compliant or non-compliant) using 2,167 rows of tax data. The CRISP-DM (Cross-Industry Standard Process for Data Mining) framework guided the process because it provides a structured workflow. Five machine learning algorithms were compared: Naive Bayes, Support Vector Machine (SVM), Decision Tree, Logistic Regression, and Deep Learning, all trained and tested in RapidMiner. To improve prediction accuracy, the predictions of these algorithms were combined with a majority voting ensemble, which is among the simplest and most efficient ensemble methods. The ensemble was implemented and evaluated in Python on Google Colab to validate its performance on new data, and it delivered more stable accuracy than the individual models. This research contributes to tax data management: policy makers in particular can use this technology to make the monitoring and evaluation of taxpayer compliance more efficient. It also underscores the importance of exploring a range of machine learning algorithms, ensembles, and parameter settings to produce effective solutions in the field of taxation.
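The majority voting step described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the function name `majority_vote`, the dictionary keys, and the per-model predictions are all hypothetical, made-up values chosen only to show how five label predictions per taxpayer reduce to one ensemble label.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by hard majority vote.

    `predictions` is a list of equal-length label lists, one per model.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*predictions) yields, for each sample, the labels from every model.
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions for four taxpayers (illustrative values only).
model_predictions = {
    "naive_bayes": ["compliant", "non-compliant", "compliant", "compliant"],
    "svm": ["compliant", "non-compliant", "non-compliant", "compliant"],
    "decision_tree": ["non-compliant", "non-compliant", "compliant", "compliant"],
    "logistic_regression": ["compliant", "compliant", "non-compliant", "compliant"],
    "deep_learning": ["compliant", "non-compliant", "non-compliant", "non-compliant"],
}

ensemble = majority_vote(list(model_predictions.values()))
print(ensemble)
```

With five voters and two classes, a tie cannot occur, so no tie-breaking rule is needed in this setting; an even number of models or more than two classes would require one.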
Copyrights © 2025