General Background: The rapid growth of big data and the increasing complexity of financial transactions have challenged traditional audit risk assessment methods that rely on sampling and professional judgment. Specific Background: Machine learning and artificial intelligence have recently been adopted in auditing to improve the classification of audit risk in large-scale financial environments. Knowledge Gap: Previous studies remain limited by small datasets, single-model analysis, and insufficient integration of interpretability, robustness testing, and sensitivity analysis in audit risk classification. Aims: This study aims to develop and compare machine learning models for audit risk classification using a dataset of 10,000 audit transactions with 28 audit-related attributes in a big data environment. Results: Tree-based and nonlinear models outperformed traditional linear models. The Decision Tree model achieved the highest classification accuracy, 99.07%, while Logistic Regression reached only 68.13%. Feature importance analysis showed that three attributes (variance percentage, supporting documents, and prior issues) together accounted for 57.42% of the model's predictive capability. Repeated validation produced an average accuracy of 98.53% with a low variation of ±0.26%, indicating that the model is stable and robust. Sensitivity analysis also showed that the model responded strongly to key audit risk indicators. Novelty: This study integrates multiple machine learning models, feature importance evaluation, robustness validation, and sensitivity analysis within a unified audit analytics framework. Implications: The proposed framework provides a practical and interpretable approach for intelligent audit systems, supporting accurate audit risk classification, improved resource allocation, and data-driven auditing practices in big data environments.
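The repeated-validation result above is reported as a mean accuracy with a ± spread. A minimal sketch of how such a figure can be computed is given below. Everything in it is an assumption for illustration: the synthetic data, the single feature name `variance_pct`, the one-threshold rule (a decision stump standing in for the study's Decision Tree), the 5% label noise, and the 80/20 repeated holdout scheme. It does not reproduce the study's dataset, pipeline, or reported numbers.

```python
import random
import statistics


def make_synthetic_transactions(n, rng):
    """Hypothetical data: one feature (variance_pct) and a binary risk label."""
    rows = []
    for _ in range(n):
        variance_pct = rng.uniform(0.0, 30.0)
        # Assumed rule: high variance means high risk, with 5% label noise.
        label = int(variance_pct > 15.0)
        if rng.random() < 0.05:
            label = 1 - label
        rows.append((variance_pct, label))
    return rows


def fit_stump(train):
    """Pick the integer threshold on variance_pct with the best training accuracy."""
    best_threshold, best_correct = 0.0, -1
    for threshold in range(0, 31):
        correct = sum(int(x > threshold) == y for x, y in train)
        if correct > best_correct:
            best_threshold, best_correct = float(threshold), correct
    return best_threshold


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the thresholded prediction."""
    return sum(int(x > threshold) == y for x, y in rows) / len(rows)


def repeated_holdout(rows, repeats, test_fraction, seed):
    """Reshuffle, split, fit, and score the stump `repeats` times."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * (1 - test_fraction))
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(accuracy(fit_stump(train), test))
    return scores


data = make_synthetic_transactions(1000, random.Random(0))
scores = repeated_holdout(data, repeats=10, test_fraction=0.2, seed=1)
mean_acc = statistics.mean(scores)
std_acc = statistics.stdev(scores)  # sample standard deviation across repeats
print(f"mean accuracy = {mean_acc:.4f} +/- {std_acc:.4f}")
```

A small spread across repeats suggests that the score does not depend heavily on one particular train/test split, which is the sense in which a ± figure supports a stability claim. Whether the study's ±0.26% is a standard deviation or another measure of variation is not stated in the abstract.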
Highlights:
• Decision tree modeling achieved 99.07% classification accuracy on 10,000 audit transactions.
• Variance percentage, supporting documents, and prior issues represented 57.42% of predictive capability.
• Repeated validation confirmed robust predictive consistency with 98.53% average accuracy and minimal variation.
Keywords: Audit Risk Classification, Decision Tree, Machine Learning, Big Data Analytics, Audit Quality
Copyrights © 2026