Anomaly detection has become an essential aspect of modern machine learning, particularly in scenarios where labeled data is scarce or unavailable. This study presents a comparative analysis of two widely used unsupervised algorithms: the One-Class Support Vector Machine (OCSVM) and Isolation Forest. Using the MNIST dataset as a benchmark, the evaluation focuses on score distribution, training time, discriminative performance measured by ROC-AUC, and sensitivity to data variations. The results reveal distinct trade-offs between the two approaches. OCSVM produces a concentrated score distribution (0.4–0.5) and achieves superior classification performance with a ROC-AUC of 0.92, a difference that is statistically significant (p < 0.05 by DeLong's test). This indicates that OCSVM is highly effective at identifying structural deviations, making it suitable for applications requiring strict data validation and reliability, such as fraud detection and critical quality control. However, this higher accuracy comes at the cost of computational efficiency, as OCSVM requires approximately 120 seconds for training. In contrast, Isolation Forest yields a more dispersed score distribution (0.3–0.7) and lower discriminative performance (ROC-AUC 0.85), but it halves training time to roughly 60 seconds. Moreover, its high sensitivity to minor variations highlights its advantage in real-time anomaly detection and on large-scale datasets where speed and adaptability are crucial. Overall, the findings indicate that OCSVM excels in precision-driven applications, while Isolation Forest is more advantageous in scenarios that demand scalability and computational efficiency. These insights provide a practical guideline for selecting an anomaly detection method according to application requirements.
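The comparison described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: it assumes scikit-learn implementations of both algorithms and substitutes a small synthetic dataset for MNIST so the snippet is self-contained; hyperparameters (`nu`, `n_estimators`) are illustrative defaults, not values from the study.

```python
# Hedged sketch: comparing OCSVM and Isolation Forest by ROC-AUC.
# Assumes scikit-learn; synthetic data stands in for MNIST.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Inliers: tight Gaussian cluster; anomalies: broad uniform noise.
X_train = rng.normal(0, 1, size=(500, 10))              # normal data only
X_test = np.vstack([rng.normal(0, 1, size=(200, 10)),   # inliers
                    rng.uniform(-6, 6, size=(20, 10))]) # anomalies
y_test = np.r_[np.zeros(200), np.ones(20)]              # 1 = anomaly

scores = {}
for name, model in [
    ("OCSVM", OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)),
    ("IsolationForest", IsolationForest(n_estimators=100, random_state=0)),
]:
    model.fit(X_train)
    # decision_function is higher for normal points, so negate it
    # to obtain an anomaly score before computing ROC-AUC.
    scores[name] = -model.decision_function(X_test)
    print(f"{name}: ROC-AUC = {roc_auc_score(y_test, scores[name]):.3f}")
```

On well-separated data like this both methods score highly; the trade-offs reported in the study (training time, score spread, sensitivity) emerge at realistic scale.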
Copyright © 2025