Journal of Business, Social and Technology
Vol. 7 No. 2 (2026): Journal of Business, Social and Technology

Comparative Analysis of Supervised Learning and Unsupervised Anomaly Detection in Security Log Analysis for Post-Incident Digital Forensic Investigation

Indramana, Iwan
Purwanto, Asto



Article Info

Publish Date
23 Apr 2026

Abstract

Background: Post-incident digital forensic investigation of large-scale security logs generated by enterprise firewalls and servers poses a range of challenges. As log data grows in volume and complexity, manual analysis is no longer feasible. Few empirical studies have directly compared supervised and unsupervised paradigms within a post-incident forensic framework on operational-scale, real-world logs.
Objective: This paper compares the classification performance of supervised and unsupervised machine learning methods for forensic analysis of security logs, and examines how well each approach prioritizes security anomalies.
Methods: We analyzed a dataset of more than 359,000 firewall and server logs collected over a 30-day period. Logistic Regression was trained on labeled events as the supervised model. For the unsupervised approach, Isolation Forest was used; it performed best among the anomaly detection models trained on normal baseline logs. Evaluation metrics included accuracy, precision, recall, ROC-AUC, and ranking-based anomaly assessment.
Results: Logistic Regression achieved an accuracy of 0.99, a ROC-AUC of 0.9998, and precision and recall of 1.00 and 0.99 for suspicious events, demonstrating near-perfect discriminability of labeled behavioral features within a 24-hour period. Isolation Forest achieved 86% overall accuracy, 93% precision, and 59% recall, and showed an excellent forensic triage property: among the top 200 anomaly-ranked entries, 197 of 200 (92.5%) were confirmed suspicious events. Sensitivity analysis of the contamination parameter showed that ranking precision at the top 200 remained stable within the 0.05 to 0.30 range (Fig. 7A, 7B), demonstrating the robustness of rank-based prioritization despite variability in global recall across contamination values.
Conclusion: Our results demonstrate high predictive performance for supervised classification and efficient forensic triage, through low false-positive rates, for unsupervised anomaly detection on both time-series logs and free-text security event logs.
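
The ranking-based assessment mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy scores, and the toy labels below are hypothetical, and the cutoff fraction is only loosely analogous to a contamination setting. The sketch shows that precision among the top k ranked entries depends only on the ordering of the anomaly scores, whereas recall depends on how many entries are flagged.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of the k highest-scoring entries that are labeled suspicious (1)."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of entries")
    # Rank entry indices from most to least anomalous.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / k


def recall_at_fraction(scores: Sequence[float], labels: Sequence[int], fraction: float) -> float:
    """Recall when the top `fraction` of entries by score is flagged as anomalous."""
    n_flagged = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged_hits = sum(labels[i] for i in ranked[:n_flagged])
    total_positives = sum(labels)
    return flagged_hits / total_positives if total_positives else 0.0


# Hypothetical toy data: higher score = more anomalous; 1 = confirmed suspicious.
scores = [0.91, 0.85, 0.80, 0.42, 0.40, 0.35, 0.30, 0.22, 0.15, 0.10]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

# Top-k precision uses only the ranking, not any flagging cutoff.
top3_precision = precision_at_k(scores, labels, k=3)

# Recall changes as the flagged fraction changes.
recalls = {f: recall_at_fraction(scores, labels, f) for f in (0.1, 0.3, 0.5)}
```

In this toy setting the top-k precision is the same whatever cutoff fraction is chosen, while recall rises as more entries are flagged. That is one way to read a stable top-200 precision alongside variable global recall.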

Copyright © 2026

Journal Info

Abbrev

jbt

Subject

Economics, Econometrics & Finance Industrial & Manufacturing Engineering Social Sciences

Description

This journal publishes research articles covering all aspects of information technology, information systems, agricultural technology, computer social and political sciences, and economics that belong to the business, social, and technological ...