Hassan Adlan, Amel Zulfukar
Unknown Affiliation

Published: 1 Document

Articles

Found 1 Document
Failure Mode Analysis of Machine Learning Models in Realistic Data Deployment Scenarios Meng Cheng, Lau; Hassan Adlan, Amel Zulfukar
International Journal of Advances in Artificial Intelligence and Machine Learning, Vol. 3 No. 1 (2026)
Publisher : CV Media Inti Teknologi

DOI: 10.58723/ijaaiml.v3i1.651

Abstract

Background: Machine learning models frequently demonstrate strong performance under controlled benchmark evaluations. However, such evaluations often fail to capture hidden vulnerabilities that emerge under realistic deployment conditions. In real-world environments, models are exposed to stressors such as label corruption, feature noise, distributional shifts, and operational constraints, including reduced computational precision and increased latency. These conditions can induce performance degradation and structural instability, highlighting the need for a systematic robustness evaluation framework that goes beyond conventional accuracy metrics.

Aims: This paper introduces a formalized Failure Mode Analysis Protocol (FMAP) for evaluating machine learning model robustness under realistic operational stressors. The study reconceptualizes robustness evaluation as a distribution-based process, in which deployment itself generates a new data distribution over time.

Methods: The proposed FMAP framework evaluates model behavior under progressively adverse conditions, including symmetric label corruption, additive feature noise, distributional shifts, and operational constraints such as reduced numerical precision and increased inference latency. Experiments were conducted across diverse tabular and image benchmark datasets using representative model architectures, including linear models, ensemble methods, margin-based models, and deep neural networks.

Results: The experiments reveal distinct robustness profiles across model architectures when exposed to escalating stress conditions. Operational constraints and compositional limitations were shown to induce measurable degradation patterns, including instability and output collapse under extreme stress. The findings demonstrate that model failure is not solely a function of predictive accuracy loss but is closely linked to operational constraints and evolving distributional conditions. The distribution-based evaluation framework effectively captures both early-stage degradation and full failure transitions.

Conclusion: This study establishes a structured protocol for analyzing machine learning failure modes under realistic deployment scenarios. By framing robustness evaluation as a distribution-based process, the FMAP approach provides a systematic method for identifying operational risks and structural vulnerabilities.
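Two of the stressors named in the Methods section, symmetric label corruption and additive feature noise, can be illustrated with a minimal sketch. This is not the paper's actual protocol; the helper names, severity schedule, and choice of a logistic-regression model on synthetic data are assumptions made for illustration. It trains a model on increasingly corrupted labels and evaluates it on increasingly noisy features, tracking the resulting accuracy degradation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def corrupt_labels(y, rate, n_classes, rng):
    """Symmetric label corruption: a fraction `rate` of labels is
    replaced with a class drawn uniformly at random."""
    y = y.copy()
    flip = rng.random(len(y)) < rate
    y[flip] = rng.integers(0, n_classes, flip.sum())
    return y

def add_feature_noise(X, sigma, rng):
    """Additive feature noise: zero-mean Gaussian perturbation
    with standard deviation `sigma` applied to every feature."""
    return X + rng.normal(0.0, sigma, X.shape)

# Synthetic tabular benchmark (stand-in for the paper's datasets).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Progressively adverse conditions: (label corruption rate, noise sigma).
results = []
for rate, sigma in [(0.0, 0.0), (0.1, 0.5), (0.3, 1.0), (0.5, 2.0)]:
    model = LogisticRegression(max_iter=1000)
    model.fit(X_tr, corrupt_labels(y_tr, rate, 2, rng))
    acc = accuracy_score(y_te, model.predict(add_feature_noise(X_te, sigma, rng)))
    results.append((rate, sigma, acc))
    print(f"label_rate={rate:.1f} noise_sigma={sigma:.1f} -> acc={acc:.3f}")
```

Plotting accuracy against stress severity in this way yields the kind of per-architecture robustness profile the abstract describes; swapping in other model classes (ensembles, margin-based models, neural networks) would let one compare their degradation curves under identical stress schedules.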