Data imbalance is a common challenge in nutritional status prediction because it can reduce classification performance and affect the reliability of Explainable Artificial Intelligence (XAI) interpretations. This study examines the impact of data imbalance on the stability of interpretations produced by Local Interpretable Model-Agnostic Explanations (LIME). A Random Forest model was trained under two scenarios: on the original imbalanced dataset and on a balanced dataset generated with the Synthetic Minority Over-sampling Technique (SMOTE). Model performance was evaluated and compared across the two scenarios, followed by LIME-based interpretation and stability analysis. The results indicate that SMOTE improved the model’s ability to identify minority classes, with recall increasing from 0.36 to 0.55, although overall accuracy declined slightly. LIME analysis revealed changes in feature contributions between the two scenarios, reflecting the influence of data distribution on model explanations. The interpretation stability score reached 0.80, suggesting relatively consistent explanations despite the change in class balance. These findings highlight the importance of jointly evaluating predictive performance and interpretation stability in health-related machine learning applications.
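The abstract does not say how the interpretation stability score was computed. As an illustration only, one common choice is the mean pairwise Jaccard similarity between the top-k feature sets of several explanations of the same instance. The sketch below assumes that definition. The function names, the feature names, and the weights are all hypothetical and are not taken from the study.

```python
from itertools import combinations


def top_k_features(weights, k):
    """Return the set of the k features with the largest absolute weight."""
    ranked = sorted(weights, key=lambda name: abs(weights[name]), reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def stability_score(explanations, k=3):
    """Mean pairwise Jaccard similarity of the top-k feature sets.

    `explanations` is a list of {feature_name: weight} dicts, for example
    LIME weights for the same instance under different training scenarios.
    This definition is an assumption, not the study's stated metric.
    """
    if len(explanations) < 2:
        raise ValueError("need at least two explanations to compare")
    tops = [top_k_features(e, k) for e in explanations]
    pairs = list(combinations(tops, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical weights for one instance (made-up features and values).
imbalanced = {"age_months": 0.31, "weight_kg": -0.27, "height_cm": 0.12, "sex": 0.02}
balanced = {"age_months": 0.22, "weight_kg": -0.30, "height_cm": 0.05, "sex": 0.09}

score = stability_score([imbalanced, balanced], k=3)
print(f"stability (top-3 Jaccard): {score:.2f}")
```

Under this definition, a score of 1.0 means the same top-k features appear in every explanation, and lower values mean the ranking shifts between scenarios. Averaging such a score over many instances would give a dataset-level figure comparable in spirit to the 0.80 reported above, though the study's actual metric may differ.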
Copyright © 2026