Classification on imbalanced, heterogeneous datasets remains challenging in informatics, particularly in agricultural domains, where minority classes are underrepresented and redundant features degrade model performance. This research develops a stacked ensemble learning framework that integrates probabilistic and tree-based learners to address class imbalance and enhance model interpretability. The framework uses Gaussian Naïve Bayes (GNB), Multinomial Naïve Bayes (MNB), and Random Forest (RF) as base learners, with Logistic Regression as the meta-learner. Feature selection with Chi-Square and ReliefF identified the most relevant predictors, and SMOTE (Synthetic Minority Over-sampling Technique) was applied to balance the dataset. Two ensemble configurations were evaluated: Ensemble A (GNB + MNB) and Ensemble B (GNB + RF). Ensemble B achieved 97% accuracy and a macro F1-score of 0.97, a 5.7% accuracy improvement over the best individual classifier and an 18% improvement in minority-class recall. Combining probabilistic and tree-based models in a stacked architecture thus offers an interpretable and effective approach for data-driven decision systems in informatics, especially in domains that require both high accuracy and model explainability on imbalanced data.
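To make the balancing step concrete, the following is a minimal, standard-library sketch of SMOTE-style oversampling. It is not the authors' implementation. The function name `smote_like_oversample`, the toy data, and the parameter values (`k=3`, `seed=0`) are illustrative assumptions. Each synthetic point is placed on the line segment between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified SMOTE-style sketch (hypothetical helper): each synthetic point
    lies on the line segment between a randomly chosen minority sample and
    one of its k nearest minority-class neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 8 majority samples and 3 minority samples.
majority = [(float(x), float(y)) for x in range(4) for y in range(2)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 12.0)]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 8 8
```

In practice, oversampling of this kind is normally applied only to the training split, so that synthetic points do not leak into the evaluation data. A maintained implementation such as the one in the imbalanced-learn package would typically replace this sketch.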
Copyright © 2026