Journal of Applied Data Sciences
Vol 6, No 4: December 2025

Stacking Ensemble with SMOTE for Robust Agricultural Commodity Price Prediction under Imbalanced Data

Siagian, Yessica
Hutahaean, Jeperson
Mulyani, Neni



Article Info

Publish Date
13 Sep 2025

Abstract

The volatility of agricultural commodity prices is a substantial obstacle in the agribusiness sector, especially for timely, data-driven decision-making. This volatility is primarily caused by the imbalanced distribution of historical price data and the complex, often nonlinear nature of price patterns. To address this challenge, this study proposes a novel predictive modeling approach that integrates Stacking Ensemble Learning with the Synthetic Minority Over-sampling Technique (SMOTE). The dataset consists of 5,558 records and 9 features, sourced from a publicly available Kaggle dataset. The target variable, daily price, was transformed into three classes (low, medium, and high) using a quartile-based discretization approach to enable multiclass classification. The main objective is to evaluate whether stacking combined with SMOTE improves model performance over baseline models that use individual algorithms. Eight models were constructed and compared: four baseline models using SMOTE only, and four stacking models integrating SMOTE. The experimental results show that the proposed model, Decision Tree Regression with Stacking and SMOTE, achieved the highest performance, with 98.68% accuracy, an F1-score of 0.9868, Cohen’s Kappa of 0.9803, MCC of 0.9803, ROC-AUC of 0.9995, and a log loss of 0.0529. Other optimized models also performed well, such as Random Forest (98.37% accuracy) and Gradient Boosting (98.56% accuracy). In contrast, baseline models without stacking, such as Linear Regression and Decision Tree, achieved only around 67–68% accuracy, with log loss exceeding 0.97. The key contribution of this study is empirical evidence that combining stacking and SMOTE significantly enhances classification accuracy and model robustness on imbalanced datasets. The novelty lies in applying a deep learning-optimized stacking framework specifically to agricultural commodity price classification, together with a comprehensive multiclass evaluation, offering new insights for practical implementation in agricultural decision support systems.
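
The two preprocessing steps named in the abstract, quartile-based discretization and SMOTE, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the cut points, so the sketch assumes that prices below the first quartile are "low", prices above the third quartile are "high", and everything in between is "medium". The oversampler is a simplified SMOTE-style interpolation written with the standard library only, and the function names, the toy prices, and the second feature are hypothetical.

```python
import random
import statistics
from collections import Counter


def discretize_by_quartiles(prices):
    """Label prices low/medium/high, assuming Q1 and Q3 are the cut points."""
    q1, _median, q3 = statistics.quantiles(prices, n=4)
    labels = []
    for price in prices:
        if price < q1:
            labels.append("low")
        elif price > q3:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


def smote_like_oversample(X, y, k=3, seed=0):
    """Grow every class to the majority size by interpolating between a
    sample and one of its k nearest neighbours from the same class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # a single sample has no neighbour to interpolate with
        for _ in range(target - count):
            i = rng.randrange(len(members))
            base = members[i]
            # Same-class neighbours ordered by squared Euclidean distance.
            neighbours = sorted(
                (m for j, m in enumerate(members) if j != i),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy data (hypothetical): 20 daily prices and one extra made-up feature.
prices = [float(p) for p in range(1, 21)]
X = [[p, p % 7] for p in prices]
y = discretize_by_quartiles(prices)
X_bal, y_bal = smote_like_oversample(X, y)

print("before:", Counter(y))
print("after: ", Counter(y_bal))
```

Under this assumed rule the medium class holds roughly half of the records, which is the kind of imbalance that oversampling is meant to correct. In a real experiment, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.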

Copyrights © 2025






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...