Waste generation in Central Java is rising with population growth and human activity, posing a serious challenge for waste management. A major obstacle in waste prediction research is missing data, which can reduce the accuracy of predictive models. This study compares three methods for handling missing values: Mean Imputation, Interpolation, and KNN Imputer. After the missing values are filled with each method, three predictive models are trained on the completed data: Random Forest, Gradient Boosting, and KNN. With Mean Imputation, Random Forest performs best, with an RMSE of 0.349. With Interpolation, Gradient Boosting performs best, with an RMSE of 0.543. With KNN Imputer, Gradient Boosting again performs best, with an RMSE of 0.188. The most effective approach in this study is therefore KNN Imputer combined with Gradient Boosting, which gives the lowest RMSE of the combinations reported and is a reasonable starting point for similar datasets.
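The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: it assumes scikit-learn and pandas, uses synthetic stand-in data (the column names `f1` to `f3`, the 10% missing rate, and all hyperparameters are hypothetical), and will not reproduce the RMSE values reported here.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the waste dataset (hypothetical features and target).
rng = np.random.default_rng(0)
X_full = pd.DataFrame(rng.normal(size=(200, 3)), columns=["f1", "f2", "f3"])
y = 2.0 * X_full["f1"] - X_full["f2"] + rng.normal(scale=0.1, size=200)

# Blank out roughly 10% of the feature values to simulate missing data.
missing_mask = rng.random(X_full.shape) < 0.10
X_missing = X_full.mask(missing_mask)


def impute(X, method):
    """Return a copy of X with missing values filled by the named method."""
    if method == "mean":
        values = SimpleImputer(strategy="mean").fit_transform(X)
    elif method == "interpolation":
        # Linear interpolation along the row order; edge gaps take the
        # nearest valid value because limit_direction="both".
        return X.interpolate(method="linear", limit_direction="both")
    elif method == "knn":
        values = KNNImputer(n_neighbors=5).fit_transform(X)
    else:
        raise ValueError(f"unknown imputation method: {method}")
    return pd.DataFrame(values, columns=X.columns, index=X.index)


# Factories so that each imputation method gets freshly initialised models.
models = {
    "Random Forest": lambda: RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": lambda: GradientBoostingRegressor(random_state=0),
    "KNN": lambda: KNeighborsRegressor(n_neighbors=5),
}

results = {}
for method in ("mean", "interpolation", "knn"):
    # Simplification: imputing before the split lets test rows influence
    # the fill values; a stricter setup would fit the imputer on train only.
    X = impute(X_missing, method)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    for name, make_model in models.items():
        model = make_model().fit(X_train, y_train)
        mse = mean_squared_error(y_test, model.predict(X_test))
        results[(method, name)] = float(np.sqrt(mse))  # RMSE

# Report every pairing, lowest RMSE first.
for (method, name), rmse in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{method:>13} + {name:<17} RMSE = {rmse:.3f}")

best = min(results, key=results.get)
```

Here `best` holds the (imputation method, model) pair with the lowest RMSE on the synthetic data. Which pair wins depends on the dataset and the missingness pattern, so the ranking from this sketch should not be read as confirming or contradicting the study's result.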
Copyright © 2024