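Effectiveness of XGBoost, LightGBM, and CatBoost on an Imbalanced Predictive Maintenance Dataset (original title: Evektivitas Xgboost Lightgbm dan Catboost pada Dataset Imbalanced Predictive Maintenance)
Moeng Sakmar; Kadir, Nurul Tiara; Puteri Awaliatush Shofo; Agus Darmawan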
Jurnal SINTA: Sistem Informasi dan Teknologi Komputasi Vol. 3 No. 1 (2026): SINTA: JANUARI
Publisher : Berkah Tematik Mandiri

DOI: 10.61124/sinta.v3i1.145

Abstract

In the era of Industry 4.0, unexpected machine failures have become a critical challenge, triggering unplanned downtime and significant financial losses for the manufacturing sector. A fundamental obstacle in developing Machine Learning-based Predictive Maintenance systems is class imbalance: failure events occur far less often than normal operation, so models become biased toward the majority class and fail to recognize critical anomalies. This study evaluates the effectiveness of the Synthetic Minority Over-sampling Technique (SMOTE) in improving failure detection performance on the AI4I 2020 dataset. It takes a comparative approach with three Gradient Boosting algorithms: XGBoost, LightGBM, and CatBoost. The study highlights the Accuracy Paradox in scenarios without resampling, where spuriously high accuracy masks low Recall, that is, the model's inability to detect failures. The findings show that integrating SMOTE reshapes the model's decision boundaries and significantly increases sensitivity to the minority class. Based on an in-depth analysis using the Confusion Matrix, XGBoost combined with SMOTE was identified as the best-performing model: compared with its competitors, it balanced the critical trade-off most effectively, achieving high Recall to ensure asset safety while minimizing false alarms (False Positives) that reduce technician work efficiency. The study concludes that addressing data imbalance is a decisive step in building a predictive maintenance system that is not only technically precise but also reliable and safe to deploy in real industrial environments.
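
The Accuracy Paradox described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not reproduce their results: the class counts (970 normal, 30 failures) and the two sets of predictions are hypothetical numbers chosen only for illustration, and the helper names `confusion_counts`, `accuracy`, `recall`, and `precision` are introduced here.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = failure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fp, fn, tn):
    """Share of true failures that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp, fn, tn):
    """Share of raised alarms that were real failures."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical imbalanced test set: 970 normal samples, then 30 failures.
y_true = [0] * 970 + [1] * 30

# Hypothetical model A: always predicts "normal".
always_normal = [0] * 1000

# Hypothetical model B: raises 40 false alarms but catches 24 of 30 failures.
sensitive = [1] * 40 + [0] * 930 + [1] * 24 + [0] * 6

for name, y_pred in [("always_normal", always_normal), ("sensitive", sensitive)]:
    counts = confusion_counts(y_true, y_pred)
    print(
        f"{name}: accuracy={accuracy(*counts):.3f} "
        f"recall={recall(*counts):.3f} precision={precision(*counts):.3f}"
    )
```

In this made-up example, model A reaches an accuracy of 0.970 while detecting no failures at all (Recall 0), whereas model B has a lower accuracy of 0.954 but a Recall of 0.800. This is why the Confusion Matrix, rather than accuracy alone, is the more informative basis for comparing models on imbalanced data.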