Darmawan, Aditya Aqil
Unknown Affiliation

Published: 2 Documents
Articles


Pendekatan Machine Learning dengan Teknik Stacking untuk Memprediksi Kualitas Air Minum (A Machine Learning Approach with the Stacking Technique for Predicting Drinking Water Quality)
D, Ishak Bintang; Andono, Pulung Nurtantio; Pramunendar, Ricardus Anggi; Winarno, Agus; Darmawan, Aditya Aqil
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.7014

Abstract

Safe drinking water is essential for public health, yet environmental pollution has significantly degraded water quality. Manual assessment methods such as WQI and STORET are inefficient, so this study proposes a machine learning-based classification system for more accurate water potability assessment. The Water Potability dataset from Kaggle is used, consisting of 3,276 samples with nine key parameters. The preprocessing stage includes data imputation, normalization, feature engineering, and oversampling with SMOTE. The applied models include LGBM, Random Forest, GBM, and XGBoost, tuned with Bayesian optimization and combined in a stacking ensemble to improve accuracy. Results show that the stacking ensemble achieves an accuracy of 85.38%, precision of 88.02%, recall of 85.38%, and F1-score of 85.23%, outperforming the individual models. The system enables real-time water quality monitoring with faster and more accurate results, supporting decision-making on sanitation policy and clean water availability.
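The reported recall (85.38%) is identical to the reported accuracy. That pattern is what support-weighted averaging of per-class recall always produces, since the class weights cancel the per-class denominators. The abstract does not state which averaging was used, so this is only an assumption. A minimal sketch in plain Python, using invented toy labels rather than the study's data, illustrates the identity:

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall and F1."""
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(true_pos.values()) / n
    precision = recall = f1 = 0.0
    for label in labels:
        tp = true_pos[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / support[label] if support[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[label] / n  # class share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1

# Toy labels (hypothetical): 0 = not potable, 1 = potable
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
```

Here `acc` and `rec` agree up to floating-point rounding, while `prec` and `f1` differ from them, which matches the shape of the figures in the abstract.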
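Klasifikasi Kelayakan Air Minum Mengkombinasikan Algoritma Random Forest dengan Teknik Optimasi Bayesian (Classifying Drinking Water Potability by Combining the Random Forest Algorithm with Bayesian Optimization)
Darmawan, Aditya Aqil; D, Ishak Bintang; Astuti, Yani Parti; Winarno, Agus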
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.7055

Abstract

Clean and safe drinking water is crucial for public health; however, pollution from industrial waste, domestic waste, and urbanization has significantly deteriorated water quality. Manual methods for water quality analysis, such as the Water Quality Index (WQI) and STORET, have limitations in efficiency and accuracy. Therefore, this study proposes a machine learning-based classification system to determine the potability of drinking water more accurately and efficiently. The Water Potability dataset from Kaggle, consisting of 3,276 samples with nine key parameters, was used in this research. Initial analysis showed that most features had a nearly normal distribution, although some variables, such as Solids and Conductivity, exhibited right-skewness due to extreme values. Correlation analysis revealed no significant linear relationships between water quality parameters. The preprocessing stage included missing data imputation using the mean method, normalization, feature engineering, and oversampling with SMOTE to address class imbalance. The machine learning models used in this study include LightGBM, Random Forest, XGBoost, and CatBoost, with model optimization performed using Bayesian Search CV, which improved performance, particularly for Random Forest. Experimental results showed that the optimized Random Forest model achieved the best performance with an accuracy of 85.38%, precision of 85.86%, recall of 85.38%, and an F1-score of 85.37%. Some misclassifications remained, especially in detecting potable water samples. Overall, the results indicate that ensemble learning methods can be used effectively to evaluate drinking water potability.
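Two of the preprocessing steps named above, mean imputation and SMOTE oversampling, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names `impute_mean` and `smote_like`, the neighbour count `k`, and the sample values are all hypothetical, and the abstract does not say which library the study used.

```python
import math
import random

def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column.
    Assumes every column has at least one observed value."""
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed))
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical two-feature rows with missing readings
rows = [[7.0, None], [None, 200.0], [9.0, 100.0]]
imputed = impute_mean(rows)

# Oversample the (hypothetical) minority class with three synthetic points
synthetic = smote_like(imputed, n_new=3)
```

Each synthetic point lies on a line segment between two existing minority samples, so oversampling adds variety without leaving the region the minority class already occupies. In a real pipeline, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.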