Water quality analysis plays an important role in determining whether water is suitable for human consumption. This study builds a machine learning model that classifies water quality based on parameters such as pH, hardness, solids content, chloramines, sulfate, conductivity, organic carbon, trihalomethanes, and turbidity. The dataset, obtained from Kaggle, contains 3,276 samples. The two main algorithms applied are Support Vector Machine (SVM) and CatBoost. The research process comprises data preprocessing, class balancing with SMOTE, modeling, and model performance evaluation. Hyperparameter tuning is applied to both algorithms to improve performance. The results show that CatBoost performs best, reaching an accuracy of 95.8% after hyperparameter tuning, compared with SVM, which achieved an accuracy of 77.9%. CatBoost also outperforms SVM on all evaluation metrics, including precision, recall, and F1-score.
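The SMOTE balancing step named above can be illustrated with a minimal sketch. It is not the study's code: the function name `smote_oversample`, the class labels, and the toy values are hypothetical, and in practice a library implementation such as imbalanced-learn's `SMOTE` would normally be used. The sketch shows only the core idea, which is to create each synthetic minority sample by interpolating between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest neighbours among the other minority samples.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature samples (e.g. pH, turbidity); values are made up,
# not taken from the study's dataset.
potable = [[7.0, 3.9], [7.2, 4.1], [6.8, 3.5]]
not_potable = [[5.1, 5.0], [9.3, 6.2], [4.8, 4.4], [8.9, 5.7], [5.5, 6.0]]

new_points = smote_oversample(potable, n_new=len(not_potable) - len(potable), k=2)
balanced_potable = potable + new_points
```

After the call, both classes have the same number of samples. A common precaution, assumed here rather than stated in the abstract, is to oversample only the training split so that synthetic points do not leak into the test set and inflate the reported metrics.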
Copyrights © 2025