Water quality monitoring is a crucial element of data-driven environmental management. This study identifies the most important parameters for river water quality classification using feature selection and machine learning. Eleven physicochemical parameters served as initial features, and two selection methods were applied: Genetic Algorithm (GA) and Spearman Rank Correlation (RS). Classification was performed with a Radial Basis Function Support Vector Machine (RBF-SVM), and performance was evaluated by accuracy, F1 score, and recall. GA selected seven parameters (pH, DHL (electrical conductivity), DO, BOD, COD, TSS, NO₂⁻-N) and achieved an accuracy of 96.67% and an F1 score of 0.82. RS selected a different set of seven features and achieved an accuracy of 90.00% and an F1 score of 0.67. Five features (DHL, BOD, COD, TSS, NO₂⁻-N) were selected by both methods, marking them as the consistently influential parameters. The model without feature selection reached a high accuracy (93.33%) but an F1 score of only 0.48, indicating poor recognition of the minority class. These findings indicate that feature selection improves minority-class recognition while reducing the number of parameters that must be measured. In conclusion, GA-based feature selection provided the most effective subset for water quality classification and supports the development of intelligent, cost-effective monitoring systems suitable for sensor-based field applications.
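The gap between the 93.33% accuracy and the 0.48 F1 score reported for the model without feature selection can be illustrated with a small worked example. The sketch below is not the study's data: it assumes a hypothetical 30-sample test set with 28 majority-class and 2 minority-class samples, a classifier that predicts the majority class for every sample, and a macro-averaged F1 score. None of these details is stated in the abstract. They are chosen only because they show how such a pair of figures can arise.

```python
# Illustrative only: hypothetical labels, not the study's test set.
# Assumptions: binary problem, 30 test samples (28 majority, 2 minority),
# macro-averaged F1, and a model that always predicts the majority class.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, F1 = 2TP / (2TP + FP + FN); 0.0 if undefined."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)

# Hypothetical labels: 0 = majority class, 1 = minority class.
y_true = [0] * 28 + [1] * 2
y_pred = [0] * 30  # the minority class is never predicted

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Under these assumptions the accuracy is 28/30 ≈ 0.9333, while the minority-class F1 is 0 and the majority-class F1 is 56/58 ≈ 0.9655, giving a macro F1 of about 0.48. Accuracy rewards correct majority-class predictions, whereas macro F1 weights each class equally, which is why a low F1 alongside a high accuracy points to poor minority-class recognition.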
Copyright © 2025