This study assesses the effectiveness of the linear regression algorithm in predicting raw water demand from customer transaction data, raw water volume, and seasonal variables. The method follows the Knowledge Discovery in Databases (KDD) process: data selection, preprocessing, transformation, data mining, and evaluation of results. The dataset is split 80% for training and 20% for testing. The linear regression model achieves a coefficient of determination (R²) of 0.77, meaning that it explains 77% of the variability in the data. Prediction errors are low, with a Mean Absolute Error (MAE) of 0.06, a Mean Squared Error (MSE) of 0.01, and a Root Mean Squared Error (RMSE) of 0.08, indicating good accuracy. In a comparison of actual and predicted values, for an actual demand of 7,000 liters the model predicts 7,984.70 liters, an overestimate of about 14%. The number of customer transactions has the greatest influence on raw water demand, with a coefficient of 16,940.46, while seasonal factors have less influence. These findings indicate that the linear regression algorithm is effective in predicting raw water demand. However, further development is needed to improve accuracy at extreme values, either by adding variables or by using more complex algorithms.
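The evaluation procedure summarized above (an 80/20 split, an ordinary least squares fit, and the R², MAE, MSE, and RMSE metrics) can be illustrated with a minimal sketch. It does not reproduce the study's code or data: the data below are synthetic, and the feature names, value ranges, and helper functions (`split_80_20`, `fit_linear_regression`, `predict`, `evaluate`) are assumptions introduced for illustration only.

```python
import numpy as np


def split_80_20(X, y, seed=0):
    """Shuffle the rows and split them 80% training / 20% testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(0.8 * len(y))
    train, test = idx[:cut], idx[cut:]
    return X[train], X[test], y[train], y[test]


def fit_linear_regression(X, y):
    """Ordinary least squares; returns [intercept, coefficient_1, ...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted intercept and coefficients to new rows."""
    return coef[0] + X @ coef[1:]


def evaluate(y_true, y_pred):
    """Compute the four metrics named in the abstract."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": float(np.mean(np.abs(err))),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
    }


# Synthetic, hypothetical data on a normalized 0-1 scale (not the study's data).
rng = np.random.default_rng(42)
n = 200
transactions = rng.uniform(0.0, 1.0, n)      # assumed: scaled transaction count
season = rng.integers(0, 2, n).astype(float)  # assumed: 0 = dry, 1 = rainy
X = np.column_stack([transactions, season])
y = 0.2 + 0.7 * transactions + 0.05 * season + rng.normal(0.0, 0.05, n)

X_train, X_test, y_train, y_test = split_80_20(X, y)
coef = fit_linear_regression(X_train, y_train)
metrics = evaluate(y_test, predict(coef, X_test))
print(metrics)
```

In this sketch the metrics are computed on the held-out 20% only, so they describe performance on data the model did not see during fitting. RMSE is by definition the square root of MSE, which is why rounded values such as an MSE of 0.01 and an RMSE of 0.08 can appear together.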
Copyright © 2025