Civil Engineering Journal
Vol 7, No 9 (2021): September

A Comparison of Multiple Imputation Methods for Recovering Missing Data in Hydrological Studies

Fatimah Bibi Hamzah (1) Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM, Bangi Selangor, Malaysia. 2) Faculty of Computing and Multimedia, Kolej Universiti Poly-Tech Mara Kuala Lumpur, Jalan 6/91, Taman Shamelin Perkasa, 56100 Kual)
Firdaus Mohd Hamzah (Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM, Bangi Selangor,)
Siti Fatin Mohd Razali (Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, 43600 UKM, Bangi Selangor,)
Hafiza Samad (Faculty of Computing and Multimedia, Kolej Universiti Poly-Tech Mara Kuala Lumpur, Jalan 6/91, Taman Shamelin Perkasa, 56100 Kuala Lumpur,)



Article Info

Publish Date
01 Sep 2021

Abstract

Missing data are a common problem in hydrological studies, so data reconstruction is critical, especially when all available records, including incomplete ones, must be used. Missing data can also distort the results of statistical analyses and prevent the variability in the data from being estimated properly. This study therefore compared the performance of three imputation methods in predicting recurrence in streamflow datasets: robust random regression imputation (RRRI), k-nearest neighbours (k-NN), and classification and regression tree (CART). Complete historical daily streamflow data from 2012 to 2014 were used as the training dataset to assess and validate how effectively the imputation methods handle missing streamflow data. All three methods, each coupled with multiple linear regression (MLR), were then used to restore streamflow rates in Malaysia's Langat River Basin from 1978 to 2016. The effectiveness of the estimation techniques was evaluated using metrics including the Nash-Sutcliffe efficiency coefficient (CE), root-mean-square error (RMSE), and mean absolute percentage error (MAPE). The results showed that RRRI coupled with MLR (RRRI-MLR) had the lowest RMSE and MAPE values, outperforming all other techniques tested for filling missing data in daily streamflow datasets. This indicates that RRRI-MLR is the best of the tested methods for dealing with missing data in streamflow datasets.

DOI: 10.28991/cej-2021-03091747
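The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of RMSE, MAPE, and the Nash-Sutcliffe coefficient; the abstract does not give the authors' exact formulas, so these definitions, the function names, and the sample values are assumptions for illustration only.

```python
import math

def rmse(observed, estimated):
    """Root-mean-square error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)

def mape(observed, estimated):
    """Mean absolute percentage error, in percent.

    Undefined when any observed value is zero (division by zero).
    """
    n = len(observed)
    return 100.0 * sum(abs((o - e) / o) for o, e in zip(observed, estimated)) / n

def nash_sutcliffe(observed, estimated):
    """Nash-Sutcliffe efficiency coefficient (CE); 1.0 means a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

# Hypothetical streamflow values, not taken from the study.
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]

print(f"RMSE = {rmse(observed, estimated):.4f}")
print(f"MAPE = {mape(observed, estimated):.3f} %")
print(f"CE   = {nash_sutcliffe(observed, estimated):.3f}")
```

Lower RMSE and MAPE indicate smaller imputation errors, while a CE closer to 1 indicates that the estimates track the observed series more closely than the observed mean does.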

Copyrights © 2021






Journal Info

Abbrev

cej

Publisher

Subject

Civil Engineering, Building, Construction & Architecture

Description

Civil Engineering Journal is a multidisciplinary, open-access, international, double-blind peer-reviewed journal concerned with all aspects of civil engineering, which include but are not necessarily restricted to: Building Materials and Structures, Coastal and Harbor Engineering, ...