Prosiding Seminar Nasional Official Statistics
Vol 2024 No 1 (2024): Seminar Nasional Official Statistics 2024

Pemanfaatan Machine Learning untuk Imputasi Data (Utilizing Machine Learning for Data Imputation)

Lusmana, Putri Hera
Paramudita, Ayu



Article Info

Publish Date
08 Nov 2024

Abstract

Non-response is a common challenge in large-scale company-based surveys and one of the causes of missing data. One way to deal with missing data is imputation. So far, imputation in the Annual Survey of Manufacturing Industries (STPIM) has combined several approaches, including historical, auxiliary unit, and clerical imputation. These methods, however, tend to be inefficient and offer no way to measure the accuracy of the imputed values. In light of these limitations, we introduce a machine learning approach to data imputation. Comparing several machine learning methods, we find that K-nearest neighbors imputation is the most accurate method for imputing output data in STPIM 2021, while a linear Support Vector Machine (SVM) has the shortest processing time.
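As a rough illustration of the K-nearest neighbors idea named in the abstract, the sketch below fills a missing output value with the mean output of the k most similar responding units, and estimates accuracy by hiding known values one at a time. It is a minimal sketch, not the authors' implementation: the field names (`workers`, `input_cost`, `output`), the toy records, and the helper names are all hypothetical, and the features are left unscaled for brevity.

```python
import math

def knn_impute(rows, target, features, k=3):
    """Fill missing `target` values (None) with the mean of the k nearest complete rows."""
    donors = [r for r in rows if r[target] is not None]
    imputed = []
    for r in rows:
        filled = dict(r)
        if r[target] is None:
            # Rank donors by Euclidean distance on the auxiliary features.
            nearest = sorted(
                donors,
                key=lambda d: math.dist(
                    [r[f] for f in features], [d[f] for f in features]
                ),
            )[:k]
            filled[target] = sum(d[target] for d in nearest) / len(nearest)
        imputed.append(filled)
    return imputed

def leave_one_out_mae(rows, target, features, k=3):
    """Mean absolute error when each known value is hidden and re-imputed in turn."""
    complete = [r for r in rows if r[target] is not None]
    errors = []
    for i, r in enumerate(complete):
        others = complete[:i] + complete[i + 1:]
        masked = dict(r)
        masked[target] = None
        estimate = knn_impute(others + [masked], target, features, k)[-1][target]
        errors.append(abs(estimate - r[target]))
    return sum(errors) / len(errors)

# Hypothetical toy records; the last unit did not report its output.
rows = [
    {"workers": 10, "input_cost": 100.0, "output": 150.0},
    {"workers": 12, "input_cost": 110.0, "output": 170.0},
    {"workers": 11, "input_cost": 105.0, "output": 160.0},
    {"workers": 200, "input_cost": 5000.0, "output": 8000.0},
    {"workers": 11, "input_cost": 104.0, "output": None},
]
features = ["workers", "input_cost"]

result = knn_impute(rows, "output", features, k=3)
mae = leave_one_out_mae(rows, "output", features, k=3)
print(result[-1]["output"], mae)
```

In this toy data the three nearest donors to the non-responding unit are the three small establishments, so the large one does not influence the imputed value. The leave-one-out error gives a measurable accuracy figure, which the abstract notes is missing from the historical, auxiliary unit, and clerical approaches.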

Copyrights © 2024






Journal Info

Abbrev

semnasoffstat

Subject

Humanities; Computer Science & IT; Economics, Econometrics & Finance; Social Sciences

Description

These seminar proceedings aim to produce solution-oriented, innovative, and adaptive ideas on issues, strategies, and methods that make use of official ...