Indonesian Journal of Statistics and Its Applications
Vol 6 No 1 (2022)

Implementation of Ensemble Self-Organizing Maps for Missing Values Imputation

Titin Siswantining (Department of Mathematics, Universitas Indonesia)
Kathan Gerry Vivaldi (Universitas Indonesia)
Devvi Sarwinda (Department of Mathematics, Universitas Indonesia)
Saskya Mary Soemartojo (Department of Mathematics, Universitas Indonesia)
Ika Mattasari (Universitas Indonesia)
Herley Al-Ash (Universitas Indonesia)



Article Info

Publish Date
31 May 2022

Abstract

This study implements the ensemble self-organizing maps (E-SOM) method to impute missing values during data preprocessing, an important stage before prediction or classification. E-SOM extends the SOM imputation method by applying an ensemble framework that combines several SOMs to improve generalization capability. Here, E-SOM imputation is applied to South African heart disease data, with random forest as the classification model. On the testing data, the random forest model built from E-SOM-imputed data achieves higher accuracy than the model built from SOM-imputed data for 36, 49, 64, and 81 neurons, while for 25 neurons both models produce the same accuracy. Among the ensemble sizes tested, E-SOM imputation with 81 neurons and 15 ensemble members produced the random forest model with the highest accuracy.
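
The general idea of ensemble SOM imputation can be illustrated with a minimal sketch. The abstract does not specify the training procedure, so everything here is an assumption: the function names (`train_som`, `impute_with_som`, `esom_impute`), the bootstrap resampling of complete rows, the linear decay schedules, the square grid, and the averaging of member imputations are illustrative choices, not the authors' implementation. Missing values are represented as `None`, at least one complete row is assumed to exist, and features are assumed to be on comparable scales.

```python
import math
import random


def sq_dist(a, b, idxs):
    """Squared Euclidean distance between a and b over the given feature indices."""
    return sum((a[k] - b[k]) ** 2 for k in idxs)


def train_som(rows, grid_side, epochs=20, seed=0):
    """Train a grid_side x grid_side SOM on complete rows.

    Returns a list of (grid_position, weight_vector) pairs.
    """
    rng = random.Random(seed)
    dim = len(rows[0])
    # Initialise every neuron with a copy of a randomly chosen training row.
    neurons = [((i, j), list(rng.choice(rows)))
               for i in range(grid_side) for j in range(grid_side)]
    total_steps = epochs * len(rows)
    step = 0
    for _ in range(epochs):
        for row in rng.sample(rows, len(rows)):  # shuffled pass over the data
            frac = step / total_steps
            lr = 0.5 * (1.0 - frac)  # learning rate decays linearly toward zero
            sigma = max(0.5 * grid_side * (1.0 - frac), 0.5)  # neighbourhood radius
            # Best-matching unit (BMU): the neuron closest to the row.
            bmu_pos = min(neurons, key=lambda n: sq_dist(row, n[1], range(dim)))[0]
            for pos, weights in neurons:
                grid_d2 = (pos[0] - bmu_pos[0]) ** 2 + (pos[1] - bmu_pos[1]) ** 2
                pull = lr * math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for k in range(dim):
                    weights[k] += pull * (row[k] - weights[k])
            step += 1
    return neurons


def impute_with_som(row, neurons):
    """Fill the None entries of row with the weights of its BMU.

    The BMU is found using the observed features only.
    """
    observed = [k for k, v in enumerate(row) if v is not None]
    bmu_weights = min(neurons, key=lambda n: sq_dist(row, n[1], observed))[1]
    return [w if v is None else v for v, w in zip(row, bmu_weights)]


def esom_impute(rows, grid_side=3, n_members=5, seed=0):
    """Impute missing values by averaging the imputations of several SOMs.

    Each ensemble member is trained on a bootstrap resample of the complete rows
    (an assumed way of making the members differ).
    """
    complete = [r for r in rows if None not in r]
    rng = random.Random(seed)
    members = []
    for m in range(n_members):
        sample = [rng.choice(complete) for _ in complete]
        members.append(train_som(sample, grid_side, seed=seed + m + 1))
    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue
        candidates = [impute_with_som(row, som) for som in members]
        result.append([
            sum(c[k] for c in candidates) / n_members if row[k] is None else row[k]
            for k in range(len(row))
        ])
    return result
```

Under these assumptions, a call such as `esom_impute(rows, grid_side=9, n_members=15)` would correspond to the configuration of 81 neurons and 15 ensemble members, if the 81 neurons are arranged as a square 9 x 9 grid. Averaging over members is one simple way to combine several SOMs; other combination rules are possible.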

Copyright © 2022






Journal Info

Abbrev

ijsa

Subject

Computer Science & IT; Mathematics; Other

Description

Indonesian Journal of Statistics and Its Applications (eISSN: 2599-0802), formerly named Forum Statistika dan Komputasi and established in 2017, publishes scientific papers in statistical science and its applications. The published papers should be research papers with, but not limited ...