Comparison of the Hot-Deck Imputation Method and the KNNI Method in Handling Missing Values : PERBANDINGAN METODE HOT-DECK IMPUTATION DAN METODE KNNI DALAM MENGATASI MISSING VALUES Iman Jihad Fadillah; Siti Muchlisoh
Seminar Nasional Official Statistics Vol 2019 No 1 (2019): Seminar Nasional Official Statistics 2019
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2019i1.101

Abstract

One characteristic of high-quality statistical data is completeness. However, censuses and surveys often encounter missing or incomplete data (missing values), and Indonesia's National Socioeconomic Survey (Susenas) is no exception. Missing values can cause a variety of problems, so they must be handled. Imputation is a commonly used way to handle them, and several imputation methods have been developed for this purpose. Hot-deck Imputation and K-Nearest Neighbor Imputation (KNNI) are two such methods. Both use predictor variables to carry out the imputation and do not require complicated assumptions. Because the two methods differ in their algorithms and in how they handle missing values, they can produce different estimates. This study compares Hot-deck Imputation and KNNI in handling missing values. The comparison examines estimator accuracy through RMSE and MAPE values. Computational performance is also measured through the running time of the imputation process. Applying both methods to the March 2017 Susenas data shows that KNNI yields better estimator accuracy than Hot-deck Imputation. However, Hot-deck Imputation has better computational performance than KNNI.
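To make the comparison in this abstract concrete, the following is a minimal sketch of the two accuracy measures (RMSE and MAPE) and of simplified versions of the two imputation ideas. It is not the authors' implementation: the record layout, the Euclidean distance, the random-donor variant of hot-deck, and all toy values are assumptions chosen for illustration.

```python
import math
import random

def rmse(actual, imputed):
    """Root mean squared error between true and imputed values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, imputed)) / len(actual))

def mape(actual, imputed):
    """Mean absolute percentage error in percent (true values must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, imputed)) / len(actual)

def knn_impute(records, predictors, k):
    """Unweighted KNNI: mean target of the k donors nearest in predictor space."""
    nearest = sorted(records, key=lambda r: math.dist(r[1], predictors))[:k]
    return sum(target for _, _, target in nearest) / len(nearest)

def hot_deck_impute(records, imputation_class, rng):
    """Random hot-deck: copy the target of a random donor in the same class."""
    pool = [target for cls, _, target in records if cls == imputation_class]
    return rng.choice(pool)

# Hypothetical complete records: (imputation class, predictors, target value).
records = [
    ("urban", (2.0, 30.0), 100.0),
    ("urban", (3.0, 35.0), 120.0),
    ("rural", (5.0, 50.0), 60.0),
    ("rural", (6.0, 55.0), 80.0),
]

# A hypothetical record whose target is missing (true value known for scoring).
true_value = 110.0
knn_value = knn_impute(records, (2.5, 32.0), k=2)
hot_deck_value = hot_deck_impute(records, "urban", random.Random(0))

print("KNNI:", knn_value, "error:", rmse([true_value], [knn_value]))
print("Hot-deck:", hot_deck_value, "error:", rmse([true_value], [hot_deck_value]))
```

In this toy setup the hot-deck donor is a single observed value, while KNNI averages several neighbors, which is one reason the two methods can give different estimates. The sort over all donors in `knn_impute` also hints at why KNNI can take longer to run on large data.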
Use of the Weighted K-Nearest Neighbor Imputation (Weighted KNNI) Method to Handle Missing Data : PEMANFAATAN METODE WEIGHTED K-NEAREST NEIGHBOR IMPUTATION (WEIGHTED KNNI) UNTUK MENGATASI MISSING DATA Iman Jihad Fadillah; Chaterina Dwi Puspita
Seminar Nasional Official Statistics Vol 2020 No 1 (2020): Seminar Nasional Official Statistics 2020
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2020i1.409

Abstract

In 2020, almost every country in the world, including Indonesia, faced the COVID-19 outbreak. One impact of the pandemic was the disruption of statistical activities, with surveys, censuses, and other data collection being delayed or stopped. Meanwhile, to meet data demands and needs during the pandemic, national statistical agencies must continue to collect and provide statistical data. They must therefore adapt their census and survey processes, for example by finding alternative data collection modes, reducing sample sizes, modifying sample designs, or reducing the number of question items in questionnaires. These adaptations affect the quality of the data produced, and one consequence is missing data. One way to address missing data is imputation. A machine learning-based imputation method that is often used is Weighted K-Nearest Neighbor Imputation (Weighted KNNI). The results show that Weighted KNNI is more accurate than the other two imputation methods compared (Unweighted KNNI and Mean Imputation) at every percentage of missing data, in terms of both RMSE and MAPE. Based on these results, and judged by its accuracy, Weighted KNNI can be used as a solution for dealing with incomplete data during the current COVID-19 pandemic.
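The three methods compared in this abstract can be sketched as follows. The abstract does not state which weighting scheme the paper uses, so the inverse-distance weights here are an assumption, as are the function names, the `eps` guard against division by zero, and the toy data.

```python
import math

def k_nearest(donors, predictors, k):
    """Return the k donors closest to `predictors` (Euclidean distance)."""
    return sorted(donors, key=lambda d: math.dist(d[0], predictors))[:k]

def mean_impute(donors):
    """Mean Imputation: ignore predictors, use the mean of all observed targets."""
    return sum(target for _, target in donors) / len(donors)

def unweighted_knn_impute(donors, predictors, k):
    """Unweighted KNNI: plain average over the k nearest donors."""
    nearest = k_nearest(donors, predictors, k)
    return sum(target for _, target in nearest) / len(nearest)

def weighted_knn_impute(donors, predictors, k, eps=1e-9):
    """Weighted KNNI with assumed inverse-distance weights (closer counts more)."""
    nearest = k_nearest(donors, predictors, k)
    weights = [1.0 / (math.dist(p, predictors) + eps) for p, _ in nearest]
    weighted_sum = sum(w * target for w, (_, target) in zip(weights, nearest))
    return weighted_sum / sum(weights)

# Hypothetical donors: (predictors, observed target value).
donors = [((1.0,), 10.0), ((4.0,), 40.0), ((10.0,), 100.0)]
query = (2.0,)  # predictors of the record whose target is missing

print("mean:", mean_impute(donors))
print("unweighted KNNI:", unweighted_knn_impute(donors, query, k=2))
print("weighted KNNI:", weighted_knn_impute(donors, query, k=2))
```

In this toy data the target grows linearly with the predictor, so the weighted estimate lands closest to the value a reader would expect at the query point, while the unweighted and global means are pulled toward more distant donors.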
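Comparison of Hot-deck, SVM, and Random Forest in Identifying Micro and Small Industries Affected by Covid-19 in 2020 : Perbandingan Hot-deck, SVM, dan Random Forest dalam Mengidentifikasi Industri Mikro dan Kecil Terdampak Covid-19 Tahun 2020 Iman Jihad Fadillah; Lalu Moh. Arsal Fadila; Lalu Muhamad Winadi Darundiye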
Seminar Nasional Official Statistics Vol 2022 No 1 (2022): Seminar Nasional Official Statistics 2022
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2022i1.1235

Abstract

Covid-19 was declared a pandemic in March 2020. The pandemic, together with government policies, led to a decline in the economic sector, especially among micro and small industries (IMK). Identifying IMK affected by the Covid-19 pandemic is an important step. Two types of identification methods are commonly used: statistical methods and machine learning methods. Each yields different measurement results, so an appropriate method is needed to identify IMK affected by the pandemic. This study compares the hot-deck, SVM, and random forest methods to find the best method for identifying IMK affected by Covid-19. The results show that random forest is the best of the three methods at identifying IMK affected by Covid-19.
Application of the Random Forest Method to Identify Food and Beverage Industries Experiencing Raw Material Difficulties : Penerapan Metode Random Forest untuk Mengidentifikasi Industri Makanan dan Minuman yang Mengalami Kesulitan Bahan Baku Iman Jihad Fadillah; Indah Noor Safrida; Rima Kusumaningtyas
Indonesian Journal of Statistics and Applications Vol 8 No 1 (2024)
Publisher : Statistics and Data Science Program Study, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v8i1p37-46

Abstract

The food and beverage industry experienced a significant increase after the pandemic. However, challenges continue to hit this industry, especially micro and small scale businesses. Overcoming them requires the right approach, and one of the first steps is to provide quality data as a basis for decision making and problem solving. However, statistical activities such as censuses and surveys often face obstacles in the form of missing values. One effective way to deal with missing values is the random forest method. This research uses a machine learning-based imputation method, namely random forest, to identify micro and small scale food and beverage industries that are experiencing raw material difficulties. The results show that the random forest method provides accurate and consistent predictions in identifying food and beverage industries experiencing raw material difficulties. However, the relatively long computing time needed to run this method should also be taken into account.