UNP Journal of Statistics and Data Science
Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science

Handling Multiclass Imbalance in the Area Sampling Frame Survey Dataset Using the SCUT Method

Sondriva, Wilia
Kurniawati, Yenni
Amalita, Nonong
Salma, Admi



Article Info

Publish Date
31 May 2024

Abstract

The Area Sampling Frame (ASF) is a survey used by the Indonesian government to measure rice productivity in Indonesia. The ASF survey is an important data source because accurate, high-quality rice productivity data are highly needed. The ASF survey data are extremely imbalanced across classes, so this imbalance must be handled. The SMOTE and Cluster-based Undersampling Technique (SCUT) is one method for addressing dataset imbalance: it combines oversampling with SMOTE and undersampling with CUT. After SCUT is applied, the number of data points in each class is balanced. A two-sample mean test is then conducted to compare the means of the original dataset and the dataset after handling. In the early vegetative, late vegetative, and harvest phases, the means of the two datasets do not differ significantly, whereas in the generative phase they do. Therefore, data generated synthetically with the SCUT method generally retain mean characteristics similar to those of the original data.
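
The resampling and mean-comparison steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that every class is resampled to the mean class size, it swaps in a plain k-means for the clustering step of the undersampler, and it uses Welch's t statistic as one possible two-sample mean test (the abstract does not say which test was used). All function names, parameters, and the toy data are hypothetical; only the four phase labels come from the abstract.

```python
import math
import random
import statistics
from collections import defaultdict


def smote_oversample(points, target, k=3, rng=random):
    """SMOTE-style: interpolate between a point and one of its k nearest neighbours."""
    synthetic = []
    while len(points) + len(synthetic) < target:
        base = rng.choice(points)
        neighbours = sorted((p for p in points if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours) if neighbours else base
        gap = rng.random()  # position of the new point on the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return list(points) + synthetic


def kmeans_labels(points, n_clusters, iters=20, rng=random):
    """Plain k-means (an assumption; stands in for the clustering step of CUT)."""
    centers = rng.sample(points, n_clusters)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(n_clusters), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(n_clusters):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_undersample(points, target, n_clusters=2, rng=random):
    """Keep `target` points, drawing round-robin from shuffled clusters."""
    clusters = defaultdict(list)
    for p, l in zip(points, kmeans_labels(points, n_clusters, rng=rng)):
        clusters[l].append(p)
    pools = [rng.sample(c, len(c)) for c in clusters.values()]
    kept = []
    while len(kept) < min(target, len(points)):
        for pool in pools:
            if pool and len(kept) < target:
                kept.append(pool.pop())
    return kept


def scut(data_by_class, rng=random):
    """Resample every class to the mean class size (assumed target)."""
    target = round(sum(len(v) for v in data_by_class.values()) / len(data_by_class))
    balanced = {}
    for label, points in data_by_class.items():
        if len(points) < target:
            balanced[label] = smote_oversample(points, target, rng=rng)
        elif len(points) > target:
            balanced[label] = cluster_undersample(points, target, rng=rng)
        else:
            balanced[label] = list(points)
    return balanced


def welch_t(a, b):
    """Welch's two-sample t statistic (one possible mean test, assumed here)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


# Toy data only: two made-up features per observation, imbalanced class sizes.
rng = random.Random(0)
sizes = {"early vegetative": 40, "late vegetative": 24, "generative": 10, "harvest": 6}
data = {label: [(rng.gauss(i, 1.0), rng.gauss(-i, 1.0)) for _ in range(n)]
        for i, (label, n) in enumerate(sizes.items())}

balanced = scut(data, rng=rng)
# Compare the first feature's mean before and after resampling, per class.
t_stats = {label: welch_t([p[0] for p in data[label]],
                          [p[0] for p in balanced[label]])
           for label in data}
print({label: len(points) for label, points in balanced.items()})
print(t_stats)
```

A t statistic close to zero for a class suggests that resampling left that class mean largely unchanged. Turning the statistic into a p-value would additionally require the t distribution, which the sketch leaves out.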

Copyrights © 2024






Journal Info

Abbrev

ujsds

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management; Mathematics; Social Sciences

Description

UNP Journal of Statistics and Data Science (UJSDS) is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects related to Statistics, Data Science, and its ...