Hari Yanni, Meri

Performance Comparison of Random Forest, Bagging, and CART Methods in Classifying Recipients of the Family Program in North Aceh
Hari Yanni, Meri; Anwar Notodiputro, Khairil; Sartono, Bagus
Khazanah Informatika : Jurnal Ilmu Komputer dan Informatika Vol. 11 No. 1 (2025): April 2025
Publisher : Universitas Muhammadiyah Surakarta

DOI: 10.23917/khif.v11i1.5098

Abstract

Machine learning is a data mining approach used to study patterns in large data sets through classification methods such as Random Forest (RF), Bagging, and CART. Random Forest builds on the Bagging technique and uses decision trees (CART) as its components. The difference between RF and Bagging is that only RF selects a random subset of features when forming each decision tree. Bagging can improve performance and model stability and reduce variance by forming many different models. This research compares the performance of the Random Forest, Bagging, and CART methods in classifying recipients of the family program in North Aceh. The results show that all three methods perform better when the SMOTE technique is used to handle class imbalance than when the imbalanced data are left untreated. Each model is evaluated by its accuracy, sensitivity, specificity, precision, F1 score, and AUC. Performance is good, with accuracy of 90% for SMOTE-RF and 86% for SMOTE-Bagging. The best performance was obtained by the SMOTE-RF model, whose parameters were tuned with Grid Search CV using k = 5 and repeat = 1 on a 90:10 data set proportion. This model correctly classifies 90% of observations, with an average AUC of 93.52%. In contrast, the CART method has very low accuracy, so it is less able to predict observations correctly. The predictor variables with the greatest importance in predicting recipient households are the floor area of the house, the number of household members aged 10 years and over, and the type of work of the head of the household.
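
As a rough illustration of the evaluation measures named in the abstract, the sketch below computes accuracy, sensitivity, specificity, precision, and F1 score from a binary confusion matrix. It is not the authors' code: the labels are made-up toy values (1 = recipient household, 0 = non-recipient), and the names `ConfusionCounts`, `confusion_counts`, and `classification_metrics` are hypothetical helpers introduced only for this example. AUC is left out because it requires predicted scores rather than hard class labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # recipients predicted as recipients
    fp: int  # non-recipients predicted as recipients
    tn: int  # non-recipients predicted as non-recipients
    fn: int  # recipients predicted as non-recipients


def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def classification_metrics(c):
    """Derive the standard measures; a zero denominator yields 0.0."""

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(c.tp, c.tp + c.fn)  # recall on the positive class
    precision = ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sensitivity,
        "specificity": ratio(c.tn, c.tn + c.fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy, imbalanced labels (assumption: invented for illustration, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On imbalanced data such as this toy set, accuracy alone can look acceptable while sensitivity stays low, which is one reason to report sensitivity, specificity, precision, and F1 alongside it.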