
Found 13 Documents

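ANALISIS KINERJA MODEL STACKING BERBASIS RANDOM FOREST DAN SVM DALAM KLASIFIKASI RUMAH TANGGA BERDASARKAN GARIS KEMISKINAN MAKANAN DI PROVINSI JAWA BARAT [Performance Analysis of a Stacking Model Based on Random Forest and SVM for Classifying Households by the Food Poverty Line in West Java Province]
Ghiffary, Ghardapaty Ghaly; Amanda, Nabila Tri; Ardhani, Rizky; Sartono, Bagus; Firdawanti, Aulia Rizki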
Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika Vol. 5 No. 3 (2024)
Publisher : LPPM Universitas Bina Bangsa

DOI: 10.46306/lb.v5i3.856

Abstract

The stacking method is an ensemble technique in machine learning that combines predictions from several base models to improve classification accuracy. This research applies stacking with two machine learning algorithms, Random Forest and Support Vector Machine (SVM), as base learners and logistic regression as the meta learner. The study aims to develop a classification model that identifies households based on the food poverty line in West Java Province. The data are the KOR and household data for West Java Province from the 2023 BPS National Socio-Economic Survey (Susenas). The variables consist of 24 independent variables, with food poverty level as the response variable. Modeling included feature selection with Recursive Feature Elimination (RFE) and class imbalance handling with the ADASYN method. The results showed that the stacking model was superior to the single models, with a balanced accuracy of 0.81, sensitivity of 0.72, and specificity of 0.89. Feature importance analysis identified calorie consumption, expenditure on cigarettes, meat and fruits, and expenditure on rice, eggs and other commodities as the largest contributors to the classification of households based on the food poverty line in West Java Province.
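For a binary classifier, balanced accuracy is the mean of sensitivity and specificity. The short sketch below is not taken from the paper; it only checks that the three figures quoted in the abstract are mutually consistent. The function name `balanced_accuracy` is an illustrative choice, and the inputs are the rounded values from the abstract.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of the true-positive rate and the true-negative rate."""
    return (sensitivity + specificity) / 2.0


# Values as quoted in the abstract (already rounded to two decimals).
reported_sensitivity = 0.72
reported_specificity = 0.89

ba = balanced_accuracy(reported_sensitivity, reported_specificity)
print(f"balanced accuracy from reported rates: {ba:.3f}")
```

The result, about 0.805, agrees with the reported 0.81 once rounding of the inputs is taken into account.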
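STUDI KOMPARASI METODE SVM-SMOTE DAN SMOTE-TOMEK DALAM MENGATASI IMBALANCE CLASS MENGGUNAKAN MODEL XGBOOST PADA KLASIFIKASI RUMAH TANGGA PENERIMA KUR [Comparative Study of the SVM-SMOTE and SMOTE-Tomek Methods for Handling Class Imbalance Using the XGBoost Model in Classifying KUR Recipient Households]
Yanuari, Eka Dicky Darmawan; Yudhianto, Rachmat Bintang; Ulfia, Ratu Risha; Sartono, Bagus; Firdawanti, Aulia Rizki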
Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika Vol. 5 No. 3 (2024)
Publisher : LPPM Universitas Bina Bangsa

DOI: 10.46306/lb.v5i3.857

Abstract

This study aims to compare the SMOTE, SVM-SMOTE, and SMOTE-Tomek methods using the XGBoost model in overcoming the problem of class imbalance and to determine the factors that affect the status of KUR recipients in West Java Province. Three XGBoost models with class balancing techniques SMOTE, SVM-SMOTE and SMOTE-Tomek were applied to SUSENAS data of West Java Province in 2023 consisting of 1 response variable and 19 predictor variables. The results showed that the XGBoost model with the SMOTE balancing method produced better accuracy in overall data classification, but was less effective in classifying minority classes as reflected by low sensitivity and F1-Score values. The XGBoost model with the SMOTE-Tomek balancing method showed better performance in capturing minority classes with higher sensitivity and F1-Score values. The most influential variables in this model in order are per capita expenditure, urban/rural classification, motorcycle ownership, dwelling wall materials and land ownership. Per capita expenditure has the largest influence on the classification of KUR recipients, indicating that household financial management is a major factor in lending decisions. Urban/rural classification and motorcycle ownership also contributed significantly, reflecting differences in social and economic access between regions. Overall, economic factors, infrastructure and social accessibility are the main considerations in determining KUR recipient households in West Java Province.
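The three balancing methods compared here share one core step: SMOTE creates a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours. SVM-SMOTE concentrates that step on minority points near an SVM decision boundary, and SMOTE-Tomek follows it by removing Tomek links. The sketch below illustrates only the shared interpolation step in plain Python. It is a simplified illustration, not the authors' pipeline; the function name `smote_like_samples`, the toy points, and the choice of `k` are all hypothetical.

```python
import random


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    randomly chosen minority point and one of its k nearest minority
    neighbours (squared Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by distance to the chosen base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight, uniform in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class.
minority_points = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 1.9)]
new_points = smote_like_samples(minority_points, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority points, it always stays inside the bounding box of the minority class.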
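Analisis Perbandingan Kinerja Metode Ensemble Bagging dan Boosting pada Klasifikasi Bantuan Subsidi Listrik di Kabupaten/Kota Bogor [Comparative Performance Analysis of Bagging and Boosting Ensemble Methods for Classifying Electricity Subsidy Assistance in Bogor Regency/City]
Cintari, Nanda Putri; Alifviansyah, Kevin; Tsabitah, Dhiya Ulayya; Sartono, Bagus; Firdawanti, Aulia Rizki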
The Indonesian Journal of Computer Science (IJCS) Vol. 13 No. 6 (2024)
Publisher : AI Society & STMIK Indonesia

DOI: 10.33022/ijcs.v13i6.4537

Abstract

The classification of electricity subsidy recipients is a crucial step to ensure that the government's social assistance program is distributed in a targeted manner, so an appropriate analysis method is needed. This research compares the Bagging and Boosting ensemble methods for the classification of households receiving electricity subsidies in Bogor Regency and City using Susenas 2023 data totaling 2002 households. The bagging methods are Random Forest and Extra Trees, while the boosting methods are CatBoost and LightGBM. The results showed that Extra Trees, a bagging method, provided the best performance with 91% accuracy, 95% F1-score, and 97% sensitivity. Ownership of electronic goods and modern facilities, such as air conditioners, laptops, and televisions, was the most significant group of variables in the classification of electricity subsidy recipients. With high accuracy and minimal bias, this model effectively supports data-driven policies for electricity subsidy distribution. This research is expected to serve as a strategic recommendation for the government to make the electricity subsidy program more efficient and better targeted, and to support the improvement of people's welfare.
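PERBANDINGAN ALGORITMA RANDOM FOREST DAN XGBOOST DALAM KLASIFIKASI PENERIMA BANTUAN PANGAN NON-TUNAI (BPNT) DI PROVINSI JAWA BARAT [Comparison of the Random Forest and XGBoost Algorithms in Classifying Recipients of Non-Cash Food Assistance (BPNT) in West Java Province]
Yulianti, Riska; Ilmani, Erdanisa Aghnia; Waliulu, Megawati Zein; Sartono, Bagus; Firdawanti, Aulia Rizki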
Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika Vol. 6 No. 1 (2025)
Publisher : LPPM Universitas Bina Bangsa

DOI: 10.46306/lb.v6i1.850

Abstract

This study compares the performance of Random Forest and XGBoost algorithms in classifying recipients of Non-Cash Food Assistance (BPNT) in West Java Province. The data used is from the 2023 National Socio-Economic Survey (SUSENAS) comprising 25,890 households, with 23.6% BPNT recipients and 76.4% non-recipients. The study includes data exploration, preprocessing, handling class imbalance, baseline modeling, and hyperparameter tuning using Grid Search. The results indicate that undersampling effectively increases the recall of Random Forest to 80.01% and XGBoost to 74.04%, albeit at the expense of accuracy. The most influential variables in classification include the head of household's employment status, flooring material of the house, and type of land/building ownership proof. These findings support the utilization of data-driven algorithms to enhance the accuracy and fairness of BPNT distribution.
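The abstract does not say which undersampling variant was used. As an illustration only, the sketch below assumes plain random undersampling: every minority row is kept and an equally sized random subset of majority rows is drawn. The function name `random_undersample` and the toy label vector, which mimics the reported 23.6% / 76.4% split, are hypothetical.

```python
import random


def random_undersample(labels, seed=0):
    """Return sorted row indices that keep every minority row (label 1)
    and an equally sized random subset of majority rows (label 0)."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == 1]
    majority_idx = [i for i, y in enumerate(labels) if y == 0]
    # Assumes label 1 is the smaller class, as in the toy data below.
    kept_majority = rng.sample(majority_idx, len(minority_idx))
    return sorted(minority_idx + kept_majority)


# Toy labels with 236 recipients and 764 non-recipients per 1,000 rows.
toy_labels = [1] * 236 + [0] * 764
kept = random_undersample(toy_labels)
balanced_labels = [toy_labels[i] for i in kept]
print(len(balanced_labels), sum(balanced_labels))
```

Training on the balanced subset gives the minority class equal weight, which tends to raise recall while lowering overall accuracy, the trade-off the abstract reports.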
Evaluation of Machine Learning Models in Classifying Women's Labor Force Participation in West Java
Siregar, Indra Rivaldi; Pratiwi, Windy Ayu; Nugraha, Adhiyatma; Sartono, Bagus; Firdawanti, Aulia Rizki
Techno.Com Vol. 24 No. 1 (2025): Februari 2025
Publisher : LPPM Universitas Dian Nuswantoro

DOI: 10.62411/tc.v24i1.11945

Abstract

This study compares four classification models—Logistic Regression, Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost)—to predict women's labor force participation in West Java, using a dataset of 62 features. After feature selection, the dataset was reduced to 31 features, followed by modeling with the top 10 most important features from each model. Model performance, evaluated using Balanced Accuracy, F1-Score, and Cohen’s Kappa, showed similar results, with RF and XGBoost slightly outperforming the others. However, the differences were not significant, indicating comparable predictive ability across models. The top 10 features from each model were averaged, and the five most influential features were selected. Key factors influencing women's employment status include household responsibilities, age, education, district minimum wage, and the age of the youngest child. The analysis found that 79.6% of unemployed women manage household duties, while employed women are less involved (18.9%). Age was significant, with employed women mostly in the 35-55 age range, correlating with older children and greater workforce participation. Additionally, employed women are more likely to come from regions with lower minimum wages, suggesting that economic necessity drives their labor market participation. Keywords: female labor force, machine learning, classification, West Java
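Two of the evaluation metrics named in this abstract, F1-score and Cohen's Kappa, can be computed directly from a binary confusion matrix. The sketch below uses the standard definitions. The counts are invented for illustration and are not from the study, and the function names are hypothetical.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    return 2 * tp / (2 * tp + fp + fn)


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Agreement between predictions and truth, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the matrix.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix: 40 TP, 10 FP, 20 FN, 30 TN.
f1 = f1_score(tp=40, fp=10, fn=20)
kappa = cohen_kappa(tp=40, fp=10, fn=20, tn=30)
print(f"F1 = {f1:.3f}, kappa = {kappa:.3f}")
```

With these counts the observed agreement is 0.70 and the chance agreement is 0.50, so Kappa is 0.40, noticeably lower than the raw accuracy.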
Optimizing Random Forest Parameters with Hyperparameter Tuning for Classifying School-Age KIP Eligibility in West Java
Setyowati, Silfiana Lis; Qalbi, Asyifah; Aristawidya, Rafika; Sartono, Bagus; Firdawanti, Aulia Rizki
Jambura Journal of Mathematics Vol 7, No 1: February 2025
Publisher : Department of Mathematics, Universitas Negeri Gorontalo

DOI: 10.37905/jjom.v7i1.28736

Abstract

Random Forest is an ensemble learning algorithm that combines multiple decision trees to generate a more stable and accurate classification model. This study aims to optimize Random Forest parameters for classifying school-age students' eligibility for the Kartu Indonesia Pintar (KIP) in West Java, based on economic factors. The research uses secondary data from the 2023 National Socio-Economic Survey (SUSENAS) of West Java, with a sample size of 13,044 individuals. To address class imbalance, Synthetic Minority Oversampling Technique (SMOTE) is applied. Hyperparameter tuning through grid search identifies the optimal combination of parameters, including the number of trees (ntree), random variables per split (mtry), and terminal node size (node_size). Model performance is evaluated using balanced accuracy, sensitivity, and specificity. Results indicate that the optimal parameters (mtry = 5, ntree = 674, node_size = 26) yield a balanced accuracy of 65.47%. Significant variables include PKH status, floor area of the house, source of drinking water, and building material type. The model accurately identifies students in need of educational assistance. In conclusion, optimizing Random Forest parameters improves the accuracy of KIP eligibility classification, supporting educational equity policies in West Java. These findings provide a foundation for developing more effective beneficiary selection systems for educational aid.
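Evaluasi Kinerja Model Random Forest dan LightGBM untuk Klasifikasi Status Imunisasi Hepatitis B (HB-0) pada Balita [Performance Evaluation of Random Forest and LightGBM Models for Classifying Hepatitis B (HB-0) Immunization Status in Children Under Five]
Syam, Ummul Auliyah; Irdayanti, Irdayanti; Magfirrah, Indah; Sartono, Bagus; Firdawanti, Aulia Rizki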
Euler : Jurnal Ilmiah Matematika, Sains dan Teknologi Volume 13 Issue 1 April 2025
Publisher : Universitas Negeri Gorontalo

DOI: 10.37905/euler.v13i1.29762

Abstract

Hepatitis B (HB-0) immunization in infants is an important step in preventing the transmission of hepatitis B from an early age and improving public health. This study aims to classify the HB-0 immunization status of infants in West Java Province. The methods used are the Random Forest and LightGBM algorithms. The results showed that the Random Forest model had a balanced accuracy of 0.8443, slightly higher than LightGBM (0.8357). This indicates that Random Forest performed better in classifying the HB-0 immunization status of infants in West Java Province, distinguishing between those who received and did not receive the immunization without bias toward either class. The global analysis using the Random Forest model identified the six features that contributed the most to the model’s performance: BCG immunization status, ownership of the KIA/KMS book, mother’s age, household head’s age, age at first pregnancy, and regency or city classification of residence. The SHAP analysis for the first observation showed that BCG immunization status, ownership of the KIA/KMS book, and regency or city classification of residence increased the likelihood of the infant receiving immunization. Conversely, the number of children (4), mother’s age (37 years), and household head’s age (40 years) increased the likelihood of the infant not receiving immunization. This study is expected to provide data-driven insights for the government to design more effective interventions to improve immunization coverage and child health in Indonesia while also supporting the achievement of global health targets.
Analisis Klasifikasi Kesiapan Digital Desa Menggunakan Decision Tree dan Pemetaan Spasial [Classification Analysis of Village Digital Readiness Using Decision Tree and Spatial Mapping]
Fatimah Ahmad, Hafidlotul; Firdawanti, Aulia Rizki; Agustiani, Nur
Bulletin of Computer Science Research Vol. 5 No. 5 (2025): August 2025
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/bulletincsr.v5i5.741

Abstract

Digital transformation at the village level is a strategic element in promoting equitable development and improving public service delivery. However, the level of digital readiness across regions remains uneven. This study aims to classify the digital readiness of villages in West Java Province by utilizing data from Open Data Jabar (opendata.jabarprov.go.id) related to the number of digital villages, internet access, and village development strata. A Decision Tree classification algorithm was employed to categorize regions into two readiness classes: high and low. The modeling results indicate that the number of self-reliant (mandiri) villages and the percentage of villages with internet access are the most influential variables in the classification. Although internet infrastructure is available in most areas, it does not always correspond to the level of village digitalization. Districts with high internet access but a low number of self-reliant villages are still classified as having low readiness. The model achieved an accuracy of 83%, although its performance in identifying the high readiness class was limited due to class imbalance in the dataset. Spatial visualization was also used to highlight regional disparities in digital readiness. This study provides an early contribution to digital readiness mapping of villages using a machine learning approach in Indonesia.
A Hybrid Decision Tree and K-Means Approach for Classifying Community Happiness in Bogor Regency
Anwar Fajar Rizki; Dwi Fitrianti; Sri Amaliya; Bagus Sartono; Aulia Rizki Firdawanti
Statistika Vol. 25 No. 2 (2025)
Publisher : Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Islam Bandung

DOI: 10.29313/statistika.v25i2.5590

Abstract

Happiness is one of the key indicators for measuring the quality of life in a community. This study aims to classify the level of happiness among residents of Bogor Regency using a hybrid approach that combines Decision Trees and K-means. The research procedure consisted of data preprocessing, clustering using K-Means to form preliminary groups, and further classification through a Decision Tree to interpret the determinants of happiness. The analysis revealed that the residents of Bogor Regency can be categorized into two groups: those who are fairly happy and those who are less happy. The hybrid model achieved its best performance with a balanced accuracy of 84%, an F1-Score of 37%, and a Kappa score of 28%. Socioeconomic factors, such as marital status, family status, occupation, and the number of cigarettes smoked, were identified as the primary determinants influencing happiness levels. The main contribution of this study lies in demonstrating the effectiveness of a hybrid Decision Tree–K-Means approach for happiness classification and providing interpretable insights that are directly useful for policymakers. These findings offer strategic implications for the local government to design more inclusive socioeconomic policies that aim to enhance happiness and overall well-being among the residents of Bogor Regency.
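Analisis Perbandingan Kinerja Algoritma K-Means dan K-Medoids dengan Reduksi Dimensi PCA pada Indikator Kesehatan dan Sosial [Comparative Performance Analysis of the K-Means and K-Medoids Algorithms with PCA Dimensionality Reduction on Health and Social Indicators]
Firdawanti, Aulia Rizki; Ahmad, Hafidlotul Fatimah; Agustiani, Nur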
Bulletin of Computer Science Research Vol. 5 No. 5 (2025): August 2025
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/bulletincsr.v5i5.742

Abstract

Public health in West Java faces complex challenges, including disparities in healthcare access, malnutrition, and socio-economic inequalities across districts. These conditions require data-driven analysis to identify patterns of disparity and provide evidence-based guidance for policy intervention. This study aims to cluster districts/cities in West Java based on health and social indicators using Principal Component Analysis (PCA) for dimensionality reduction, followed by K-Means and K-Medoids algorithms for clustering. Data from 27 districts/cities during 2019–2024 were analyzed after standardization. PCA extracted two principal components explaining 61.4% of the total variance. Scree plot and silhouette results indicated three optimal clusters. Comparative analysis revealed that the average silhouette score of K-Means was 0.31, while K-Medoids achieved a higher score of 0.34, suggesting more stable and robust partitioning against outliers. In 2024, Cluster 1 consisted of regions with adequate healthcare facilities and lower prevalence of underweight children; Cluster 2 grouped regions with limited health infrastructure and higher malnutrition problems, while Cluster 3 showed intermediate conditions. Therefore, K-Medoids outperformed K-Means by producing more consistent clustering across years. These findings offer practical recommendations: Cluster 2 should be prioritized for interventions such as improving primary healthcare access and nutrition programs, Cluster 1 requires maintenance of service quality, and Cluster 3 should be targeted for gradual reinforcement.
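The comparison in this abstract rests on the average silhouette score (0.31 for K-Means, 0.34 for K-Medoids). As a reminder of what that number measures, the sketch below computes the mean silhouette width for a given labelling using the standard definition with Euclidean distance. It assumes at least two clusters and no duplicate-only clusters. The toy points and labels are hypothetical and unrelated to the study's data.

```python
import math


def silhouette_mean(points, labels):
    """Average silhouette width over all points (Euclidean distance).
    Assumes at least two distinct cluster labels."""
    scores = []
    for i, p in enumerate(points):
        # Collect distances from p to every other point, grouped by cluster.
        dist_by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                dist_by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = dist_by_cluster.pop(labels[i], None)
        if not own:
            scores.append(0.0)  # singleton cluster: silhouette is 0 by convention
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in dist_by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Two well-separated pairs of points (hypothetical data).
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = silhouette_mean(toy_points, [0, 0, 1, 1])  # clusters match the pairs
bad = silhouette_mean(toy_points, [0, 1, 0, 1])   # clusters cut across the pairs
print(f"good labelling: {good:.2f}, bad labelling: {bad:.2f}")
```

Scores near 1 indicate compact, well-separated clusters, and scores near 0 or below indicate overlapping or misassigned points. On that scale, values around 0.3 point to only moderately separated clusters.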