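PERBANDINGAN ANALISIS REGRESI LOGISTIK BINER DAN NAÏVE BAYES CLASSIFIER UNTUK MEMPREDIKSI FAKTOR RESIKO DIABETES (Comparison of Binary Logistic Regression Analysis and Naïve Bayes Classifier for Predicting Diabetes Risk Factors)
Aristawidya, Rafika; Indahwati, Indahwati; Erfiani, Erfiani; Fitrianto, Anwar; A. A., Muftih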
Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika Vol. 5 No. 2 (2024)
Publisher : LPPM Universitas Bina Bangsa

DOI: 10.46306/lb.v5i2.617

Abstract

Diabetes is a global health problem that is increasing in prevalence worldwide. This study compares the performance of two data analysis methods, binary logistic regression and the naïve Bayes classifier, in predicting diabetes risk. It aims to identify factors that significantly affect diabetes risk, classify diabetes risk using binary logistic regression, and then compare that classification with the naïve Bayes classifier. Binary logistic regression models the relationship between independent predictor variables and a binary dependent variable, while the naïve Bayes classifier assumes independence between the predictor variables. Both methods were evaluated on accuracy, sensitivity, specificity, and positive predictive value. The results show that the factors that influence the risk of diabetes are Age, Gender, Polyuria, Polydipsia, Genital thrush, Itching, Irritability, and Partial paresis. Furthermore, binary logistic regression achieved a higher classification accuracy (92.31%) than the naïve Bayes classifier (84.61%). Therefore, binary logistic regression was identified as the best method to predict diabetes risk in the context of this study.
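
The four evaluation measures named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of those definitions; the function name and the counts passed to it are hypothetical illustrations and are not taken from the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives.
    """
    total = tp + fp + tn + fn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual positives that were detected (true positive rate).
        "sensitivity": tp / (tp + fn),
        # Share of actual negatives that were correctly rejected (true negative rate).
        "specificity": tn / (tn + fp),
        # Share of predicted positives that are actually positive.
        "ppv": tp / (tp + fp),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Comparing two classifiers on the same test set then amounts to computing this dictionary once per model and comparing the entries side by side.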
Comparison of Random Forest, XGBoost, and LightGBM Methods for the Human Development Index Classification
Indah, Yunna Mentari; Aristawidya, Rafika; Fitrianto, Anwar; Erfiani, Erfiani; Jumansyah, L.M. Risman Dwi
Jambura Journal of Mathematics Vol 7, No 1: February 2025
Publisher : Department of Mathematics, Universitas Negeri Gorontalo

DOI: 10.37905/jjom.v7i1.28290

Abstract

Machine learning classification is an effective tool for categorizing data based on patterns, which is particularly useful in analyzing the Human Development Index (HDI) in Indonesia. HDI serves as a key indicator of regional development progress, making it crucial to classify HDI categories at the regency/city level to support targeted development planning. This study aims to compare the performance of three ensemble-based classification methods—Random Forest, XGBoost, and LightGBM—in classifying HDI categories in Indonesia. Data from the Central Bureau of Statistics (BPS) in 2023, comprising 514 observations across nine variables, was used for analysis. The study applied these algorithms to analyze the most influential variables affecting HDI. The results show that LightGBM outperformed both Random Forest and XGBoost, achieving an accuracy of 0.937 without outlier handling and 0.944 with outlier handling. Additionally, per capita expenditure was identified as the most influential factor in predicting HDI. These findings contribute to the field of statistical modeling by demonstrating how ensemble methods can improve classification accuracy and provide valuable insights for data-driven policymaking, thus enhancing regional development planning and supporting future HDI-related research.
Optimizing Random Forest Parameters with Hyperparameter Tuning for Classifying School-Age KIP Eligibility in West Java
Setyowati, Silfiana Lis; Qalbi, Asyifah; Aristawidya, Rafika; Sartono, Bagus; Firdawanti, Aulia Rizki
Jambura Journal of Mathematics Vol 7, No 1: February 2025
Publisher : Department of Mathematics, Universitas Negeri Gorontalo

DOI: 10.37905/jjom.v7i1.28736

Abstract

Random Forest is an ensemble learning algorithm that combines multiple decision trees to generate a more stable and accurate classification model. This study aims to optimize Random Forest parameters for classifying school-age students' eligibility for the Kartu Indonesia Pintar (KIP) in West Java, based on economic factors. The research uses secondary data from the 2023 National Socio-Economic Survey (SUSENAS) of West Java, with a sample size of 13,044 individuals. To address class imbalance, Synthetic Minority Oversampling Technique (SMOTE) is applied. Hyperparameter tuning through grid search identifies the optimal combination of parameters, including the number of trees (ntree), random variables per split (mtry), and terminal node size (node_size). Model performance is evaluated using balanced accuracy, sensitivity, and specificity. Results indicate that the optimal parameters (mtry = 5, ntree = 674, node_size = 26) yield a balanced accuracy of 65.47%. Significant variables include PKH status, floor area of the house, source of drinking water, and building material type. The model accurately identifies students in need of educational assistance. In conclusion, optimizing Random Forest parameters improves the accuracy of KIP eligibility classification, supporting educational equity policies in West Java. These findings provide a foundation for developing more effective beneficiary selection systems for educational aid.
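
The tuning procedure described in this abstract, an exhaustive grid search over ntree, mtry, and node_size scored by balanced accuracy, can be sketched generically as follows. This is an illustration under stated assumptions, not the authors' code: `grid_search`, `toy_score`, and the grid values are hypothetical, and `toy_score` is a placeholder standing in for fitting a Random Forest and measuring its balanced accuracy on validation data.

```python
from itertools import product


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity; less distorted by class imbalance than accuracy."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def grid_search(grid, score_fn):
    """Evaluate score_fn on every parameter combination and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(ntree, mtry, node_size):
    # Hypothetical placeholder: a real score_fn would train a model with these
    # settings and return its balanced accuracy on held-out data.
    return -abs(ntree - 500) / 1000 - abs(mtry - 4) / 10 - abs(node_size - 20) / 100


# Hypothetical grid; the study's actual search ranges are not given in the abstract.
grid = {"ntree": [100, 500, 900], "mtry": [2, 4, 6], "node_size": [10, 20, 30]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params)
```

Because grid search tries every combination, its cost grows multiplicatively with the number of values per parameter, which is why the grid here is kept small.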