Found 23 Documents

Evaluating Different K Values in K-Fold Cross Validation for Binary Logistic Regression to Classify Poverty
Sinaga, Julia Oriana; Fathurahman, M.; Wahyuningsih, Sri; Hayati, Memi Nor
Jurnal Varian Vol. 8 No. 2 (2025)
Publisher : Universitas Bumigora

DOI: 10.30812/varian.v8i2.4403

Abstract

Data mining is essential for decision-makers to analyze and extract insights from data efficiently. Classification is one of the data mining techniques used to organize data based on its features, helping to identify patterns and make predictions. This study evaluates Binary Logistic Regression (BLR), a type of generalized linear model that is suitable for binary outcomes, for classifying poverty depth across Indonesian regencies/cities in 2022, with a focus on the impact of different K values in K-Fold Cross Validation. The dataset includes 514 regencies/cities, with the Poverty Depth Index as the target variable, categorized into high (1) and low (0) levels, using 11 predictor variables. K-Fold Cross Validation was performed with K values of 3, 5, and 10, using accuracy and Area Under Curve (AUC) as evaluation metrics. The mean accuracy values for BLR are 75.7% for K=3, 74.3% for K=5, and 75.1% for K=10. Results show that K=3 offers the highest accuracy in classifying poverty depth in Indonesia, with the lowest standard deviation of 0.03. However, K=10 demonstrates superior discriminative ability in BLR, reflected by a higher AUC value. This study highlights the significant influence of K values in K-Fold Cross Validation on BLR performance.
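The comparison of K values described in this abstract can be illustrated with a minimal sketch. It assumes synthetic stand-in data (not the study's 514 regencies/cities) and a small gradient-descent logistic regression written for illustration; the function names `k_fold_indices`, `fit_logistic`, `predict`, and `cross_validate` are hypothetical and are not taken from the paper. Only accuracy is computed here, not AUC.

```python
import math
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split into k folds whose sizes differ by at most 1."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit binary logistic regression by batch gradient descent (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict(w, xi):
    """Classify as 1 when the linear predictor is non-negative (p >= 0.5)."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 if z >= 0 else 0


def cross_validate(X, y, k, seed=0):
    """Return the per-fold accuracies of k-fold cross validation."""
    folds = k_fold_indices(len(X), k, seed)
    accs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        w = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(w, X[i]) == y[i] for i in test_idx)
        accs.append(correct / len(test_idx))
    return accs


# Synthetic stand-in data: 120 rows, 2 predictors (an assumption for this sketch).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(120)]
y = [1 if x1 + 0.5 * x2 + rng.gauss(0, 0.5) > 0 else 0 for x1, x2 in X]

for k in (3, 5, 10):
    accs = cross_validate(X, y, k)
    print(f"K={k}: mean accuracy={statistics.mean(accs):.3f}, "
          f"sd={statistics.stdev(accs):.3f}")
```

Reporting the standard deviation next to the mean matters because a smaller K trains on less data per fold while a larger K evaluates on smaller held-out sets, so the two summaries can rank the K values differently.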
Exploring the Crime Problem from a Statistical Point of View Using Negative Binomial Regression (Mengeksplorasi Masalah Kejahatan dari POV Statistik dengan Regresi Binomial Negatif)
Dani, Andrea Tri Rian; Fathurahman, M.; Ni'matuzzahroh, Ludia; Putri Permata, Regita; Putra, Fachrian Bimantoro
Jurnal Varian Vol. 8 No. 2 (2025)
Publisher : Universitas Bumigora

DOI: 10.30812/varian.v8i2.4445

Abstract

Criminality is a complex issue in Indonesia that is very important to the government, law enforcement agencies, and society. The underlying causes of Indonesia's crime problem are complex and shaped by various circumstances. The aim of this research is to model the crime problem in Indonesia and determine the influencing factors. The method used in this research is Negative Binomial Regression. The results of the study show that the negative binomial regression model can be used to model criminal problems because the variance is greater than the mean, that is, the data are overdispersed. Based on the parameter significance test results, both simultaneously and partially, the open unemployment rate, Gini ratio, average years of schooling, and prevalence of inadequate food consumption significantly affect the crime rate, with an Akaike’s Information Criterion Corrected (AICc) value of 698,098. These findings suggest that addressing economic inequality, unemployment, education, and food security could help reduce crime in Indonesia. Policies aimed at improving job opportunities, reducing income disparity, and enhancing education and food security are crucial in mitigating crime. This study provides valuable insights for policymakers and law enforcement agencies, offering a foundation for more targeted and effective crime prevention strategies. Future research could employ the robust Poisson Inverse Gaussian Regression method to address the overdispersion problem.
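The reason for choosing negative binomial regression in this abstract is that the variance of the counts exceeds their mean. A minimal sketch of that check follows, assuming made-up counts rather than the study's crime data. The helper `dispersion_summary` is hypothetical, and the method-of-moments estimate of the dispersion parameter assumes the common NB2 variance function Var(Y) = mu + alpha * mu^2, which the abstract does not specify.

```python
import statistics


def dispersion_summary(counts):
    """Compare the sample mean and variance of count data.

    Under a Poisson model the variance equals the mean, so a
    variance-to-mean ratio well above 1 suggests overdispersion.
    """
    mean = statistics.mean(counts)
    var = statistics.variance(counts)  # sample variance (divides by n - 1)
    # Method-of-moments estimate of alpha under the assumed NB2 form
    # Var(Y) = mu + alpha * mu**2; clipped at 0 when there is no overdispersion.
    alpha = max((var - mean) / mean ** 2, 0.0)
    return {"mean": mean, "variance": var, "ratio": var / mean, "alpha": alpha}


# Made-up counts for illustration only (not the study's data).
counts = [2, 0, 7, 1, 15, 3, 0, 9, 4, 22]
summary = dispersion_summary(counts)
print(summary)
```

A ratio close to 1 would point to a Poisson model being adequate, whereas a ratio well above 1 motivates a model with an extra dispersion parameter such as the negative binomial.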
INVERSE GAUSSIAN REGRESSION MODELING AND ITS APPLICATION IN NEONATAL MORTALITY CASES IN INDONESIA
Fathurahman, M.
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 16 No 4 (2022): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol16iss4pp1197-1206

Abstract

Inverse Gaussian Regression (IGR) is a suitable model for positively skewed response data that follow the inverse Gaussian distribution. The IGR model is formed from the Generalized Linear Models (GLM). This study aims to apply the IGR model to the factors influencing infant mortality cases across provinces in Indonesia. The IGR model parameters were estimated by Maximum Likelihood Estimation (MLE) with the Fisher scoring method. The Likelihood Ratio Test (LRT) and Wald test were used to test parameter significance. The IGR model was applied to the infant mortality cases of provinces in Indonesia in 2020. The data for modeling infant mortality cases using IGR were obtained from the Ministry of Health of the Republic of Indonesia and the Central Bureau of Statistics. The results show that the factors influencing the infant mortality cases of provinces in Indonesia based on the IGR model were: the percentage of pregnant women who received blood-boosting tablets, the percentage of low birth weight, the percentage of complete neonatal visits (KN3), the percentage of toddlers who received early initiation of breastfeeding, the percentage of toddlers who are exclusively breastfed, the percentage of toddlers who received complete primary immunization, the percentage of households with access to adequate drinking water, and the percentage of households with access to appropriate sanitation.
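As a rough illustration of maximum likelihood estimation for the inverse Gaussian distribution, the sketch below handles only the intercept-only case (no predictors), where the estimates have a closed form and no Fisher scoring iterations are needed. It is not the paper's regression model. The mean/shape parameterization (`mu`, `lam`), the function names, and the sample values are assumptions made for this example.

```python
import math


def ig_loglik(y, mu, lam):
    """Log-likelihood of an i.i.d. inverse Gaussian sample.

    Density: sqrt(lam / (2*pi*y**3)) * exp(-lam * (y - mu)**2 / (2 * mu**2 * y)),
    with mean mu > 0 and shape lam > 0.
    """
    return sum(
        0.5 * math.log(lam)
        - 0.5 * math.log(2.0 * math.pi)
        - 1.5 * math.log(yi)
        - lam * (yi - mu) ** 2 / (2.0 * mu ** 2 * yi)
        for yi in y
    )


def ig_mle(y):
    """Closed-form MLE for the intercept-only case.

    mu_hat is the sample mean and 1 / lam_hat is the average of
    (1 / y_i - 1 / mu_hat). Requires positive, non-constant data.
    """
    n = len(y)
    mu_hat = sum(y) / n
    lam_hat = n / sum(1.0 / yi - 1.0 / mu_hat for yi in y)
    return mu_hat, lam_hat


# Made-up positive, right-skewed values for illustration only.
y = [1.2, 0.8, 2.5, 1.9, 0.6, 3.1]
mu_hat, lam_hat = ig_mle(y)
print(f"mu_hat={mu_hat:.4f}, lam_hat={lam_hat:.4f}, "
      f"loglik={ig_loglik(y, mu_hat, lam_hat):.4f}")
```

With predictors, the mean would depend on a linear predictor through a link function and the coefficients would generally be found iteratively, for example by Fisher scoring as the abstract describes.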