Premarital Sex Behavior Model with Lasso Generalized Linear Mixed Model and Group Lasso Generalized Linear Mixed Model
Khalilah Nurfadilah; Asfar; Khairil A. Notodiputro; Bagus Sartono; Azlam Nas
Statistika Vol. 23 No. 1 (2023): Statistika
Publisher : Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Islam Bandung

DOI: 10.29313/statistika.v23i1.1953

Abstract

Premarital sexual behavior is sexual behavior between men and women outside legal marriage. As the incidence of premarital sex increases, efforts to address it are needed. One such effort is to identify, with a regression model, the main factors that reduce or increase premarital sex behavior. In the context of sexual behavior, environmental influences cannot be ignored. A generalized linear mixed model (GLMM) is used to model data that are grouped, and it includes the environmental effect, modeled as a mixed effect in the GLMM. For parsimony, the LASSO method can perform variable selection. This research uses GLMM LASSO and GLMM Group LASSO to model the data. The model that best describes premarital sex behavior in South Sulawesi is the GLMM Group LASSO model, based on the highest AUC value. The variables that significantly influence the model are Type of Residence (X_1), Education Level (X_2), Literacy (X_3), Internet Use (X_4), Knowledge of Contraceptive Methods (X_6), Health Insurance Ownership (X_7), Employment Status (X_8), and Knowledge of Sexually Transmitted Diseases (X_9). Knowing the factors that influence premarital sex behavior, the government is expected to take appropriate action to address it.
GLMM and GLMMTree for Modelling Poverty in Indonesia
Suseno Bayu; Khairil Anwar Notodiputro; Bagus Sartono
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.333

Abstract

GLMMTree is a tree-based algorithm that can detect interactions and find subgroups in the GLMM to improve fixed-effect estimation. This study applies GLMMTree to actual poverty data from Indonesia and confirms that the GLMMTree algorithm has better precision than GLMM. The significant predictors of poverty in Indonesia are the unemployment rate and the GRDP at constant prices. The GLMMTree algorithm enriches the analysis by finding subgroups of provinces based on the electricity lighting access and clean drinking water source variables.
Cost-Sensitive Boosting Algorithm for Classifying Underdeveloped Regions in Indonesia
Bayu Suseno; Bagus Sartono; Khairil Anwar Notodiputro
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.373

Abstract

Class imbalance occurs when some classes have many more instances than others. The cost-sensitive boosting algorithm is a modification of the AdaBoost algorithm that aims to solve the problem of imbalanced classes. In this study, we evaluate the cost-sensitive boosting algorithm AdaC2 using data on Indonesia's underdeveloped regions. This study confirms that the cost-sensitive boosting algorithm (AdaC2) classifies instances in the minority classes better than standard classification algorithms.
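The abstract does not spell out how AdaC2 modifies AdaBoost. As a rough illustration only, the sketch below follows the commonly described AdaC2 scheme, in which a per-instance misclassification cost multiplies the weight outside the exponent and also enters the learner weight alpha. The function name `adac2_round`, the labels in {-1, +1}, and the toy costs are assumptions made for this example, not details taken from the paper.

```python
import math

def adac2_round(weights, costs, y_true, y_pred):
    """One boosting round in the style of AdaC2 (illustrative sketch).

    weights: current instance weights (sum to 1)
    costs:   per-instance misclassification costs (higher for the minority class)
    y_true, y_pred: labels and weak-learner predictions, each in {-1, +1}
    Returns (alpha, new_weights).
    """
    # Cost-weighted mass of correctly and incorrectly classified instances.
    correct = sum(c * w for c, w, t, p in zip(costs, weights, y_true, y_pred) if t == p)
    wrong = sum(c * w for c, w, t, p in zip(costs, weights, y_true, y_pred) if t != p)
    if correct == 0 or wrong == 0:
        raise ValueError("alpha is undefined when the learner is perfect or always wrong")

    # Learner weight: costs enter the ratio, unlike plain AdaBoost.
    alpha = 0.5 * math.log(correct / wrong)

    # Weight update: the cost multiplies the weight outside the exponent.
    unnormalized = [
        c * w * math.exp(-alpha * t * p)
        for c, w, t, p in zip(costs, weights, y_true, y_pred)
    ]
    total = sum(unnormalized)
    return alpha, [u / total for u in unnormalized]

# Toy data (hypothetical): two minority instances (+1) with cost 2,
# two majority instances (-1) with cost 1; the learner misses one minority instance.
alpha, new_weights = adac2_round(
    weights=[0.25, 0.25, 0.25, 0.25],
    costs=[2.0, 2.0, 1.0, 1.0],
    y_true=[1, 1, -1, -1],
    y_pred=[1, -1, -1, -1],
)
print(alpha, new_weights)
```

With all costs set to 1 the update reduces to the standard AdaBoost reweighting, which is one way to see where the cost sensitivity comes from.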
A Comparative Study of Random Forest and Double Random Forest Models from View Points of Their Interpretability
Adlina Khairunnisa; Khairil Anwar Notodiputro; Bagus Sartono
Scientific Journal of Informatics Vol 11, No 1 (2024): February 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v11i1.48721

Abstract

Purpose: This study compares the performance of ensemble tree models, namely Random Forest (RF) and Double Random Forest (DRF), from the viewpoint of model interpretability. Both models have strong predictive performance, but their inner workings are not human-understandable. Model interpretability is required to explain the relationship between the predictors and the response. We apply association rules to simplify the essence of the models.
Methods: This study compares the interpretability of RF and DRF using association rules. Each decision tree in each model is converted into if-then rules by following the paths from the root node to the leaf nodes. The data were selected so that they were underfit data, because other researchers have shown that DRF overcomes the underfitting problem faced by RF. A simulation study was conducted to evaluate the rules extracted from RF and DRF. The rules extracted from both models are compared in terms of model interpretability based on support and confidence values. Association rules may also be applied to identify the characteristics of poor people who are working in Yogyakarta.
Result: The simulation results revealed that the interpretability of DRF outperformed that of RF, especially when modelling underfit data. In addition, using empirical data, we were able to characterize the profile of poor people who are working in Yogyakarta based on the most frequent rules.
Novelty: Research on interpretable DRF is still rare, especially interpretation using association rules. Previous studies focused only on interpreting the random forest model using association rules. In this study, the rules extracted from the random forest and double random forest models are compared based on the quality of the extracted rules.
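To make the support and confidence comparison concrete, here is a minimal sketch that scores one if-then rule of the kind read off a root-to-leaf path. It assumes one common definition (support as the share of all records matching both the conditions and the predicted class, confidence as the share of condition-matching records that have the predicted class); the paper may define these differently. The function name, the feature names `x1` and `x2`, the thresholds, and the toy records are all hypothetical.

```python
def support_confidence(records, antecedent, predicted_class):
    """Score a rule "IF antecedent THEN predicted_class" (illustrative sketch).

    records:    list of (features_dict, label) pairs
    antecedent: function mapping a features dict to True/False
    Returns (support, confidence).
    """
    n = len(records)
    # Labels of the records that satisfy the rule's conditions.
    covered = [label for features, label in records if antecedent(features)]
    # Covered records whose label agrees with the rule's prediction.
    hits = sum(1 for label in covered if label == predicted_class)
    support = hits / n if n else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence

# Hypothetical rule from one tree path: IF x1 <= 0.5 AND x2 > 0.3 THEN class "A".
def rule(features):
    return features["x1"] <= 0.5 and features["x2"] > 0.3

toy_records = [
    ({"x1": 0.2, "x2": 1.0}, "A"),
    ({"x1": 0.4, "x2": 0.5}, "A"),
    ({"x1": 0.3, "x2": 0.9}, "B"),
    ({"x1": 0.8, "x2": 0.7}, "A"),
    ({"x1": 0.1, "x2": 0.1}, "B"),
]
support, confidence = support_confidence(toy_records, rule, "A")
print(support, confidence)
```

Scoring every extracted rule this way gives two comparable numbers per rule, which is what allows rule sets from two different forests to be ranked against each other.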
Classification of household poverty in West Java using the generalized mixed-effects trees model
FARDILLA RAHMAWATI; KHAIRIL ANWAR NOTODIPUTRO; KUSMAN SADIK
Jurnal Natural Volume 23 Number 3, October 2023
Publisher : Universitas Syiah Kuala

DOI: 10.24815/jn.v23i3.33079

Abstract

Fixed effects and random effects can be handled by combining statistical modeling and machine learning techniques. This paper discusses the modeling of fixed effects and random effects using a statistical machine-learning approach. We used generalized mixed-effects trees (GMET), a tree-based mixed-effects model for response variables that belong to the exponential family of distributions. In this study, we applied the GMET method to both simulated and empirical data to identify the data conditions under which the approach is appropriate. The simulated data were generated under different response-variable generation scenarios, as well as different values of the random-effect variance and the fixed-effect coefficients. The findings indicated that GMET performs similarly across the response-variable generation scenarios. However, it performed better when the fixed-effect value and the variance of the random effects were large. When applied to the empirical data, the GMET method describes fixed effects and random effects and classifies household poverty status quite well based on the area under the curve (AUC) value. It also revealed that the important variables for poverty classification are the number of household members, land ownership, the type of main fuel used for cooking, and the main source of drinking water. To address the socioeconomic disparity that leads to poverty, the government may need to attend to these factors. In addition, the use of regional typology as a random effect in the model contributed to the variation in household poverty status. Based on this research, the fixed effects in mixed models do not need to be linear, and GMET may be employed on grouped data structures, which allows the GMET technique to compete with other methods.
BETA-BINOMIAL MODEL IN SMALL AREA ESTIMATION USING HIERARCHICAL LIKELIHOOD APPROACH
Etis Sunandi; Khairil Anwar Notodiputro; Indahwati Indahwati; Agus Mohamad Soleh
MEDIA STATISTIKA Vol 16, No 1 (2023): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.16.1.88-99

Abstract

Small Area Estimation is a statistical method used to estimate parameters in sub-populations with small or even zero sample sizes. This research aims to evaluate the performance of the Beta-Binomial model for small area estimation at the area level. The estimation method used is Hierarchical Likelihood (HL). The data used are simulated data and empirical data. Simulation studies were used to investigate the proposed model. The estimated Mean Squared Error of Prediction (MSEP) and the Absolute Bias (AB) serve as the criteria for judging the estimator. The empirical study used data on the illiteracy rate at the sub-district level in Bengkulu Province. The results of the simulation study show that, in general, the parameter estimators are nearly unbiased. The proportion predictions show the same tendency as the parameters. Finally, the HL estimator has a small estimated MSEP. The results of the empirical study show that the average illiteracy rate in Bengkulu Province is quite diverse. Kepahiang District had the highest average illiteracy rate in Bengkulu Province in 2021.
The Effect of Chlorella vulgaris Ointment on Incision Wound Healing in Mice (Mus musculus albinus) (original title: Pengaruh Pemberian Salep Chlorella vulgaris Terhadap Penyembuhan Luka Sayatan pada Mencit (Mus musculus albinus))
Sri Wahyuni; Khairil Anwar Notodiputro; Sachnaz Desta Oktarina; Laily Nissa Atul Mualifah
Jurnal Veteriner dan Biomedis Vol. 2 No. 1 (2024): March
Publisher : Sekolah Kedokteran Hewan dan Biomedis

DOI: 10.29244/jvetbiomed.2.1.16-21

Abstract

This study aims to determine the effect of Chlorella vulgaris ointment on the healing of incision wounds in mice (Mus musculus albinus), based on the time required for the wound to heal and on changes in wound morphology compared with controls. The study used a completely randomized design (CRD) with 25 mice as test animals, divided into 5 groups: 3 treatment groups (5%, 10%, and 15% C. vulgaris ointment) and 2 control groups (placebo and normal healing). The mice were wounded with a scalpel blade, making a 1 cm incision down to the fascia. The wounds were treated with C. vulgaris ointment twice daily and observed every day from day 1 to day 14. All quantitative data were tested statistically using ANOVA, and qualitative data are presented descriptively. The results showed a significant difference among the 5 groups (P>0.05). There were differences between the treatment groups (5%, 10%, and 15% C. vulgaris ointment) and the control groups. In conclusion, C. vulgaris ointment affected the healing of incision wounds in mice (M. m. albinus) compared with the control groups, and the ointment containing 10% C. vulgaris extract was the best at healing wounds quickly.
Effectiveness of GPCA in Reducing Data Dimensions and its Application to Human Development Dimension Indicators Data
Zubedi, Fahrezal; Sumertajaya, I Made; Notodiputro, Khairil Anwar; Syafitri, Utami Dyah
Inferensi Vol 7, No 3 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i3.21506

Abstract

Analyzing human development growth at the regency/city level is challenging because the data are high-dimensional, the indicators are correlated, and the regencies/cities are correlated. In this study, we propose Generalized Principal Component Analysis (GPCA) to analyze human development growth by reducing the dimensions of both the regencies/cities and the indicators. Human development growth at the regency/city level is then analyzed by displaying the GPCA results in a biplot that describes each regency/city and its indicators. This study aims to evaluate GPCA, through simulation and an empirical study, in reducing the dimensionality of data whose observations and indicators are both correlated, and to analyze the growth of human development at the regency/city level based on the GPCA-Biplot results. This research shows that GPCA works well in reducing the dimensions of data with correlated observations and correlated variables. Based on the GPCA-Biplot visualization, the growth of human development in the Nduga regency from 2019 to 2022 showed significant fluctuations. Although some indicators show progress, especially in 2021, significant challenges remain. The growth of human development in each regency/city can be analyzed in the same way, so that government policy can focus on the real problems in the field.
APPLICATION OF LASSO AND GROUP LASSO ANALYSIS IN IDENTIFYING FACTORS ASSOCIATED WITH TUBERCULOSIS IN WEST JAVA (original title: PENERAPAN ANALISIS LASSO DAN GROUP LASSO DALAM MENGIDENTIFIKASI FAKTOR-FAKTOR YANG BERHUBUNGAN DENGAN TUBERKULOSIS DI JAWA BARAT)
Stephan Chen; Khairil Anwar Notodiputro; Septian Rahardiantoro
Indonesian Journal of Statistics and Applications Vol 4 No 1 (2020)
Publisher : Departemen Statistika, IPB University, with Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v4i1.510

Abstract

Tuberculosis is the deadliest infectious disease in Indonesia, and West Java is the province with the largest number of tuberculosis cases in Indonesia. This research was conducted to identify variables and groups of variables that could explain the number of tuberculosis cases in West Java. The data have many explanatory variables, and these variables form groups. Both LASSO and group LASSO analysis can be used for variable selection and can handle data with many explanatory variables; group LASSO analysis can additionally be used on data with grouped variables. According to the LASSO analysis, the variables that can explain the number of tuberculosis cases in West Java are the number of people with disabilities, the number of pharmacy staff, the number of malnourished people, the number of people working, and the number of cities. According to the group LASSO analysis, the variables that can explain the number of tuberculosis cases in West Java are those in the health and environmental groups. The government can focus on these factors if it wants to reduce the number of tuberculosis cases in West Java.
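The difference between selecting individual variables and selecting whole groups can be illustrated with the closed-form shrinkage steps that apply in the textbook special case of orthonormal predictors. This is a sketch under that assumption, not the fitting procedure used in the study: LASSO soft-thresholds each least-squares coefficient separately, while group LASSO shrinks each group's coefficient vector as a unit and can set the whole group to zero. The function names, the penalty value, and the toy coefficients are hypothetical.

```python
import math

def soft_threshold(z, lam):
    """LASSO update for one coefficient (orthonormal design assumed):
    shrink z toward zero by lam, and set it to zero if |z| <= lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def group_soft_threshold(zs, lam):
    """Group LASSO update for one group (orthonormal design assumed):
    scale the whole coefficient vector, or zero it out entirely.
    The penalty is weighted by sqrt(group size), a common convention."""
    norm = math.sqrt(sum(z * z for z in zs))
    if norm == 0.0:
        return [0.0 for _ in zs]
    scale = max(0.0, 1.0 - lam * math.sqrt(len(zs)) / norm)
    return [scale * z for z in zs]

# Hypothetical least-squares coefficients, arranged in two groups of variables.
groups = {"group_1": [3.0, 4.0], "group_2": [0.1, 0.1]}
lam = 1.0

lasso_fit = {g: [soft_threshold(z, lam) for z in zs] for g, zs in groups.items()}
group_lasso_fit = {g: group_soft_threshold(zs, lam) for g, zs in groups.items()}
print(lasso_fit)
print(group_lasso_fit)
```

In this toy case both methods drop the second group, but group LASSO keeps or drops the variables of a group together, which matches the abstract's reporting of selected groups rather than selected individual variables.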
A REPEATED CROSS-SECTIONAL MODEL FOR ANALYZING UNEMPLOYMENT DATA IN BOGOR
Ulfah Sulistyowati; Khairil Anwar Notodiputro; I Made Sumertajaya
Indonesian Journal of Statistics and Applications Vol 4 No 2 (2020)
Publisher : Departemen Statistika, IPB University, with Forum Perguruan Tinggi Statistika (FORSTAT)

DOI: 10.29244/ijsa.v4i2.513

Abstract

In general, the data encountered in statistical problems are panel data or cross-sectional data. Under certain conditions, the data take the form of a combination of panel data and cross-sectional data, commonly referred to as repeated cross-sectional data. Repeated cross-sectional data are often used in research with individual observations. In this study, a repeated cross-sectional analysis was carried out using a fixed-effects model, with areas (villages) in Bogor, West Java, as the observations, to analyze unemployment factors. The results show that ongoing village development affects the unemployment rate in Bogor.