Premarital Sex Behavior Model with Lasso Generalized Linear Mixed Model and Group Lasso Generalized Linear Mixed Model
Khalilah Nurfadilah; Asfar; Khairil A. Notodiputro; Bagus Sartono; Azlam Nas
Statistika Vol. 23 No. 1 (2023): Statistika
Publisher : Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Islam Bandung

DOI: 10.29313/statistika.v23i1.1953

Abstract

Premarital sexual behavior is sexual behavior between men and women outside legal marriage. As the incidence of premarital sex rises, efforts to address it are needed. One such effort is to identify, with a regression model, the main factors associated with lower or higher levels of premarital sex behavior. In the context of sexual behavior, environmental influences cannot be ignored. A generalized linear mixed model (GLMM) is used for data that are grouped into clusters, with the environmental effect included in the mixed model. For parsimony, the LASSO method can perform variable selection. This research fits GLMM LASSO and GLMM Group LASSO models to the data. Based on the highest AUC value, the GLMM Group LASSO is the best model for describing premarital sex behavior in South Sulawesi. The variables that significantly influence the model are Type of Residence (X_1), Education Level (X_2), Literacy (X_3), Internet Use (X_4), Knowledge of Contraceptive Methods (X_6), Health Insurance Ownership (X_7), Employment Status (X_8), and Knowledge of Sexually Transmitted Diseases (X_9). By knowing the factors that influence premarital sex behavior, the government is expected to take appropriate action to address it.
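For orientation, the two penalties named in the title are commonly written as below. This is a generic sketch rather than the authors' exact formulation: $\ell(\beta)$ stands for the (approximate) GLMM log-likelihood in the fixed effects, and $\lambda$, the group index $g$, and the group size $p_g$ are assumed notation that does not appear in the abstract.

```latex
% Lasso-penalized GLMM: each coefficient is shrunk individually
\hat{\beta}_{\mathrm{lasso}}
  = \arg\max_{\beta}\; \ell(\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert

% Group lasso: coefficients enter or leave the model group by group,
% e.g. all dummy variables of one categorical predictor together
\hat{\beta}_{\mathrm{group}}
  = \arg\max_{\beta}\; \ell(\beta) - \lambda \sum_{g=1}^{G} \sqrt{p_g}\, \lVert \beta_{(g)} \rVert_2
```

The group form is a natural fit when predictors such as education level are categorical, since all levels of one predictor are then selected or dropped together.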
GLMM and GLMMTree for Modelling Poverty in Indonesia
Suseno Bayu; Khairil Anwar Notodiputro; Bagus Sartono
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.333

Abstract

GLMMTree is a tree-based algorithm that can detect interactions and find subgroups in the GLMM to improve fixed-effect estimation. This study applies GLMMTree to actual data on poverty in Indonesia and confirms that the GLMMTree algorithm has better precision than GLMM. The significant predictors of poverty in Indonesia are the unemployment rate and the GRDP at constant prices. The GLMMTree algorithm enriches the analysis by finding subgroups of provinces defined by the electricity lighting access and clean drinking water source variables.
Cost-Sensitive Boosting Algorithm for Classifying Underdeveloped Regions in Indonesia
Bayu Suseno; Bagus Sartono; Khairil Anwar Notodiputro
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.373

Abstract

Classes are imbalanced when some classes have many more instances than others. The cost-sensitive boosting algorithm is a modification of the AdaBoost algorithm that aims to solve the problem of imbalanced classes. In this study, we evaluate the cost-sensitive boosting algorithm AdaC2 using data on Indonesia's underdeveloped regions. The study confirms that the cost-sensitive boosting algorithm (AdaC2) classifies instances in the minority classes better than standard classification algorithms.
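To make the idea concrete, here is a minimal sketch of one AdaC2 boosting round as the update is commonly stated in the cost-sensitive boosting literature: the misclassification cost multiplies each sample weight outside the exponent. The function name, the toy labels, and the cost values are hypothetical, and the abstract does not say which costs the authors used.

```python
import math

def adac2_round(weights, costs, labels, preds):
    """One AdaC2-style round; labels and preds take values in {-1, +1}.

    Assumes at least one correct and one incorrect prediction,
    so that the logarithm below is defined.
    """
    correct = sum(c * w for w, c, y, h in zip(weights, costs, labels, preds) if y == h)
    wrong = sum(c * w for w, c, y, h in zip(weights, costs, labels, preds) if y != h)
    # Learner weight: cost-weighted correct mass over cost-weighted wrong mass.
    alpha = 0.5 * math.log(correct / wrong)
    # The cost multiplies the usual AdaBoost exponential update.
    unnorm = [c * w * math.exp(-alpha * y * h)
              for w, c, y, h in zip(weights, costs, labels, preds)]
    total = sum(unnorm)
    return alpha, [u / total for u in unnorm]

# Hypothetical toy data: four majority (-1) and two minority (+1) instances,
# with a higher cost assigned to the minority class.
labels = [-1, -1, -1, -1, 1, 1]
preds = [-1, -1, -1, 1, 1, -1]
costs = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0]
weights = [1 / 6] * 6

alpha, new_weights = adac2_round(weights, costs, labels, preds)
```

After the round, the misclassified minority instance carries more weight than the misclassified majority instance, which is how the next weak learner is steered toward the minority class.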
Classification of household poverty in West Java using the generalized mixed-effects trees model
Fardilla Rahmawati; Khairil Anwar Notodiputro; Kusman Sadik
Jurnal Natural Volume 23 Number 3, October 2023
Publisher : Universitas Syiah Kuala

DOI: 10.24815/jn.v23i3.33079

Abstract

Fixed effects and random effects can be handled by combining statistical modeling and machine learning techniques. This paper discusses the modeling of fixed effects and random effects using a statistical machine-learning approach. We used generalized mixed-effects trees (GMET), a tree-based mixed-effects model for response variables that belong to the exponential family of distributions. The GMET method was applied to both simulated and empirical data to identify the data conditions under which the approach is appropriate. The simulated data were generated under different response variable distributions, as well as different values of the random-effect variance and the fixed-effect coefficients. The findings indicated that GMET performs similarly across the response variable generation scenarios. However, it performed better when the fixed-effect values and the variance of the random effects were large. When applied to the empirical data, the GMET method describes fixed effects and random effects and classifies household poverty status quite well based on the area under the curve (AUC) value. It also revealed that the important variables for poverty classification are the number of household members, land ownership, the type of main fuel used for cooking, and the main source of drinking water. To address the socioeconomic disparity that leads to poverty, the government may wish to attend to these factors. In addition, the use of regional typology as a random effect in the model contributed to the variation in household poverty status. This research shows that the fixed effects in mixed models do not need to be linear and that GMET can be used with grouped data structures, which makes the GMET technique competitive with other methods.
BETA-BINOMIAL MODEL IN SMALL AREA ESTIMATION USING HIERARCHICAL LIKELIHOOD APPROACH
Etis Sunandi; Khairil Anwar Notodiputro; Indahwati Indahwati; Agus Mohamad Soleh
MEDIA STATISTIKA Vol 16, No 1 (2023): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.16.1.88-99

Abstract

Small Area Estimation is a statistical method used to estimate parameters in sub-populations with small or even zero sample sizes. This research aims to evaluate the performance of the Beta-Binomial model for small area estimation at the area level. The estimation method used is Hierarchical Likelihood (HL). Both simulated and empirical data are used. A simulation study investigates the proposed model, with estimators assessed by their Mean Squared Error of Prediction (MSEP) and Absolute Bias (AB). The empirical study uses data on the illiteracy rate at the sub-district level in Bengkulu Province. The simulation results show that, in general, the parameter estimators are nearly unbiased, and the proportion predictions show the same tendency. The HL estimator also has a small estimated MSEP. The empirical results show that the average illiteracy rate in Bengkulu Province varies considerably. Kepahiang District had the highest average illiteracy rate in Bengkulu Province in 2021.
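As background, a textbook Beta-Binomial hierarchy for area $i$ can be sketched as follows. The symbols $y_i$, $n_i$, $p_i$, $a$, and $b$ are assumed notation, and the paper's own parameterization (for example, how covariates enter) may differ.

```latex
% Count of units with the attribute (e.g. illiterate persons) in area i
y_i \mid p_i \sim \mathrm{Binomial}(n_i,\, p_i), \qquad
p_i \sim \mathrm{Beta}(a,\, b)

% Marginal moments, with \mu = a / (a + b) and \rho = 1 / (a + b + 1)
\mathrm{E}[y_i] = n_i \mu, \qquad
\mathrm{Var}(y_i) = n_i\, \mu (1 - \mu)\, \bigl[\, 1 + (n_i - 1)\rho \,\bigr]
```

The factor $1 + (n_i - 1)\rho$ is the extra-binomial variation that the Beta layer adds over a plain binomial model.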
Effectiveness of GPCA in Reducing Data Dimensions and its Application to Human Development Dimension Indicators Data
Zubedi, Fahrezal; Sumertajaya, I Made; Notodiputro, Khairil Anwar; Syafitri, Utami Dyah
Inferensi Vol 7, No 3 (2024)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v7i3.21506

Abstract

Analysis of human development growth at the regency/city level is challenging because the data are high-dimensional, the indicators are correlated, and the regencies/cities are correlated. In this study, we propose Generalized Principal Component Analysis (GPCA) to analyze human development growth by reducing the dimensions of both regencies/cities and indicators. Human development growth at the regency/city level is then analyzed by displaying the GPCA results in a biplot that describes each regency/city and its indicators. This study aims (1) to evaluate, through simulation and an empirical study, GPCA in reducing the dimensionality of data whose observations and indicators are both correlated, and (2) to analyze the growth of human development at the regency/city level based on the GPCA-Biplot results. The research shows that GPCA works well in reducing the dimensions of data with correlated observations and correlated variables. Based on the GPCA-Biplot visualization, the growth of human development in the Nduga regency from 2019 to 2022 fluctuated significantly. Although some indicators show progress, especially in 2021, significant challenges remain. The growth of human development in each regency/city can be analyzed in the same way, so that government policy can focus on actual problems in the field.
Comparison of GMERF and GLMM Tree Models on Poverty Household Data with Imbalanced Categories
Bukhari, Ari Shobri; Notodiputro, Khairil Anwar; Indahwati, Indahwati; Fitrianto, Anwar
Inferensi Vol 8, No 2 (2025)
Publisher : Department of Statistics ITS

DOI: 10.12962/j27213862.v8i2.21901

Abstract

Decision tree and forest methods have become popular approaches in data science and continue to evolve. One of these developments is the combination of decision trees with Generalized Linear Mixed Models (GLMM), resulting in the GLMM Tree, which is applicable to multilevel and longitudinal data. Another model, Generalized Mixed Effect Random Forest (GMERF), extends the concept of decision forests with GLMM, effectively handling complex data structures with non-linear interactions. This study compares the performance of GLMM Tree and GMERF models in classifying poor households in South Sulawesi Province, characterized by imbalanced categories. GLMM Tree provides a simple, interpretable classification through tree diagrams, while GMERF highlights variable importance. Initial tests show all three models (GLMM, GLMM Tree, and GMERF) achieve high accuracy and specificity but exhibit low sensitivity. By applying oversampling, sensitivity and AUC are significantly improved, though this is accompanied by a decline in accuracy and specificity, revealing a trade-off. The study concludes that while GLMM, GLMM Tree and GMERF have their strengths, using them together offers a more comprehensive understanding of poverty classification. Handling imbalanced data with oversampling is effective in increasing sensitivity, but careful consideration is needed due to its impact on overall accuracy.
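The trade-off described above can be explored with a small sketch. It assumes plain random oversampling of a binary minority class (the abstract does not say which oversampling variant was used), and the function names and toy labels are hypothetical.

```python
import random

def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Assumes a binary problem. In practice this would be applied to the
    training split only, never to the evaluation data.
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    n_majority = len(labels) - len(minority)
    extra = [rng.choice(minority) for _ in range(n_majority - len(minority))]
    return list(rows) + extra, list(labels) + [minority_label] * len(extra)

def sensitivity_specificity(actual, predicted, positive=1):
    """Return (sensitivity, specificity) for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: 8 non-poor (0) and 2 poor (1) households.
rows = list(range(10))
labels = [0] * 8 + [1] * 2
new_rows, new_labels = random_oversample(rows, labels, minority_label=1)
```

Because oversampling pushes a classifier to predict the minority class more often, sensitivity tends to rise while specificity and overall accuracy can fall, which matches the pattern the abstract reports.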
Choosing the Right Tool: Practical Considerations for GLMM and GEE in Longitudinal Studies, with a Focus on Data Challenges
Sihombing, Pardomuan Robinson; Erfiani, Erfiani; Notodiputro, Khairil Anwar; Kurnia, Anang
ZERO: Jurnal Sains, Matematika dan Terapan Vol 9, No 1 (2025): Zero: Jurnal Sains Matematika dan Terapan
Publisher : UIN Sumatera Utara

DOI: 10.30829/zero.v9i1.24602

Abstract

This study systematically reviews comparisons between GLMM and GEE for longitudinal data. The review discusses the competing arguments regarding the practical strengths and weaknesses of the two approaches. Empirical evidence shows that GLMM provides subject-specific estimates and generally performs better than GEE at capturing hierarchical structure and individual-level variance. In contrast, GEE provides robust population-level findings, which are crucial for policy. The choice of method depends on the data structure and the scope of inference. GLMM is consistently better for characterizing individuals, for example, in studies where random effects are assumed to be drawn from a complex distribution. GEE is most useful in large datasets, yielding robust population-level estimates even when the working correlation is misspecified. Finally, the results provide practical recommendations for researchers from various domains on selecting sound, context-appropriate statistical models for longitudinal studies.
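The distinction between subject-specific and population-level estimates can be seen in the two mean models. The sketch below uses assumed notation ($g$ for the link function, $b_i$ for the random effects of subject $i$, $R(\alpha)$ for the working correlation) and is a standard textbook contrast, not taken from the review itself.

```latex
% GLMM: the mean is modeled conditionally on the subject's random effects,
% so \beta^{SS} describes change within a given subject
g\bigl(\mathrm{E}[\,y_{ij} \mid b_i\,]\bigr)
  = x_{ij}^{\top} \beta^{SS} + z_{ij}^{\top} b_i

% GEE: the marginal mean is modeled directly, so \beta^{PA} describes
% the population average; within-subject dependence is handled by a
% working correlation matrix R(\alpha) rather than by random effects
g\bigl(\mathrm{E}[\,y_{ij}\,]\bigr) = x_{ij}^{\top} \beta^{PA}
```

With a non-identity link such as the logit, $\beta^{SS}$ and $\beta^{PA}$ generally differ in magnitude, which is one reason the scope of inference drives the choice of method.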
MULTILEVEL REGRESSIONS FOR MODELING MEAN SCORES OF NATIONAL EXAMINATIONS
Nurfadilah, Khalilah; Aidi, Muhammad Nur; Notodiputro, Khairil A.; Susetyo, Budi
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 18 No 1 (2024): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol18iss1pp0323-0332

Abstract

The National Exam (UN) score is the final evaluation used to determine whether schools have achieved the national graduate competency standards. Achievement of these standards cannot be separated from the roles of schools and local governments, a structure known as nesting. In statistics, this structure can be described with a multilevel model, where level 1 is the school and level 2 is the district in which the school is located. Several multilevel models are fitted to describe the phenomenon. The results show that the two-level regression model without interaction is selected as the best model. The variables that significantly affect the average UN scores are school status and the ratio of laboratories to students at level 1, and district/city expenditure per capita at level 2. These findings can help educational institutions target their efforts toward achieving graduation standards.
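A two-level model of this kind can be sketched as below, assuming a random-intercept form (the abstract states only that the selected model has two levels and no interaction). All symbols are assumed notation: $y_{ij}$ is the mean UN score of school $i$ in district $j$, $x_{1ij}$ is school status, $x_{2ij}$ is the laboratory-to-student ratio, and $w_j$ is the district's expenditure per capita.

```latex
% Level 1 (schools within districts)
y_{ij} = \beta_{0j} + \beta_1 x_{1ij} + \beta_2 x_{2ij} + e_{ij},
  \qquad e_{ij} \sim N(0, \sigma_e^2)

% Level 2 (districts): the intercept varies by district
\beta_{0j} = \gamma_{00} + \gamma_{01} w_j + u_j,
  \qquad u_j \sim N(0, \sigma_u^2)

% Combined form; "without interaction" means no cross-level
% product terms such as x_{1ij} w_j
y_{ij} = \gamma_{00} + \gamma_{01} w_j + \beta_1 x_{1ij} + \beta_2 x_{2ij} + u_j + e_{ij}
```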
Stacking Ensemble RNN-LSTM Models for Forecasting the IDR/USD Exchange Rate with Nonlinear Volatility
Pratiwi, Windy Ayu; Sumertajaya, I Made; Notodiputro, Khairil Anwar
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 4 (2025): JUTIF Volume 6, Number 4, Agustus 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.4.5057

Abstract

Predicting exchange rates with high volatility and nonlinear patterns presents a critical challenge in financial analysis. Deep learning models such as RNN and LSTM are widely used for their ability to capture temporal dependencies, yet each has limitations when applied individually. This study aims to enhance the prediction accuracy of the Indonesian Rupiah (IDR) to US Dollar (USD) exchange rate by implementing a stacking ensemble approach that combines RNN and LSTM models. The dataset consists of 522 weekly observations from January 2015 to December 2024, sourced from the official website of Bank Indonesia (bi.go.id). In the proposed framework, RNN and LSTM serve as base learners, while linear regression acts as the meta-learner. Model performance is evaluated using RMSE, MAPE, and MSE. The results indicate that the stacking ensemble consistently outperforms the individual models, achieving an RMSE of 117.91, a MAPE of 0.01, and an MSE of 13,901.67. The model effectively captures historical patterns and delivers stable and accurate predictions. In conclusion, the stacking ensemble approach developed in this study contributes to the advancement of ensemble learning techniques in computer science and offers practical value for financial decision-makers, particularly in managing complex and dynamic exchange rate scenarios.
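The stacking step and the three error metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the base forecasts are hypothetical toy numbers standing in for RNN and LSTM outputs, the meta-learner is a least-squares fit without an intercept for brevity, and the weights are fitted on the same points they are evaluated on (a real stacking setup would fit them on held-out base predictions).

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.01 means 1%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def fit_meta_weights(pred_a, pred_b, actual):
    """Least-squares weights for two base forecasts via the 2x2 normal equations."""
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, actual))
    sby = sum(b * y for b, y in zip(pred_b, actual))
    det = saa * sbb - sab * sab  # zero only if the two forecasts are collinear
    return (say * sbb - sab * sby) / det, (saa * sby - sab * say) / det

def stack(pred_a, pred_b, weights):
    """Combine two base forecasts with the fitted meta-learner weights."""
    w_a, w_b = weights
    return [w_a * a + w_b * b for a, b in zip(pred_a, pred_b)]

# Hypothetical toy values (IDR per USD), not data from the study.
actual = [15000, 15100, 15250, 15200]
rnn_pred = [14900, 15150, 15300, 15100]
lstm_pred = [15050, 15050, 15200, 15300]

weights = fit_meta_weights(rnn_pred, lstm_pred, actual)
stacked = stack(rnn_pred, lstm_pred, weights)
```

On the fitting data, the least-squares combination can never have a larger MSE than either base forecast, since using one forecast alone is a special case of the weights. Whether that advantage carries over to unseen weeks is what a held-out evaluation would need to show.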