This study identifies key factors influencing high school dropout rates in Indonesia using statistical models that account for two common features of dropout data: overdispersion (variability greater than the model expects) and excess zeros (many students do not drop out). Ignoring either feature can bias conclusions. We compare three parametric models, the Zero-Inflated Poisson Mixed Model (ZIPMM), the Zero-Inflated Generalized Poisson Mixed Model (ZIGPMM), and the Zero-Inflated Negative Binomial Mixed Model (ZINBMM), with their semiparametric counterparts (SZIPMM, SZIGPMM, and SZINBMM). The semiparametric models use B-spline functions to flexibly capture nonlinear relationships between predictors and dropout rates. Model performance was evaluated using the Akaike Information Criterion (AIC) and Root Mean Square Error (RMSE) across 100 simulation repetitions to assess robustness. The semiparametric ZIGPMM (SZIGPMM) outperformed the other models, achieving the lowest average AIC (18969.62), which suggests the best trade-off between model fit and complexity. The optimal spline configuration used knot point 2 and order 3, with a Generalized Cross-Validation (GCV) score of 9.4107. Key predictors of dropout include school status (public or private), student-teacher ratio, distance from home to school, parental education level, parental employment status, and number of siblings. These findings give education policymakers actionable insights and emphasize the need to address structural and socioeconomic barriers in order to reduce dropout rates effectively.
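To make the role of excess zeros and AIC concrete, the following minimal Python sketch compares a plain Poisson fit with a zero-inflated Poisson on a small set of counts. It is an illustration only and not the models fitted in this study: it omits random effects, B-spline terms, and the generalized Poisson and negative binomial variants. The counts, the function names (`zip_pmf`, `zip_loglik`, `aic`), and the zero-inflated parameter values are all hypothetical and chosen by hand.

```python
import math

def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson.

    With probability pi the outcome is a structural (excess) zero;
    otherwise it is drawn from Poisson(lam). Setting pi = 0 gives a plain Poisson.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi * (1.0 if y == 0 else 0.0) + (1.0 - pi) * poisson

def zip_loglik(counts, lam, pi):
    """Log-likelihood of independent counts under the zero-inflated Poisson."""
    return sum(math.log(zip_pmf(y, lam, pi)) for y in counts)

def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * loglik

# Hypothetical dropout counts: mostly zeros, with a few larger values.
counts = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 3]

# Plain Poisson: the maximum-likelihood rate is the sample mean (1 parameter).
mean = sum(counts) / len(counts)
aic_poisson = aic(zip_loglik(counts, lam=mean, pi=0.0), n_params=1)

# Zero-inflated Poisson with hand-picked (not fitted) values (2 parameters).
# A fitted model could only achieve an equal or lower AIC than this.
aic_zip = aic(zip_loglik(counts, lam=2.0, pi=0.7), n_params=2)

print(f"Poisson AIC: {aic_poisson:.2f}")
print(f"ZIP AIC:     {aic_zip:.2f}")
```

On these illustrative counts the zero-inflated model attains the lower AIC despite its extra parameter, which is the kind of fit-versus-complexity comparison the AIC results above summarize.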
Copyrights © 2025