Articles

Comparison of Kernel and Spline Nonparametric Regression (Case Study: Food Security Index of Jambi Province 2023) Rosa Salsabila Azarine; Septrina Kiki Arisandi; Fadhilah Fitri; Yenni Kurniawati
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss3/397

Abstract

Food security is one of the issues that plays an important role in national development, especially in regions with varying levels of economic welfare such as Jambi Province. One of the main factors affecting food security is food expenditure, which reflects the economic capacity of households to access food. The complex and non-linear relationship between the Food Security Index (FSI) and Food Expenditure requires a flexible modeling approach. This study aims to compare the performance of Kernel and Spline nonparametric regression methods, namely the Nadaraya-Watson Estimator (NWE) and Local Polynomial Estimator (LPE) for Kernel Regression, and Smoothing Spline and B-Spline for Spline Regression. The analysis was conducted using secondary data obtained from the Food Security and Vulnerability Map (FSVA) of 2023, with a total of 141 subdistricts in Jambi Province. The response variable is the Food Security Index (FSI), while the predictor variable is Food Expenditure. Model evaluation was conducted using the Mean Squared Error (MSE) and the coefficient of determination (R²). The results showed that the NWE method had the best performance with the smallest MSE value of 24.47690 and the highest R² value of 0.3332, meaning that approximately 33.32% of the variation in FSI could be explained by Food Expenditure. The LPE method showed nearly comparable performance, while Smoothing Spline and B-Spline exhibited higher prediction error rates. Therefore, the NWE method can be recommended as an effective nonparametric regression approach for modeling the relationship between food expenditure and food security.
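
For readers unfamiliar with the estimator named in this abstract, the following is a minimal sketch of Nadaraya-Watson regression together with the two evaluation metrics (MSE and R²). It is not the authors' code: the Gaussian kernel, the function names, the bandwidth, and the synthetic data are all assumptions made for illustration, and the FSVA data are not reproduced here.

```python
import numpy as np


def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel (kernel choice is an assumption)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    # Scaled distance between every evaluation point and every training point.
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)
    # Each fitted value is a kernel-weighted average of the responses.
    return (weights @ y_train) / weights.sum(axis=1)


def mse(y, y_hat):
    """Mean squared error."""
    return float(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2))


def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    ss_res = np.sum((y - np.asarray(y_hat)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-in for a (predictor, response) pair; not the study's data.
rng = np.random.default_rng(0)
x_demo = np.sort(rng.uniform(0.0, 1.0, 141))
y_demo = np.sin(2 * np.pi * x_demo) + rng.normal(0.0, 0.3, 141)
fit_demo = nadaraya_watson(x_demo, y_demo, x_demo, bandwidth=0.05)
print(mse(y_demo, fit_demo), r_squared(y_demo, fit_demo))
```

In this sketch the bandwidth controls the trade-off: a small value follows local fluctuations closely, while a large value pulls the fit toward the overall mean.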
Comparison of Nadaraya-Watson and Local Polynomial Methods in Analyzing the Relationship Between Consumer Price Index and Inflation in South Kalimantan Salwa Hifa Fadilah; Fadhilah Fitri; Fenni Kurnia Mutiya
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss3/401

Abstract

This study compares the performance of two nonparametric regression methods, namely Nadaraya-Watson and Local Polynomial, in analyzing the relationship between the Consumer Price Index (CPI) and inflation in South Kalimantan Province. Nonparametric approaches were chosen for their greater flexibility in capturing nonlinear relationships that conventional parametric models may fail to explain. The data were obtained from the Central Statistics Agency (BPS) for the period from January 2022 to December 2024, with missing values in the inflation variable handled through mean imputation. The optimal bandwidth was selected using the direct plug-in method (dpill). Visually, the Nadaraya-Watson method produced a more fluctuating curve that is highly sensitive to local variations, while the Local Polynomial method yielded a smoother and more stable curve. Quantitatively, the Local Polynomial method demonstrated better performance, with lower MSE (0.1839) and MAE (0.3507) and a higher R² (0.1811) than Nadaraya-Watson. These findings indicate that the Local Polynomial method is more effective in balancing curve flexibility and stability. This study also addresses a methodological gap by highlighting the relevance of nonparametric approaches in regional economic analysis. Future research is encouraged to explore alternative bandwidth selection methods and different kernel functions to improve estimation accuracy.
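
The local polynomial idea in this abstract can be sketched as a weighted least-squares fit repeated at each evaluation point. The code below is an illustrative sketch only: the Gaussian kernel, the default degree of 1, the fixed bandwidth, and all function names are assumptions. It does not implement the direct plug-in bandwidth selector (dpill) mentioned in the abstract.

```python
import numpy as np


def local_polynomial(x_train, y_train, x_eval, bandwidth, degree=1):
    """Local polynomial regression with a Gaussian kernel (kernel choice is an assumption)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    fitted = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        d = x_train - x0
        # Square roots of the kernel weights turn weighted least squares
        # into an ordinary least-squares problem.
        sw = np.sqrt(np.exp(-0.5 * (d / bandwidth) ** 2))
        # Design matrix with columns 1, d, d^2, ..., d^degree.
        design = np.vander(d, N=degree + 1, increasing=True)
        beta, *_ = np.linalg.lstsq(design * sw[:, None], y_train * sw, rcond=None)
        # The intercept is the fitted value at x0.
        fitted[i] = beta[0]
    return fitted


def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(y_hat))))


# Synthetic monthly-style series of length 36; not the BPS data.
rng = np.random.default_rng(1)
x_demo = np.linspace(0.0, 1.0, 36)
y_demo = 1.0 + 0.5 * x_demo + rng.normal(0.0, 0.2, 36)
print(mae(y_demo, local_polynomial(x_demo, y_demo, x_demo, bandwidth=0.2)))
```

With `degree=0` the same routine reduces to a Nadaraya-Watson style weighted average, which is one way to see why the two methods are natural to compare.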
Comparison of Expectation-Maximization (EM) Algorithm and Kmeans for District/City Clustering in West Sumatera Province Based on Breadfruit Production Mayrita, Mayrita Addila Putri; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss3/403

Abstract

Breadfruit (Artocarpus altilis) is an important food source that is highly nutritious and plays a strategic role in West Sumatra Province. However, challenges such as pests, diseases and marketing constraints affect its cultivation and productivity. This study employed K-means and expectation-maximisation (EM) clustering methods to categorise regions according to their breadfruit cultivation characteristics. The elbow method identified three optimal clusters for K-means and seven for EM. Evaluating the quality of the clusters using the silhouette coefficient produced values of 0.47 for EM and 0.37 for K-means, indicating that EM produced tighter, more distinct clusters. These results suggest that EM is a more effective method for describing the variation in breadfruit production in West Sumatra. With this in mind, the research is expected to inform strategic decision-making aimed at increasing the productivity and added value of breadfruit crops in the area.
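
The silhouette coefficient used above to compare the two clusterings can be sketched as follows. This is an illustrative implementation under stated assumptions: Euclidean distance, at least two clusters, a score of 0 for singleton clusters, and a hypothetical function name. The toy points are not the study's data.

```python
import numpy as np


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points (Euclidean distance assumed)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise distance matrix between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # Convention assumed here: a singleton cluster scores 0.
        # a: mean distance to the other members of the point's own cluster.
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())


# Toy example: two tight, well-separated groups.
toy_points = [[0, 0], [0, 1], [10, 0], [10, 1]]
print(silhouette_coefficient(toy_points, [0, 0, 1, 1]))
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster.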
Panel Data Model Selection and Significant Determinants of New Family Planning Participants in West Sumatra Diah Triwulandari; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss3/404

Abstract

Population issues in Indonesia are not limited to poverty, urbanization, population explosion, or high birth rates, but also include how small families can improve and maintain their quality of life. The main objective of the Family Planning program is to create happy and prosperous families with an ideal number of children. The West Sumatra Provincial Health Office report (2023) emphasizes that increasing the number of new family planning acceptors is an important priority to support the success of maternal, child, and family planning health programs, in line with the 2020–2024 RPJMN policy direction. Therefore, this study aims to develop the best panel data model and identify the factors that significantly influence the number of new family planning participants in West Sumatra Province. The secondary data used were obtained from the Statistics Indonesia (BPS) publication entitled West Sumatra Province in Figures from 2021 to 2024. The observation units in this study were 19 districts/cities in West Sumatra Province with a time series from 2020 to 2023. The results indicate that the best-selected model is the random effect model, with the number of couples of reproductive age proven to have a significant effect on the number of new family planning participants. The R-square value of 53.11% indicates that the model can explain 53.11% of the variation in the dependent variable, while the remaining 46.89% is influenced by other factors not included in the model.  
Logit and Complementary Log-Log Modeling in the Case of Factors Affecting Heart Failure Disease MAWARNI, IGA; Asyifa Dwi Ayshah; Dhiyaa Fitri Yafe; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/421

Abstract

Heart failure is one of the leading causes of morbidity and mortality globally. Heart disease is caused by plaque that builds up in the coronary arteries that supply oxygen to the heart muscle. This study aims to identify the factors that affect heart failure and how strong their influence is. The analysis used logistic regression with logit and complementary log-log modeling on data of 918 patients with heart failure disease. The study also determines which of the two models is better. The results indicate that Age, Gender, Blood Sugar, and Chest Pain have significant effects on the likelihood of Heart Failure. Specifically, higher blood sugar levels and the presence of chest pain were found to increase the probability of heart failure, while gender and age showed varying effects across different age groups. Based on the model comparison, the Logit model demonstrated better fit and predictive accuracy than the Complementary Log-Log model, as reflected by its lower AIC value of 897.43.
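
The two link functions compared in this abstract differ in how they map a linear predictor to a probability, and AIC is the criterion used to choose between the fitted models. The sketch below shows the inverse links, the Bernoulli log-likelihood, and the AIC formula. Function names and the example values are assumptions for illustration; no model is fitted and the study's data are not used.

```python
import numpy as np


def inv_logit(eta):
    """Inverse logit link: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(eta, dtype=float)))


def inv_cloglog(eta):
    """Inverse complementary log-log link: p = 1 - exp(-exp(eta))."""
    return 1.0 - np.exp(-np.exp(np.asarray(eta, dtype=float)))


def bernoulli_log_likelihood(y, p):
    """Log-likelihood of binary outcomes y under success probabilities p."""
    y = np.asarray(y, dtype=float)
    # Clip to avoid log(0); the clipping threshold is an arbitrary choice.
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0 - 1e-12)
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# The logit curve is symmetric around eta = 0; the cloglog curve is not.
eta_demo = np.array([-1.0, 0.0, 1.0])
print(inv_logit(eta_demo), inv_cloglog(eta_demo))
```

The asymmetry of the complementary log-log link is the practical difference between the two models: it approaches 1 faster than it approaches 0, whereas the logit link treats both tails alike.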
Penalized Spline Regression Modeling on the Human and Cultural Development Index (IPMK) for 2022 Mila, Sarmilah; Fadhilah Fitri; Musthafa Imran
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/425

Abstract

Human and cultural development is a multidimensional phenomenon whose relationship with socioeconomic factors is often complex and nonlinear, making it challenging to model with conventional parametric approaches. This study aims to model the influence of socioeconomic variables on the Human and Cultural Development Index (IPMK) across 34 provinces in Indonesia in 2022 using the nonparametric Penalized Spline (P-spline) regression method within a Generalized Additive Model (GAM) framework. Secondary data from the Central Statistics Agency (BPS) were used, with predictor variables including School Participation Rate (APS), percentage of access to safe drinking water, Gini Ratio, per capita expenditure, average years of schooling (RLS), and open unemployment rate (TPT). Initial data exploration via scatterplots confirmed nonlinear relationship patterns between the predictor variables and IPMK. The best model was obtained using a first-order cubic spline with 10 knot points, selected based on the minimum Generalized Cross Validation (GCV) criterion. The modeling results demonstrated excellent performance, with an Adjusted R² value of 0.842 and a Deviance Explained of 92.3%. Significance analysis indicated that access to safe drinking water, per capita expenditure, average years of schooling, and the open unemployment rate significantly influence IPMK. Visual interpretation of the significant spline curves revealed informative relationship patterns, such as the diminishing returns effect of per capita expenditure. This study concludes that the P-spline approach is effective and interpretable for modeling complex nonlinear relationships in development data, providing a richer evidence base for policy formulation.
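
As background for the GCV criterion mentioned in this abstract, a penalized spline fit for a single smooth term with a Gaussian response can be written as below. This is a generic textbook form, not the authors' specification: the basis matrix $B$, the difference matrix $D_d$ of order $d$, and the smoothing parameter $\lambda$ are notation assumed here, and the abstract does not state these details.

```latex
\hat{\beta}_\lambda
  = \arg\min_{\beta}\; \lVert y - B\beta \rVert^2 + \lambda \lVert D_d \beta \rVert^2 ,
\qquad
H_\lambda = B \left( B^\top B + \lambda D_d^\top D_d \right)^{-1} B^\top ,
\qquad
\mathrm{GCV}(\lambda)
  = \frac{n \,\lVert y - H_\lambda y \rVert^2}{\bigl( n - \operatorname{tr}(H_\lambda) \bigr)^2 } .
```

Here $n$ is the number of observations and $\operatorname{tr}(H_\lambda)$ acts as the effective degrees of freedom; the smoothing parameter is chosen to minimize $\mathrm{GCV}(\lambda)$.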
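Analisis Pengaruh Penggunaan ChatGPT Terhadap Prestasi Akademik Mahasiswa Dengan Motivasi Sebagai Variabel Intervening Menggunakan Metode SEM-PLS [Analysis of the Effect of ChatGPT Use on Student Academic Achievement with Motivation as an Intervening Variable Using the SEM-PLS Method] Salsabilla Khairani; Yenni Kurniawati; Dony Permana; Fadhilah Fitri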
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/430

Abstract

This study aims to analyze the factors that influence student academic achievement through the use of ChatGPT using the Structural Equation Modeling (SEM) method based on the Partial Least Square (PLS) approach. In this study, three main factors were identified as elements that can influence the use of ChatGPT, namely knowledge about ChatGPT (PTC), willingness to use the technology (KUMT), and concerns that may arise (KYDT), as well as learning motivation as an intervening variable. The total sampling method was used in this study, where the entire population that met the criteria was designated as respondents. The research population included students in the Statistics Study Program at Padang State University in semesters 4–8 who had used ChatGPT for at least six months, with a total of 216 student respondents. Data were collected through a survey using an online questionnaire. Based on the analysis that has been carried out, the results of the study show that the variables of knowledge about ChatGPT (PTC) and willingness to use the technology (KUMT) have a significant positive effect on learning motivation, while concerns that may arise (KYDT) have no significant effect. Furthermore, only the variable of concerns that may arise (KYDT) had a significant direct effect on academic achievement, while the results of the mediation effect test showed that only the variable of willingness to use the technology (KUMT) had a significant indirect effect on academic achievement through learning motivation.