UNP Journal of Statistics and Data Science
UNP Journal of Statistics and Data Science (UJSDS) is an open-access e-journal launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on Statistics, Data Science, and their applications. Articles may be research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academics from various universities.
Articles
18 Documents
Issue: Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Nonparametric Regression with Local Polynomial Kernel on Relationship Between Schooling Years and Unemployment Rate in Banten
Miftahul Barokah, Bunga;
Fadhilah Fitri;
Chairina Wirdiastuti
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/372
The Open Unemployment Rate (TPT) is a key indicator in assessing the economic performance of Banten Province. One of the factors suspected to influence TPT is education, which is measured by the average years of schooling. This study aims to analyze the relationship between the average years of schooling and TPT using the Local Polynomial Kernel Nonparametric Regression method for the period 2017–2024. This method was chosen for its flexibility in modeling nonlinear relationships without requiring strict assumptions about the data. The optimal bandwidth parameter for smoothing was determined using the Direct Plug-In (DPI) method through the dpill function in the R software. The results show that the nonparametric model has a coefficient of determination (R²) of 0.2841, which is higher than that of the Ordinary Least Squares (OLS) linear regression model, which only reached 0.1710. This indicates that the nonparametric approach is better at capturing the complex relationship between education and unemployment. However, the low R² values in both models indicate the presence of other factors that influence the unemployment rate, such as economic conditions, labor market structure, and education policy. Therefore, increasing the average years of schooling alone may not be sufficient to significantly reduce the unemployment rate. More comprehensive policies are needed, such as job skill enhancement, vocational training, and economic strategies focused on job creation. The findings of this study are expected to provide useful insights for policymakers in formulating more effective strategies to address unemployment in Banten Province.
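As an illustration of the kind of estimator this abstract describes, the following is a minimal sketch of a local polynomial fit of degree 1 (local linear) with a Gaussian kernel, plus the R² calculation. It is not the authors' code: the study selected its bandwidth with the dpill function in R, whereas here the bandwidth `h` is simply fixed by hand, and the data values are invented for demonstration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_linear_fit(x, y, x0, h):
    """Local polynomial (degree 1) estimate of E[y | x = x0] with bandwidth h."""
    w = [gaussian_kernel((xi - x0) / h) for xi in x]
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
    # Intercept of the kernel-weighted least squares line centred at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


def r_squared(y, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values (NOT from the paper): years of schooling and TPT in percent.
schooling = [8.2, 8.5, 8.7, 9.0, 9.3, 9.6, 10.1, 10.4]
tpt = [9.1, 8.6, 8.9, 8.1, 8.4, 7.6, 7.9, 7.0]
h = 0.5  # assumed bandwidth; the study obtained its bandwidth from dpill in R

fitted = [local_linear_fit(schooling, tpt, x0, h) for x0 in schooling]
print("R^2 of the local linear fit:", r_squared(tpt, fitted))
```

A smaller `h` follows the data more closely and raises the in-sample R², which is why a data-driven bandwidth selector is normally preferred over a hand-picked value.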
Forecasting Consumer Price Index in Personal Care Sector in Bukittinggi Using SVR with Grid Search and Radial Basis Function Kernel
Pane, Khairunnisa;
Fadhilah Fitri;
Dina Fitria
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/373
Inflation, measured by the Consumer Price Index (CPI), is vital for economic stability and policy making. In Bukittinggi, the Personal Care and Other Services sector shows notable CPI fluctuations, complicating accurate forecasting. This study uses Support Vector Regression (SVR) to predict monthly CPI data for this sector from 2020 to 2024. Data from Statistics Indonesia was normalized with Min-Max normalization to improve model accuracy and avoid scale distortion. Lag features were added to capture time dependencies, and data was split into training (80%) and testing (20%) sets. A linear SVR model was first applied but showed limited success due to the data’s non-linear nature. Therefore, the Radial Basis Function (RBF) kernel was used, with hyperparameters (C, sigma, epsilon, folds) optimized via Grid Search and cross-validation. The optimal settings (C=32, sigma=2, epsilon=0.1, k=10) yielded the lowest RMSE of 0.1099 in cross-validation and 0.0767 on testing. Results demonstrate that the RBF-SVR model effectively captures non-linear CPI patterns and outperforms the linear model. Evaluation metrics included RMSE, MSE, and MAE. The study concludes that SVR combined with Grid Search offers a robust forecasting method for sectors with complex CPI behavior, supporting local economic planning in Bukittinggi. Future research could investigate hybrid models and larger datasets to enhance prediction accuracy and adaptability to market changes.
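The preprocessing steps named in this abstract (Min-Max normalization, lag features, and an 80/20 split) can be sketched as follows. This is only an illustration under stated assumptions: the number of lags is not given in the abstract, so `n_lags=2` is arbitrary, the split is assumed to be chronological, and the CPI values are invented. The SVR model and grid search themselves are not shown.

```python
def min_max_scale(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def make_lagged(series, n_lags):
    """Build (features, target) rows where features are the previous n_lags values."""
    rows = []
    for t in range(n_lags, len(series)):
        rows.append((series[t - n_lags:t], series[t]))
    return rows


def chronological_split(rows, train_fraction=0.8):
    """Keep time order: the earliest rows train the model, the latest rows test it."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical monthly CPI values (NOT from the paper).
cpi = [100.0, 100.4, 101.1, 100.9, 101.8, 102.5, 102.2, 103.0, 103.6, 104.1, 104.0, 105.2]

scaled = min_max_scale(cpi)
rows = make_lagged(scaled, n_lags=2)  # n_lags=2 is an assumption
train, test = chronological_split(rows, train_fraction=0.8)
print(len(train), "training rows,", len(test), "testing rows")
```

In practice the scaling bounds are often computed from the training portion only, so that no information from the test period leaks into the model; the abstract does not say which variant was used.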
Forecasting Inflation Rate in Indonesia Using Autoregressive Integrated Moving Average Method
Putri, Lathifa;
Zilrahmi
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/377
Inflation is one of the key indicators for assessing a country's economic stability. A continuous rise in inflation slows economic growth. Accurate inflation forecasts are therefore important for medium- to long-term economic planning. This study forecasts the inflation rate in Indonesia for the next 12 periods, from January 2025 to December 2025. The study uses the ARIMA method because ARIMA models are flexible with respect to all types of time series patterns, even when the data are non-stationary. The results show that ARIMA(2,0,2) is the best model, with a MAPE of 25.21%. The model predicts a stable inflation rate in Indonesia over the next 12 periods, averaging 1.861%. These results suggest that the general rise in the prices of goods and services in Indonesia over this period will be stable without fluctuations, which is a positive sign for macroeconomic stability and public purchasing power.
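For readers unfamiliar with the accuracy measure quoted in this abstract, MAPE (mean absolute percentage error) can be computed as in the short sketch below. The numbers are invented and the function is a generic definition, not the authors' code.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Undefined if any actual value is 0."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical inflation rates in percent (NOT from the paper).
actual = [2.0, 4.0]
forecast = [1.0, 5.0]
print(mape(actual, forecast))  # 37.5
```

Because each error is divided by the actual value, MAPE becomes large when the series takes values close to zero, which is worth keeping in mind when it is applied to low inflation rates.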
Process Capability Analysis of OPC Cement Production Using Statistical Process Control and IMR Method: Blaine Test Evaluation
Alya Aufa, Wafiq;
Yenni Kurniawati;
Admi Salma;
Darwas
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/379
The main challenge in cement production at PT Semen Padang is maintaining consistent product quality, particularly the fineness of cement particles measured by the Blaine test. Variations in raw materials and the production process can cause fluctuations in quality, which affect the performance of the final product. Therefore, it is crucial to monitor and control process stability and capability to consistently meet product specifications. Based on the Statistical Process Control (SPC) analysis using Individuals and Moving Range (I-MR) control charts on 28 observations of Ordinary Portland Cement (OPC) Blaine values from February 2025, one out-of-control point was detected on the Moving Range chart between observations 16 and 17, indicating a significant variation. However, all points on the Individuals chart remained within control limits, suggesting that the individual process values were still under control. After revising the outlier data, the process was confirmed stable. Process capability analysis showed a Cp value of 2.17 and a Cpk value of 1.98, indicating that the production process is not only statistically stable but also highly capable of meeting quality specifications. Therefore, despite some variation between data points, the cement production process at PT Semen Padang can be considered stable and capable. Nevertheless, periodic evaluations are recommended to maintain consistent product quality and provide strategic recommendations for the Quality Assurance division in implementing data-driven quality control.
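The quantities behind an I-MR chart and the Cp and Cpk indices can be sketched as below. This uses the standard control chart constants for moving ranges of two consecutive observations (2.66, 3.267, and d2 = 1.128). The Blaine values and the specification limits `lsl` and `usl` are hypothetical, since the abstract does not report them.

```python
def imr_limits(values):
    """Control limits for Individuals and Moving Range charts (moving range of size 2)."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "i_center": center,
        "i_lcl": center - 2.66 * mr_bar,   # 2.66 = 3 / d2 with d2 = 1.128
        "i_ucl": center + 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,          # D4 for subgroup size 2; the MR chart LCL is 0
        "sigma_within": mr_bar / 1.128,    # short-term sigma estimate
    }


def capability(mean, sigma, lsl, usl):
    """Cp compares spec width to process spread; Cpk also penalises off-centre processes."""
    cp = (usl - lsl) / (6.0 * sigma)
    cpk = min(usl - mean, mean - lsl) / (3.0 * sigma)
    return cp, cpk


# Hypothetical Blaine readings and spec limits (NOT from the paper).
blaine = [3510.0, 3525.0, 3498.0, 3530.0, 3515.0, 3505.0, 3522.0, 3512.0]
lsl, usl = 3300.0, 3700.0

limits = imr_limits(blaine)
cp, cpk = capability(limits["i_center"], limits["sigma_within"], lsl, usl)
print(limits)
print("Cp:", cp, "Cpk:", cpk)
```

Cpk can never exceed Cp; the two coincide only when the process mean sits exactly midway between the specification limits, so a gap such as the reported 2.17 versus 1.98 indicates a slightly off-centre process.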
Comparison of Nadaraya-Watson Method with Local Polynomial in Modeling HDI and Poverty Relationship in Java Island
Novi, Yoli Marda;
Fadhilah Fitri;
Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/380
Poverty remains a critical issue in Indonesia, with the number of poor people reaching 24.06 million in September 2024. The Human Development Index (HDI), which indicates the quality of human resources, is one of the factors that influence poverty. This analysis focuses on the relationship between HDI and the number of poor people in the districts/cities of Java Island by comparing two kernel regression methods, namely the Nadaraya-Watson Estimator and the Local Polynomial Estimator. Nonparametric regression was chosen because it does not require assuming a particular form of relationship between the variables, so it is more flexible in capturing complex relationship patterns. Secondary data from Statistics Indonesia (BPS) for 2024 were used in this study. Initial exploration shows that the data distribution does not have a clear pattern, so nonparametric methods are more suitable. Modeling was done using the optimal bandwidth obtained through the dpill function in R. The analysis results show that the local polynomial estimator produces smoother regression curves and lower MSE values. In addition, a comparison of different polynomial degrees shows that higher polynomial degrees tended to improve model performance. Among the tested polynomial degrees, the local polynomial with degree five (p=5) produced the lowest MSE value and the highest coefficient of determination. Therefore, the local polynomial estimator with degree 5 is the best method for modeling the relationship between the HDI and poverty levels in Java in 2024.
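The Nadaraya-Watson estimator compared in this abstract is a kernel-weighted average of the responses, which is equivalent to a local polynomial fit of degree 0. A minimal sketch with a Gaussian kernel follows; the bandwidth `h` and the data are assumptions made for illustration and do not come from the paper.

```python
import math


def nadaraya_watson(x, y, x0, h):
    """Kernel-weighted average of y around x0 (Gaussian kernel, bandwidth h)."""
    w = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def mse(y, fitted):
    """Mean squared error between observed and fitted values."""
    return sum((a - b) ** 2 for a, b in zip(y, fitted)) / len(y)


# Hypothetical values (NOT from the paper): HDI and number of poor people (thousands).
hdi = [66.0, 68.5, 70.2, 72.0, 74.3, 76.1, 79.0, 82.4]
poor = [210.0, 180.0, 195.0, 150.0, 120.0, 135.0, 80.0, 60.0]
h = 2.0  # assumed bandwidth

fitted = [nadaraya_watson(hdi, poor, x0, h) for x0 in hdi]
print("In-sample MSE:", mse(poor, fitted))
```

Because the estimate is a plain weighted average, it tends to be biased near the edges of the data range, which is one common reason higher-degree local polynomial fits are considered as an alternative.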
Application of Partial Least Squares and a Robust Approach in Discriminant Analysis for High-Dimensional Data
Rahmadina Adityana;
Vionanda, Dodi;
Permana, Dony;
Fitri, Fadhilah
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/396
Classical discriminant analysis, namely linear discriminant analysis and quadratic discriminant analysis, is generally known to suffer from singularity problems when faced with high-dimensional data and is not robust to outliers, which cause the data to deviate from a multivariate normal distribution. This research investigates the classification performance of discriminant analysis on high-dimensional data by applying two approaches: Partial Least Squares (PLS) dimension reduction as a solution to high dimensionality, and a robust approach based on the Minimum Covariance Determinant (MCD) estimator, which is robust to outliers. The data used for this study are the Lee Silverman Voice Treatment (LSVT) data. PLS forms five optimal latent variables that represent the predictor variable information. In the test of covariance homogeneity between groups, the test statistic is greater than the chi-square critical value (equivalently, the p-value is smaller than the significance level), which means that the assumption is not fulfilled, so quadratic discriminant analysis is applied. The evaluation results show that the quadratic discriminant analysis model with the MCD approach on the PLS-transformed data achieved 81% accuracy, 71% precision, 86% recall, and a 77% F1-score. These values indicate that both approaches are able to maintain the classification performance of discriminant analysis on high-dimensional data that are not multivariate normally distributed.
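The four evaluation measures quoted in this abstract all derive from a binary confusion matrix, as the sketch below shows. The counts used here are hypothetical: they were chosen only so that the rounded results resemble the reported percentages, and they are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return accuracy, precision, recall, f1


# Hypothetical counts (NOT from the paper).
acc, prec, rec, f1 = classification_metrics(tp=12, fp=5, fn=2, tn=18)
print(f"accuracy={acc:.2%} precision={prec:.2%} recall={rec:.2%} f1={f1:.2%}")
```

Since F1 is the harmonic mean of precision and recall, it always lies between them and sits closer to the smaller of the two.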
Comparison of Kernel and Spline Nonparametric Regression (Case Study: Food Security Index of Jambi Province 2023)
Rosa Salsabila Azarine;
Septrina Kiki Arisandi;
Fadhilah Fitri;
Yenni Kurniawati
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/397
Food security is one of the issues that plays an important role in national development, especially in regions with varying levels of economic welfare such as Jambi Province. One of the main factors affecting food security is food expenditure, which reflects the economic capacity of households to access food. The complex and non-linear relationship between the Food Security Index (FSI) and Food Expenditure requires a flexible modeling approach. This study aims to compare the performance of Kernel and Spline nonparametric regression methods, namely the Nadaraya-Watson Estimator (NWE) and Local Polynomial Estimator (LPE) for Kernel Regression, and Smoothing Spline and B-Spline for Spline Regression. The analysis was conducted using secondary data obtained from the Food Security and Vulnerability Map (FSVA) of 2023, with a total of 141 subdistricts in Jambi Province. The response variable is the Food Security Index (FSI), while the predictor variable is Food Expenditure. Model evaluation was conducted using the Mean Squared Error (MSE) and the coefficient of determination (R²). The results showed that the NWE method had the best performance with the smallest MSE value of 24.47690 and the highest R² value of 0.3332, meaning that approximately 33.32% of the variation in FSI could be explained by Food Expenditure. The LPE method showed nearly comparable performance, while Smoothing Spline and B-Spline exhibited higher prediction error rates. Therefore, the NWE method can be recommended as an effective nonparametric regression approach for modeling the relationship between food expenditure and food security.
Comparison of Nadaraya-Watson and Local Polynomial Methods in Analyzing the Relationship Between Consumer Price Index and Inflation in South Kalimantan
Salwa Hifa Fadilah;
Fadhilah Fitri;
Fenni Kurnia Mutiya
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/401
This study compares the performance of two nonparametric regression methods, namely Nadaraya-Watson and Local Polynomial, in analyzing the relationship between the Consumer Price Index (CPI) and inflation in South Kalimantan Province. Nonparametric approaches were chosen for their greater flexibility in capturing nonlinear relationships that conventional parametric models may fail to explain. The data were obtained from the Central Statistics Agency (BPS) for the period from January 2022 to December 2024, with missing values in the inflation variable handled through mean imputation. The optimal bandwidth was selected using the direct plug-in method (dpill). Visually, the Nadaraya-Watson method produced a more fluctuating curve that is highly sensitive to local variations, while the Local Polynomial method yielded a smoother and more stable curve. Quantitatively, the Local Polynomial method demonstrated better performance, with a lower MSE (0.1839), a lower MAE (0.3507), and a higher R² (0.1811) than Nadaraya-Watson. These findings indicate that the Local Polynomial method is more effective in balancing curve flexibility and stability. This study also addresses a methodological gap by highlighting the relevance of nonparametric approaches in regional economic analysis. Future research is encouraged to explore alternative bandwidth selection methods and different kernel functions to improve estimation accuracy.
Applying Robust Spatial Autoregressive Model to Analyze the Determinants of Open Unemployment in West Java
Berliana Nofriadi;
Suci Rahmadani;
Sepniza Nasywa;
Tessy Octavia Mukhti;
Yenni Kurniawati
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/402
Open unemployment is a critical macroeconomic challenge in developing regions like West Java, Indonesia, where spatial disparities and data anomalies complicate traditional analysis. This study addresses these limitations by employing a Robust Spatial Autoregressive (RSAR) model with M-Estimator, integrating spatial dependence and outlier resilience to enhance estimation accuracy. Using 2024 district-level data from Indonesia’s Central Bureau of Statistics (BPS) and Open Data Jabar, the research examines determinants such as labor force participation, education, and regional GDP. The methodology begins with Ordinary Least Squares (OLS) to identify initial predictors, followed by spatial diagnostics (Moran’s I, Lagrange Multiplier tests) to confirm spatial autocorrelation. A customized Queen contiguity weight matrix captures neighborhood effects, while robust M-Estimation mitigates outlier distortions. Results reveal that the RSAR model achieves superior explanatory power (R² = 0.8626) compared to OLS and standard Spatial Autoregressive (SAR) models, with labor force participation (X₄) emerging as a significant negative predictor of unemployment. Spatial effects (ρ = 0.337) though modest, underscore the importance of inter-regional dynamics. The study concludes that RSAR offers a more reliable framework for regional labor analysis, combining spatial rigor with robustness against data irregularities. Policy-wise, the findings advocate targeted interventions to boost labor participation and address localized disparities, emphasizing the need for spatially informed, outlier-resistant methodologies in economic planning.
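Moran's I, one of the spatial diagnostics mentioned in this abstract, measures whether neighbouring regions have similar values. The sketch below computes it from a binary contiguity matrix. The four-region layout and the unemployment figures are hypothetical; the study used a customized Queen contiguity matrix for West Java districts, which is not reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for pairs that are neighbours.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical example: four regions in a row, each adjacent to the next one.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
unemployment = [6.0, 7.0, 8.0, 9.0]  # invented values, NOT from the paper

print("Moran's I:", morans_i(unemployment, weights))
```

Positive values indicate that similar values cluster in space, negative values indicate a checkerboard-like pattern, and values near the expectation of -1/(n-1) indicate no spatial autocorrelation.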
Comparison of Expectation-Maximization (EM) Algorithm and Kmeans for District/City Clustering in West Sumatera Province Based on Breadfruit Production
Mayrita, Mayrita Addila Putri;
Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol3-iss3/403
Breadfruit (Artocarpus altilis) is an important food source that is highly nutritious and plays a strategic role in West Sumatra Province. However, challenges such as pests, diseases and marketing constraints affect its cultivation and productivity. This study employed K-means and expectation-maximisation (EM) clustering methods to categorise regions according to their breadfruit cultivation characteristics. The elbow method identified three optimal clusters for K-means and seven for EM. Evaluating the quality of the clusters using the silhouette coefficient produced values of 0.47 for EM and 0.37 for K-means, indicating that EM produced tighter, more distinct clusters. These results suggest that EM is a more effective method for describing the variation in breadfruit production in West Sumatra. With this in mind, the research is expected to inform strategic decision-making aimed at increasing the productivity and added value of breadfruit crops in the area.
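The silhouette coefficient used in this abstract to compare the two clusterings can be computed as in the sketch below, which assumes Euclidean distance and at least two clusters. The points and labels are invented for illustration; the clustering algorithms themselves (K-means and EM) are not shown.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points (Euclidean distance, at least two clusters)."""
    def dist(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = sum(own) / len(own)  # mean distance to the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical data (NOT from the paper): two well-separated groups.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
good_labels = [0, 0, 1, 1]
poor_labels = [0, 1, 0, 1]

print("good grouping:", silhouette_score(points, good_labels))
print("poor grouping:", silhouette_score(points, poor_labels))
```

Values close to 1 mean points sit much nearer to their own cluster than to any other, so a higher average, such as 0.47 against 0.37, indicates the more compact and better separated clustering.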