Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang,
Sumatera Barat
INDONESIA
UNP Journal of Statistics and Data Science
ISSN: -     EISSN: 2985-475X     DOI: 10.24036/ujsds
UNP Journal of Statistics and Data Science (UJSDS) is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can take the form of research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians from across universities.
Articles 236 Documents
Cluster Analysis of Earthquakes on the Island of Sumatera in 2024 Using the DBSCAN Method Zahrani Asyati Zulika; Yenni Kurniawati
UNP Journal of Statistics and Data Science Vol. 4 No. 1 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss1/466

Abstract

Earthquakes are among the most destructive and unpredictable natural disasters. Sumatera Island, located along the Semangko Fault, experiences frequent seismic activity caused by the interaction between the Indo-Australian and Eurasian plates. In this study, the DBSCAN method is used to cluster earthquake events in Sumatera in 2024 based on magnitude and depth. The data set, collected by the Meteorology, Climatology, and Geophysics Agency (BMKG), includes 163 earthquake events that occurred on Sumatera Island during 2024. The clustering process identified two main clusters: one representing deep earthquakes in inland areas and another consisting of shallow earthquakes along the western offshore region, near the megathrust zone. The Silhouette Coefficient was used to validate the clustering outcome; the resulting value of 0.58 indicates good cluster formation. These findings provide insights into seismic patterns in Sumatera and can support disaster mitigation efforts.
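As a rough illustration of the clustering and validation steps described in this abstract, the sketch below implements a textbook DBSCAN and a mean Silhouette Coefficient in plain Python. The points, `eps`, and `min_pts` are invented for illustration; they are not the BMKG data or the authors' settings, and the two features are assumed to be already standardized.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels

def mean_silhouette(points, labels):
    """Mean silhouette over clustered (non-noise) points; needs >= 2 clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        if label != NOISE:
            clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for label, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            a = sum(dist(p, q) for q in members) / (len(members) - 1)
            b = min(sum(dist(p, q) for q in other) / len(other)
                    for key, other in clusters.items() if key != label)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Invented (standardized depth, standardized magnitude) pairs.
points = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (0.1, -0.1),
          (3.0, 3.0), (3.1, 3.1), (3.2, 3.0), (3.1, 2.9),
          (10.0, -10.0)]
labels = dbscan(points, eps=0.5, min_pts=3)
score = mean_silhouette(points, labels)
```

The isolated last point is labeled as noise and is excluded from the silhouette average, which is one common convention; the abstract does not say how noise points were treated.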
Monthly Rainfall Forecasting in Pesisir Selatan Regency Using the Autoregressive Integrated Moving Average (ARIMA) Model Nisa Ulhusna; Sulistiowati Dwi; Fitri Fadhilah
UNP Journal of Statistics and Data Science Vol. 4 No. 1 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss1/468

Abstract

Rainfall is a climate variable that plays a crucial role in agricultural planning, water resource management, and hydrometeorological disaster mitigation. Therefore, a forecasting method capable of adequately describing the temporal patterns of rainfall data is required. This study aims to forecast monthly rainfall in Pesisir Selatan Regency using the Autoregressive Integrated Moving Average (ARIMA) method. The data used in this study are monthly rainfall data for the period 2015–2024. The analysis stages include missing data imputation, Box–Cox transformation, stationarity testing using the Augmented Dickey–Fuller (ADF) test, model identification through ACF and PACF plots, parameter estimation, and model evaluation based on the Akaike Information Criterion (AIC), residual diagnostic tests, and forecasting accuracy using Mean Absolute Percentage Error (MAPE). The results show that the ARIMA(0,1,1) model is the best model, as indicated by the lowest AIC value and residuals that satisfy the white noise assumption. The forecasting accuracy evaluation yields a MAPE value of 55.05%, indicating that the model’s ability to capture monthly rainfall variability is still limited. Rainfall forecasting for the period January to December 2025 produces relatively constant forecast values, reflecting the limitations of the ARIMA(0,1,1) model in representing seasonal variations. Therefore, this model is more suitable as a baseline approach for rainfall forecasting in Pesisir Selatan Regency. Future studies are recommended to apply models that incorporate seasonal components or external variables to improve forecasting accuracy.
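The flat forecasts mentioned in this abstract follow from the structure of ARIMA(0,1,1) without drift: every forecast beyond the first step equals the first one. A minimal sketch, assuming a fixed and purely hypothetical `theta`, invented rainfall values, and no Box–Cox transformation:

```python
def arima011_forecast(series, theta, horizon):
    """Forecast y_t = y_{t-1} + e_t + theta * e_{t-1} for a given theta."""
    prev_error = 0.0  # assumption: the pre-sample error is zero
    for previous, current in zip(series, series[1:]):
        one_step_prediction = previous + theta * prev_error
        prev_error = current - one_step_prediction
    # Future errors have expectation zero, so every horizon gets the same value.
    next_value = series[-1] + theta * prev_error
    return [next_value] * horizon

def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent; undefined if an actual is 0."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)

rainfall = [310.0, 150.0, 420.0, 95.0, 380.0, 210.0]  # invented monthly totals (mm)
forecasts = arima011_forecast(rainfall, theta=-0.8, horizon=12)
```

Because MAPE divides by the observed value, months with very low rainfall can inflate it sharply, which may be one reason a rainfall series yields a high MAPE.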
Analysis of the Influence of Job Resources and Leadership Quality on Job Satisfaction Using Structural Equation Modeling Azizah Apriyerni; Nisa Ulhusna; Rahmadani; Mira Meilisa
UNP Journal of Statistics and Data Science Vol. 4 No. 1 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss1/469

Abstract

Job Satisfaction is an essential factor influencing employee performance, commitment, and organizational sustainability. Low levels of Job Resources and suboptimal Leadership Quality are common causes of decreased job satisfaction across various institutions. This study aims to analyze the effect of Job Resources and Leadership Quality on Job Satisfaction using the Structural Equation Modeling (SEM) method. The research data were obtained from a Likert-scale survey (1-8) consisting of three latent variables and their respective indicators, and were analyzed through Confirmatory Factor Analysis (CFA) and structural model assessment. The results of the CFA indicate that all indicators meet the criteria for validity and reliability, with factor loadings above 0.50, a Composite Reliability (CR) value of 0.9667, and an Average Variance Extracted (AVE) value of 0.6769. The goodness-of-fit evaluation shows that the final model is highly acceptable, as reflected by a low Chi-square/df value, RMSEA = 0.005, and CFI, TLI, GFI, and NFI values of 1.000. The structural analysis further demonstrates that Job Resources have a positive and significant impact on Job Satisfaction. Simultaneously, both variables contribute significantly to explaining variations in Job Satisfaction. This study highlights that enhancing Job Resources and improving Leadership Quality are crucial strategies to strengthen employee Job Satisfaction. The findings provide empirical insight that can assist organizations in developing more effective and sustainable human resource management policies.
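For readers unfamiliar with the two reliability measures quoted here, the sketch below computes Composite Reliability and Average Variance Extracted from standardized factor loadings using the commonly cited Fornell–Larcker formulas. The loadings are hypothetical, and the abstract does not state whether its CR and AVE were computed per construct or overall.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    # For standardized loadings, each indicator's error variance is 1 - loading^2.
    error_variance = sum(1.0 - loading ** 2 for loading in loadings)
    return total ** 2 / (total ** 2 + error_variance)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)

hypothetical_loadings = [0.8, 0.8, 0.8]  # invented, not taken from the study
cr = composite_reliability(hypothetical_loadings)
ave = average_variance_extracted(hypothetical_loadings)
```

Common rules of thumb treat CR above 0.70 and AVE above 0.50 as acceptable, which is consistent with the thresholds implied in the abstract.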
Predicting the Future: A Forecast of Bukittinggi's Original Local Revenue from 1996 to 2024 Fedisha Elfiri Fedisha; Fadhilah Fitri; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/473

Abstract

In the past decade, Bukittinggi City’s locally generated revenue (PAD) has experienced considerable instability. A significant decline occurred during the 2020 pandemic, followed by external disruptions such as the 2024 Mount Marapi eruption. These conditions complicate regional financial planning and highlight the importance of reliable forecasting. This study aims to forecast PAD for the 2025–2029 period using the ARIMA (Autoregressive Integrated Moving Average) method. Annual data from 1996–2024 were obtained from official publications of Indonesia’s Central Bureau of Statistics (BPS) Bukittinggi. The analysis procedure included exploratory data analysis, variance stationarity testing using Box-Cox transformation, mean stationarity testing through the Augmented Dickey-Fuller test supported by ACF and PACF plots, tentative model identification, parameter estimation, residual diagnostics using the Ljung-Box and Shapiro-Wilk tests, and model selection based on the smallest MAPE value. The results showed that the data became stationary after Box-Cox transformation and second-order differencing. Among the candidate models, ARIMA(3,2,0) was selected as the best model because all parameters were statistically significant (p-value < 0.05), the residuals satisfied the white noise assumption, and the model produced the lowest MAPE value. Forecasting results indicate an increasing PAD trend from approximately 240.23 million Rupiah in 2025 to 429.57 million Rupiah in 2029. However, prediction intervals widened over time, indicating increasing uncertainty in long-term forecasts. Therefore, the local government should implement adaptive fiscal policies and strengthen regional revenue sources to anticipate future PAD fluctuations.
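The "second-order differencing" step in this abstract corresponds to d = 2 in ARIMA(3,2,0). A minimal sketch of the operation on an invented series (the Box-Cox transformation and the ADF test are not reproduced here):

```python
def difference(series, order=1):
    """Apply first differencing `order` times; each pass shortens the series by one."""
    for _ in range(order):
        series = [later - earlier for earlier, later in zip(series, series[1:])]
    return series

# Invented series with a quadratic trend: one difference leaves a linear trend,
# a second difference leaves a constant, which is the kind of behaviour that
# motivates choosing d = 2.
quadratic = [1, 4, 9, 16, 25]
once = difference(quadratic, order=1)
twice = difference(quadratic, order=2)
```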
Random Forest Algorithm Implementation for Air Quality Classification in DKI Jakarta Based on ISPU Khairanisa Salsabila; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/474

Abstract

Air quality is an essential factor that has a direct impact on human health. High concentrations of air pollutants can cause various health problems in both the short and long term. This study aims to classify air quality in DKI Jakarta from Air Pollution Standard Index (ISPU) data using the Random Forest algorithm. The dataset covers the period from 2021 to 2025 and includes the air pollutant parameters PM10 and PM2.5 particulate matter, carbon monoxide (CO), nitrogen dioxide (NO2), sulfur dioxide (SO2), and ozone (O3). The research method employs a supervised learning approach in which the data are stratified and evaluated using K-Fold Cross Validation (k = 10) to ensure objective and stable model performance. Model performance was measured using Accuracy, Precision, Recall, and F1-Score metrics, along with Confusion Matrix and Feature Importance analyses. The results show that the Random Forest model classifies air quality categories with excellent performance, reaching 100% Accuracy on training data and 98.44% on testing data. The Confusion Matrix analysis indicates that most data in each air quality category are correctly classified. Furthermore, the Feature Importance analysis reveals that PM2.5 is the most influential parameter in determining air quality categories. Therefore, this study indicates that the Random Forest algorithm is effective for air quality classification and can function as a decision-support tool for air pollution control and management in DKI Jakarta.
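To make the stratified 10-fold evaluation concrete, the sketch below assigns samples to folds so that each fold keeps roughly the same class mix as the full dataset. It is a simplified, deterministic version (no shuffling), and the class names and counts are invented rather than taken from the ISPU data.

```python
from collections import Counter, defaultdict

def stratified_fold_ids(labels, k):
    """Give each sample a fold id in 0..k-1, dealing each class out round-robin."""
    seen_per_class = defaultdict(int)
    fold_ids = []
    for label in labels:
        fold_ids.append(seen_per_class[label] % k)
        seen_per_class[label] += 1
    return fold_ids

# Invented, imbalanced label list standing in for air quality categories.
labels = ["Good"] * 20 + ["Moderate"] * 70 + ["Unhealthy"] * 10
fold_ids = stratified_fold_ids(labels, k=10)

# Class counts inside each fold; fold f serves as the test set in round f.
fold_composition = {
    fold: Counter(l for l, f in zip(labels, fold_ids) if f == fold)
    for fold in range(10)
}
```

With a rare class such as "Unhealthy" in this toy data, an unstratified split could leave some folds with no examples of it at all, which is the usual motivation for stratifying.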
Factors Affecting Turnover Intention: A Survival Analysis Approach with the Stratified Cox Model Reihan Dani Eka Saputra; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/475

Abstract

The phenomenon of employee resignation or turnover in Indonesia has reached a critical point that threatens operational stability and organizational competitiveness in the global market. The primary challenge faced by human resource practitioners is a reliance on static statistical models that fail to capture the temporal dimension and the evolving dynamics of risk. Conventional linear or logistic regression models often cannot accommodate censored data and may violate the proportionality assumption when applied to complex categorical variables such as profession. This study aims to model the determinants of turnover intention, including age, gender, and mode of transportation, by employing a more adaptive survival analysis approach. The main focus of the research is the application of a stratified Cox Proportional Hazards model to address violations of the Proportional Hazards assumption for the profession variable. Based on an analysis of 1,129 observations, the study identifies how turnover risk varies significantly across profession strata. We developed and compared two model configurations, with and without interaction terms, using the Akaike Information Criterion (AIC). While the non-interaction model proved optimal for overall prediction (AIC: 5124.104), the interaction model revealed nuanced dynamics across professional strata. Key findings indicate that age generally increases turnover risk by 6.3% per year (HR: 1.063), and walking to work provides a protective effect, reducing risk by 13.6% (HR: 0.864) compared to bus usage. However, professional context significantly modulates these effects: in the 'Manage' stratum, age serves as a stabilizer (HR: 0.822), whereas male teachers face a risk 200.8% higher than their female counterparts (HR: 3.008). Furthermore, car usage in the 'Consult' stratum leads to a dramatic 423.5% increase in turnover risk (HR: 5.235). These results underscore the necessity of strata-specific retention strategies that prioritize workplace accessibility and demographic inclusivity. This study provides a robust data-driven framework for organizations to maintain workforce stability amidst the evolving labor landscape in Indonesia.
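The percentages quoted in this abstract follow directly from the hazard ratios: the change in hazard is (HR - 1) × 100%. The short sketch below reproduces that arithmetic for the reported values; the dictionary keys are descriptive labels chosen here, not variable names from the study.

```python
def percent_change_in_hazard(hazard_ratio):
    """Percentage change in hazard relative to the reference level or per unit."""
    return (hazard_ratio - 1.0) * 100.0

# Hazard ratios as reported in the abstract.
reported_hazard_ratios = {
    "age, per year (overall)": 1.063,
    "walking vs. bus (overall)": 0.864,
    "age, per year ('Manage' stratum)": 0.822,
    "male vs. female (teachers)": 3.008,
    "car usage ('Consult' stratum)": 5.235,
}

percent_changes = {
    name: round(percent_change_in_hazard(hr), 1)
    for name, hr in reported_hazard_ratios.items()
}
```

A ratio below 1, such as 0.822, gives a negative value (here a 17.8% lower hazard per additional year of age), which is what the abstract describes as a stabilizing effect.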
Stock Price Forecasting of PT Bank Rakyat Indonesia (Persero) Tbk Using the Support Vector Regression Method Widya Febriani Widya; Dony Permana
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/476

Abstract

Stock price forecasting is an important activity in the capital market because stock price movements tend to be nonlinear and volatile over time. PT Bank Rakyat Indonesia (Persero) Tbk (BBRI) is a blue-chip stock with high liquidity and strong fundamentals, making it an appropriate subject for forecasting research. This study aims to predict BBRI’s stock price using the Support Vector Regression (SVR) method, which is known for its ability to model nonlinear relationships and minimize overfitting. The data used consist of BBRI’s daily closing prices from January 2020 to December 2024. Before modeling, the data were normalized using the Min–Max method and divided into training and testing sets with an 80:20 ratio. The initial baseline model employed an SVR with a linear kernel. The model was then optimized using the Radial Basis Function (RBF) kernel through Grid Search Optimization combined with time-series cross-validation to determine the best parameter combination. Optimal parameters were selected based on the lowest Root Mean Square Error (RMSE). The results show that the SVR RBF model outperformed the linear model in capturing the nonlinear patterns of BBRI’s stock price. During testing, the optimized model achieved an RMSE of 0.022054, indicating high predictive accuracy. The optimized SVR model was subsequently used to forecast stock prices for the next period and demonstrated relatively stable yet dynamic price movements. Overall, the findings confirm that the SVR method is effective and reliable for stock price forecasting and can serve as a valuable reference for investors and future financial research.
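The preprocessing and scoring steps named in this abstract (Min–Max normalization, an 80:20 split, and RMSE) can be sketched in a few lines. The prices are invented, and two choices below are assumptions rather than details from the study: the split is chronological, and the scaling bounds are taken from the training portion only so that no test information leaks into preprocessing.

```python
from math import sqrt

def chronological_split(series, train_fraction=0.8):
    """Keep time order: the first part trains, the remainder tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

def fit_min_max(train):
    """Scaling bounds learned from the training data only (an assumption here)."""
    return min(train), max(train)

def scale(values, low, high):
    """Min-Max scaling; test values may fall outside [0, 1]."""
    return [(value - low) / (high - low) for value in values]

def rmse(actual, predicted):
    """Root Mean Square Error, in the same units as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

prices = [4000.0 + 10.0 * day for day in range(10)]  # invented closing prices
train, test = chronological_split(prices)
low, high = fit_min_max(train)
train_scaled = scale(train, low, high)
test_scaled = scale(test, low, high)
```

If an RMSE is computed on scaled values like these, it is expressed in normalized units rather than in rupiah, so it should be compared with the 0 to 1 range and not with raw prices.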
Hybrid LBFA-Based Feature Selection for Improving Machine Learning Classification Performance in Heart Disease Prediction Hana Azizah; Eni Sumarminingsih; Adji Achmad Rinaldo Fernandes
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/478

Abstract

Feature selection and feature engineering are essential steps in developing accurate machine learning models, particularly when dealing with imbalanced datasets and redundant variables. However, many feature augmentation methods are often applied without a consistent preprocessing strategy, which can reduce model reliability and increase the risk of information leakage. To overcome this issue, this study proposes a hybrid classification framework that combines CatBoost-based feature selection with two feature augmentation techniques: LOGIT transformation and Log Density Ratio (LDR). A structured preprocessing pipeline was designed to ensure consistency throughout the modeling process. One-hot encoding was applied for the LOGIT transformation, while numerical standardization was used for LDR estimation. The generated features were then integrated with the selected original variables to produce richer feature representations for classification. The proposed framework was evaluated using the Heart Disease dataset with three gradient boosting algorithms, namely LightGBM, XGBoost, and CatBoost. Model performance was assessed using accuracy, precision, sensitivity, specificity, and F1-score. The results show that the proposed approach consistently improved classification performance across all models. Among the tested models, LightGBM combined with LOGIT and LDR achieved the best performance, obtaining an accuracy of 0.9618, precision of 0.9485, sensitivity of 0.9620, specificity of 0.9625, and F1-score of 0.9552. These findings suggest that combining feature selection with structured feature augmentation can significantly improve predictive performance in imbalanced classification tasks.
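The five evaluation measures listed in this abstract all derive from the four cells of a binary confusion matrix. A minimal sketch with invented counts (not the study's results):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Hypothetical confusion-matrix counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

On imbalanced data, sensitivity and specificity are worth reading alongside accuracy, since a model can reach high accuracy while missing most cases of the minority class.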
Panel Data Regression with Driscoll-Kraay Standard Errors: An Analysis of Crime and Socioeconomic Indicators in West Sumatra (2017–2024) Andini Diva Luthfiyah; Dhio Ervandi; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/479

Abstract

Criminal behavior is a complex social issue that threatens public safety and hinders regional development. In Indonesia, the crime rate varies across provinces and is influenced by multiple socioeconomic and structural factors. In West Sumatra Province, fluctuations in crime risk over time highlight the need for a deeper analysis of its determining factors. Understanding these factors is essential for the government to formulate effective and targeted crime prevention policies. This study aims to analyze the determinants of crime risk in West Sumatra Province using panel data from 2017 to 2024, covering 19 districts and cities, allowing for a more robust and comprehensive evaluation of both temporal and cross-sectional variations. The variables examined include the open unemployment rate, poverty rate, percentage of youth not in employment, education, or training (NEET), and the COVID-19 pandemic as a dummy variable. Panel data regression analysis was employed, and the results indicate that the most appropriate model is the Random Effects Model (REM). The findings show that the open unemployment rate and the pandemic variable have a significant effect on crime risk at the 5% significance level, while the poverty rate is significant at the 10% level. These results provide valuable insights for policymakers in addressing the root causes of crime in West Sumatra through employment generation, poverty alleviation, and preparedness for crisis situations.
Spatial Analysis of Open Unemployment Rate in West Java Province Using the Spatial Autoregressive Model Zulfadly Harman Harahap; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/482

Abstract

Unemployment remains a major socio-economic issue in West Java Province. The Open Unemployment Rate (OUR) is affected not only by local conditions but also by the circumstances of adjacent regions, which indicates spatial interdependence. This research aims to analyze the spatial pattern of OUR in West Java and identify the influencing factors using the Spatial Autoregressive (SAR) approach. The study uses cross-sectional secondary data from all regencies and cities in West Java for the year 2023. Moran’s I indicates positive spatial dependence, suggesting that regions with high OUR are typically surrounded by regions with similarly high unemployment rates. Based on the Lagrange Multiplier test, the SAR model was chosen. Estimation results show that population growth rate and government expenditure significantly affect OUR. Additionally, the spatial lag coefficient is positive and significant, suggesting spatial spillover effects. These findings highlight the importance of incorporating spatial perspectives in formulating regional employment policies.
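Moran's I, the statistic used here to detect spatial dependence, can be computed directly from a vector of regional values and a spatial weights matrix. The sketch below uses a toy chain of four regions with binary adjacency weights; the values and the weights matrix are invented and are not the West Java data or the authors' weighting scheme.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    deviations = [value - mean for value in values]
    weight_sum = sum(sum(row) for row in weights)  # S0
    cross_products = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    squared_deviations = sum(d * d for d in deviations)
    return (n / weight_sum) * (cross_products / squared_deviations)

# Four regions in a row; each region neighbours only the ones next to it.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain_weights)       # similar neighbours
alternating = morans_i([1.0, 4.0, 1.0, 4.0], chain_weights)  # dissimilar neighbours
```

A positive value, as for the smoothly increasing series, matches the pattern the abstract describes: high-unemployment regions tend to sit next to other high-unemployment regions.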