Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang,
Sumatera Barat
INDONESIA
UNP Journal of Statistics and Data Science
ISSN: -  EISSN: 2985-475X  DOI: 10.24036/ujsds
UNP Journal of Statistics and Data Science (UJSDS) is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can be in the form of research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians across universities.
Articles 236 Documents
Sentiment Analysis of Public Opinion on Rupiah Redenomination on Twitter Using Naive Bayes Classification FIGO RAHMATULLAH; Dila Sari; Rahmat Kurniawan; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/484

Abstract

This study examines public opinion on the Rupiah redenomination policy through sentiment analysis of Twitter data. Redenomination refers to the simplification of currency denominations without changing their real value, a policy that often triggers varied public responses due to concerns such as inflation perception and money illusion. In the digital era, Twitter (currently X) serves as a major platform for real-time public expression, generating large volumes of unstructured textual data suitable for analysis. The objective of this research is to classify public sentiment toward the Rupiah redenomination policy into positive, negative, and neutral categories using the Naive Bayes Classifier, as well as to evaluate the model’s performance. The dataset consists of Indonesian-language tweets collected via the Twitter API using keywords related to redenomination. Data processing involves several stages, including data cleaning, manual labeling, text preprocessing (case folding, tokenization, stopword removal, and stemming), and feature extraction using Term Frequency–Inverse Document Frequency (TF–IDF). The classification results are evaluated using a confusion matrix. The Naive Bayes Classifier achieved an accuracy of approximately 74.84% and a precision of 80%, indicating that the model performs adequately in identifying sentiment patterns. The findings show that neutral sentiment dominates the discussion, suggesting that most users tend to provide informational or observational opinions rather than strong support or opposition. These results are expected to provide insights for policymakers, particularly Bank Indonesia and the government, regarding public acceptance of the redenomination policy, while also contributing to the development of sentiment analysis research on Indonesian social media data.
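The evaluation step described above (accuracy and precision read off a confusion matrix) can be sketched as follows. This is a minimal illustration, not the authors' code: the matrix values are made up, and macro-averaging of precision is an assumption, since the abstract does not say how the single precision figure was aggregated across the three classes.

```python
# Rows are true classes, columns are predicted classes.
LABELS = ["negative", "neutral", "positive"]

# Hypothetical counts for illustration only (not the study's data).
CM = [
    [8, 2, 0],
    [1, 15, 1],
    [1, 3, 9],
]


def accuracy(cm):
    """Share of all tweets whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def precision_per_class(cm):
    """For each class j: correct predictions of j / all predictions of j."""
    k = len(cm)
    result = []
    for j in range(k):
        predicted_j = sum(cm[i][j] for i in range(k))
        result.append(cm[j][j] / predicted_j if predicted_j else 0.0)
    return result


def macro_precision(cm):
    """Unweighted mean of the per-class precisions (an assumed convention)."""
    per_class = precision_per_class(cm)
    return sum(per_class) / len(per_class)


print("accuracy:", accuracy(CM))
print("precision per class:", dict(zip(LABELS, precision_per_class(CM))))
print("macro precision:", macro_precision(CM))
```

Reporting the per-class values next to the aggregate is useful when one class dominates, as the neutral class does in this study.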
Application of the Cox Proportional Hazards Model to Analyze Survival Times in Women with Breast Cancer Rahmadani; Vinna Sulvia; Fathina Nafisa; Septrina Kiki Arisandi; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/485

Abstract

Breast cancer remains one of the leading causes of cancer-related mortality worldwide, highlighting the importance of identifying factors that influence patient survival time. Variations in clinical outcomes among patients indicate the need for appropriate statistical methods to evaluate prognostic factors. This study aims to analyze factors affecting survival time by applying the Cox Proportional Hazards (Cox PH) model. The data consist of breast cancer patient records with several predictor variables, including age at diagnosis, type of breast surgery, chemotherapy, hormone therapy, Nottingham Prognostic Index, and tumor size. The analysis procedure includes testing the proportional hazards assumption and assessing parameter significance using the likelihood ratio test for simultaneous effects and the Wald test for partial effects. The results show that the proportional hazards assumption is satisfied, indicating that the Cox PH model is appropriate for the data. Simultaneous testing reveals that at least one predictor significantly affects survival time, while partial testing identifies type of surgery and chemotherapy as significant factors. The hazard ratio estimates indicate that patients undergoing mastectomy have a lower risk of death compared to those receiving breast-conserving surgery. Conversely, chemotherapy and hormone therapy are associated with a higher risk of death, which may reflect the more severe clinical conditions of patients receiving these treatments. In conclusion, the Cox PH model provides a reliable approach for identifying key factors influencing breast cancer survival and offers important implications for clinical decision-making and treatment planning.
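For readers unfamiliar with the quantities named in this abstract, the standard textbook form of the Cox model and its test statistics is sketched below. The symbols (h_0, beta_j, x_j, p) are generic notation assumed here and are not taken from the paper.

```latex
% Cox proportional hazards model and the hazard ratio of predictor j
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \cdots + \beta_p x_p),
\qquad \mathrm{HR}_j = \exp(\beta_j)

% Likelihood ratio test (simultaneous) and Wald test (partial),
% both asymptotic under the respective null hypotheses
G = -2\left[\ln L(0) - \ln L(\hat\beta)\right] \sim \chi^2_p,
\qquad z_j = \frac{\hat\beta_j}{\mathrm{SE}(\hat\beta_j)}
```

A hazard ratio below 1 corresponds to a lower risk of death relative to the reference category, and a ratio above 1 to a higher risk, which is how the mastectomy and chemotherapy results above are read.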
IHSG Closing Price Prediction on the Indonesian Stock Exchange using the Geometric Brownian Motion Model Sukra Hamna; Devni Prima Sari
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/486

Abstract

As one of the primary benchmarks reflecting the health of the equity market in Indonesia, the Jakarta Composite Index (IHSG) experiences ongoing price movements shaped by a wide spectrum of domestic and international forces. The inherent unpredictability of these movements underscores the need for reliable forecasting methods to guide investors in their decision-making. In response, the present study applies the Geometric Brownian Motion (GBM) model to project the daily closing values of the IHSG, owing to its well-recognized ability to represent the random characteristics inherent in financial time series. The dataset comprises daily closing price records of the IHSG throughout 2025. The analysis includes the calculation of log returns, normality testing using the Kolmogorov-Smirnov test, and estimation of drift and volatility parameters. Forecasting is performed using simulation with 50 and 1000 iterations, where the initial value is based on the last observed closing price. The findings reveal that the GBM model demonstrates a solid capacity to reflect the volatile behavior of IHSG movements, yielding MAPE figures of 4.50% and 2.81%, which correspond to a very high level of predictive precision. A greater number of iterations was found to produce more consistent and dependable projections, while the estimated values broadly align with the overall trajectory of historical data, notwithstanding the element of randomness embedded in the model. Therefore, the GBM model can be considered an effective method for forecasting stock price movements, particularly for highly volatile market indices such as the IHSG.
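The pipeline in this abstract (log returns, drift and volatility estimation, simulation from the last observed price, MAPE) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the price series is hypothetical, the function names are invented here, and averaging the simulated paths into one forecast is an assumption, since the abstract does not say how the 50 or 1000 iterations are combined.

```python
import math
import random
import statistics


def log_returns(prices):
    """Daily log returns ln(S_t / S_{t-1})."""
    return [math.log(prices[i] / prices[i - 1]) for i in range(1, len(prices))]


def estimate_params(prices, dt=1.0):
    """Estimate GBM drift (mu) and volatility (sigma) per unit of dt."""
    r = log_returns(prices)
    sigma = statistics.stdev(r) / math.sqrt(dt)
    # The mean log return equals (mu - sigma^2 / 2) * dt under GBM.
    mu = statistics.mean(r) / dt + 0.5 * sigma ** 2
    return mu, sigma


def simulate_path(s0, mu, sigma, steps, rng, dt=1.0):
    """One GBM path using the exact log-normal update at each step."""
    path = []
    s = s0
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        s *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
        path.append(s)
    return path


def mean_forecast(s0, mu, sigma, steps, n_paths, seed=0):
    """Average of n_paths simulated paths (assumed aggregation rule)."""
    rng = random.Random(seed)
    paths = [simulate_path(s0, mu, sigma, steps, rng) for _ in range(n_paths)]
    return [statistics.mean([p[t] for p in paths]) for t in range(steps)]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * statistics.mean(
        [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    )


# Hypothetical closing prices, not IHSG data.
history = [7000.0, 7050.0, 7020.0, 7100.0, 7080.0, 7150.0]
mu_hat, sigma_hat = estimate_params(history)
forecast = mean_forecast(history[-1], mu_hat, sigma_hat, steps=5, n_paths=1000)
print(mu_hat, sigma_hat, forecast)
```

With more paths the average fluctuates less between runs, which is consistent with the abstract's remark that a larger number of iterations gives more stable projections.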
A Self-Organizing Map Approach for Clustering Provinces Based on Multisectoral Indicators of Stunting Determinants Admi Salma; Riwi Dyah Pangesti; Reny Wulandari
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/487

Abstract

Stunting is a national issue in Indonesia and also a global challenge. It is one of the key priorities outlined in the Sustainable Development Goals (SDGs). The heterogeneity of multisectoral conditions across provinces also contributes to the variation in stunting prevalence in Indonesia. The implementation of uniform policies to address stunting may not yield optimal results due to the diverse needs of each province. Therefore, specific interventions are required to overcome stunting issues. Based on this condition, it is important to cluster provinces based on their characteristics so that the government can determine appropriate interventions for each provincial cluster. Visualization of stunting conditions and multisectoral indicators can also enrich the understanding of each cluster. This study aims to construct clusters of provinces with similar characteristics in terms of multisectoral indicators of stunting determinants. This study applies cluster analysis using a Self-Organizing Map (SOM) algorithm to group provinces. The research steps include data preprocessing, clustering using the SOM algorithm, SOM mapping, and cluster characterization analysis. The results show that three clusters were obtained. The first cluster consists of three provinces characterized by a high maternal mortality rate and a high percentage of exclusive breastfeeding. The second cluster includes nine provinces and is characterized by high risks in maternal and child health as well as economic vulnerability. The third cluster consists of 26 provinces characterized by relatively good living conditions and quality education.
Mapping Anxiety, Developing Solutions: A Statistical Study of Student Anxiety Using The K-Modes Clustering Method Fadhilah Fitri; Fitri Mudia Sari; Fauziah Taslim; Sri Wahyuni
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/491

Abstract

Statistics anxiety is a common issue among university students that can negatively affect their learning process and academic performance. This study aims to identify patterns of statistics anxiety among undergraduate students at Universitas Negeri Padang using the Statistics Anxiety Rating Scale (STARS), which consists of six dimensions. A total of 479 valid responses were analyzed using the k-modes clustering method, which is appropriate for categorical data. The optimal number of clusters was determined using the elbow and silhouette methods, resulting in three clusters. The clustering results reveal three distinct groups of students characterized by high, moderate, and low levels of statistics anxiety. The average silhouette value of 0.52 indicates a moderately well-defined cluster structure. Further analysis shows that each cluster exhibits different patterns across the six anxiety dimensions, highlighting the heterogeneity of students’ responses to statistics. These findings suggest that clustering provides a more informative approach than conventional descriptive analysis in understanding statistics anxiety. The results of this study can serve as a basis for developing targeted strategies to reduce student anxiety in statistics learning.
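To show why k-modes suits categorical responses, here is a minimal sketch of the algorithm: records are compared by simple matching (Hamming) dissimilarity and each cluster is summarized by its column-wise mode instead of a mean. The toy responses, the three-item layout, and the choice of starting modes are all hypothetical; the abstract does not describe the initialization or software used.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each column (ties: first encountered)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(data, init_modes, max_iter=20):
    """Alternate between assigning records and updating cluster modes."""
    modes = list(init_modes)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(modes)), key=lambda j: hamming(row, modes[j]))
            for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        for j in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical anxiety levels on three questionnaire items.
responses = [
    ("high", "high", "high"),
    ("high", "high", "mid"),
    ("mid", "mid", "mid"),
    ("mid", "low", "mid"),
    ("low", "low", "low"),
    ("low", "low", "mid"),
]
labels, modes = k_modes(responses, [responses[0], responses[2], responses[4]])
print(labels)
print(modes)
```

In practice the starting modes are usually chosen at random or by a density-based heuristic and the run is repeated several times, because the result depends on initialization.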
Evaluating Local Parameter Reliability in Hierarchical Geographically Weighted Regression: A Bootstrap and Sign Consistency Approach Fitri Mudia Sari; Muhammad Nur Aidi; Agus Mohamad Soleh; Farit Mochamad Afendi
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/492

Abstract

The Hierarchical Geographically Weighted Regression (HGWR) model is widely used to capture spatial heterogeneity and hierarchical data structures simultaneously. However, the reliability of its local parameter estimates remains a critical issue due to potential variability across locations. This study aims to evaluate the reliability of local parameters in the HGWR model using a bootstrap-based approach combined with sign consistency analysis, using an empirical stunting prevalence dataset in Indonesia. A cluster bootstrap procedure at the provincial level was implemented with 500 replications to generate empirical distributions of parameter estimates, enabling the assessment of statistical significance through confidence intervals. In addition, sign consistency was employed to examine the stability of the direction of local effects across bootstrap replications. The results show that while some local parameters are statistically significant, they do not always exhibit consistent directional effects, indicating potential instability. Conversely, several parameters demonstrate both statistical significance and high sign consistency, suggesting robust local relationships. These findings highlight that relying solely on statistical significance may lead to misleading interpretations of local effects in HGWR models. The combination of bootstrap and sign consistency provides a more comprehensive framework for assessing parameter reliability. This approach contributes to improving the interpretability and robustness of spatial multilevel modeling, particularly in applications involving complex hierarchical and spatial data.
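The two diagnostics in this abstract, a bootstrap confidence interval and sign consistency across replications, can be sketched as below. Everything here is an assumption made for illustration: the data are invented, the percentile interval is one of several possible interval types, and `slope` is a plain OLS slope used as a stand-in estimator, not an HGWR fit.

```python
import math
import random


def cluster_bootstrap_sample(groups, rng):
    """Resample whole clusters (e.g. provinces) with replacement."""
    keys = list(groups)
    drawn = [rng.choice(keys) for _ in keys]
    return [row for key in drawn for row in groups[key]]


def slope(rows):
    """Stand-in estimator: OLS slope of y on x. This is NOT an HGWR fit."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    return sxy / sxx


def bootstrap_estimates(groups, estimator, n_rep=500, seed=0):
    """Re-estimate the parameter on n_rep cluster-bootstrap samples."""
    rng = random.Random(seed)
    return [estimator(cluster_bootstrap_sample(groups, rng)) for _ in range(n_rep)]


def percentile_interval(values, level=0.95):
    """Percentile bootstrap interval (rounded outward to order statistics)."""
    ordered = sorted(values)
    n = len(ordered)
    alpha = 1.0 - level
    lo_idx = math.floor(alpha / 2 * (n - 1))
    hi_idx = math.ceil((1 - alpha / 2) * (n - 1))
    return ordered[lo_idx], ordered[hi_idx]


def sign_consistency(values, reference):
    """Share of replicates whose sign matches the reference estimate."""
    return sum(1 for v in values if v * reference > 0) / len(values)


# Hypothetical (x, y) observations grouped by province.
groups = {
    "A": [(1.0, 2.0), (2.0, 4.0)],
    "B": [(3.0, 6.5), (4.0, 8.0)],
    "C": [(5.0, 10.0), (6.0, 12.5)],
}
full_estimate = slope([row for rows in groups.values() for row in rows])
replicates = bootstrap_estimates(groups, slope, n_rep=500, seed=1)
print(percentile_interval(replicates))
print(sign_consistency(replicates, full_estimate))
```

Resampling whole provinces keeps the within-province dependence intact in every replicate, which is the point of a cluster bootstrap for hierarchical data.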
Poverty Modeling in East Nusa Tenggara Using Fourier Nonparametric Regression with Cosine–Sine Comparison and Hypothesis Testing Narita Yuri Adrianingsih; Andrea Tri Rian Dani; I Nyoman Budiantara; Vita Ratnasari; Yossy Candra; Bintang A. Banewang; Leti S. Gaimau
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/493

Abstract

Poverty is a complex multidimensional issue and remains a major development challenge in Indonesia, particularly in East Nusa Tenggara (NTT), which consistently records one of the highest poverty rates nationally. Conventional parametric approaches, such as linear regression, are often inadequate to capture the nonlinear and complex relationships between socioeconomic factors and poverty levels. Therefore, this study proposes a nonparametric regression approach based on Fourier series to model poverty in NTT. The novelty of this research lies in the systematic comparison between cosine-based and sine-based Fourier components within a nonparametric regression framework, combined with inferential statistical testing to identify significant determinants of poverty. The study uses cross-sectional data from 22 districts/cities in NTT for the year 2025. Model estimation is conducted using the Ordinary Least Squares (OLS) method, while the optimal oscillation parameter is determined using Generalized Cross-Validation (GCV). Model performance is evaluated using MSE, RMSE, MAPE, and coefficient of determination (R²). The results show that the cosine-based Fourier model with three oscillations outperforms the sine-based model, achieving MSE of 1.903, RMSE of 1.379, MAPE of 5.817%, and R² of 95.146%. Hypothesis testing indicates that all predictor variables significantly influence poverty levels both simultaneously and partially. These findings demonstrate that the Fourier nonparametric regression approach is highly effective in capturing complex and fluctuating poverty patterns, and it provides a more accurate and interpretable model for supporting targeted poverty alleviation policies.
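The four goodness-of-fit measures reported in this abstract are related in simple ways, which the sketch below makes explicit. The observed and fitted values are made up for illustration, and R² is returned as a fraction (the abstract reports it as a percentage).

```python
import math


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the response."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination as a fraction (multiply by 100 for %)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and fitted values (not the NTT data).
observed = [10.0, 20.0, 30.0, 40.0]
fitted = [12.0, 18.0, 33.0, 39.0]
print(mse(observed, fitted), rmse(observed, fitted))
print(mape(observed, fitted), r_squared(observed, fitted))
```

As a consistency check on the reported figures, the square root of the stated MSE of 1.903 is about 1.379, which matches the stated RMSE.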
Agricultural Involution in Indonesia: A Generalized Structured Component Analysis (GSCA) Approach with Land and Labor Interaction Effects Urwawuska Ladini; Sella Nofriska Sudrimo; Dahlia Misrika
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/494

Abstract

Indonesian agriculture faces an economic paradox where the sector remains a primary employer despite low wages and stagnant GDP contributions compared to industry. This study aims to analyze and quantify the phenomenon of agricultural involution in Indonesia from 2017 to 2023 by simultaneously examining the effects of land, labor, productivity, and land–labor interaction on agricultural output across 34 provinces. Generalized Structured Component Analysis (GSCA) with an Alternating Least Squares (ALS) approach is employed because of its ability to handle mixed formative–reflective measurement models and accommodate latent variable interaction effects, capabilities unavailable in conventional covariance-based SEM or linear regression. The results indicate that land capacity is the dominant determinant of agricultural output with a path coefficient of 0.958, signaling that growth remains extensive rather than intensive. Crucially, labor intensity is found to have a significant negative effect on productivity and total output, confirming the law of diminishing marginal returns and the presence of labor surpluses that exceed optimal points. Furthermore, the interaction between land and labor yields a significant negative coefficient (-0.109), proving that demographic pressure on limited land exacerbates inefficiency and output destruction. Spatial post-hoc analysis indicates that agricultural involution is no longer confined to Java but has evolved into a national phenomenon, as demonstrated by the absence of significant disparities in labor-to-land ratios and productivity between Java and other regions. These findings suggest that sustainable transformation requires integrated policies for land protection, labor restructuring toward non-agricultural sectors, and technological modernization to break the cycle of involution.
Comparison of District/City Clusters in West Sumatra Province 2019–2025 Based on Labor Indicators Using K-Means Method Naila Marettania; Zilrahmi; Mellisa Ayuningtyas
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/495

Abstract

This study is motivated by the differences in labor conditions among regencies/cities in West Sumatra Province, as indicated by the Open Unemployment Rate (OUR) and the Labor Force Participation Rate (LFPR). In addition, the impact of the COVID-19 pandemic and the economic recovery process during the 2019–2025 period are assumed to have caused changes in labor characteristics across regions. However, the patterns of similarities and differences in labor conditions among regions have not been clearly identified, making it necessary to conduct a regional clustering analysis based on labor characteristics. This study aims to analyze the clustering of regencies/cities in West Sumatra Province based on the OUR and LFPR indicators during 2019–2025. The data used were obtained from the Central Statistics Agency, covering 19 regencies/cities. The analytical method applied was K-Means clustering using Euclidean distance, while cluster validation was conducted using the Silhouette Coefficient. This study used two clusters to facilitate the interpretation of results. The findings show that the regencies/cities in West Sumatra Province were divided into two clusters with different characteristics. Cluster 1 represents regions with better labor conditions, characterized by lower OUR and higher LFPR, while Cluster 2 represents regions with relatively poorer labor conditions, characterized by higher OUR and lower LFPR. Cluster membership changed from year to year, indicating dynamic labor conditions across regions. The results of this study are expected to serve as a basis for formulating more targeted labor policies according to the characteristics of each region.
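The method in this abstract, K-Means with Euclidean distance validated by the silhouette coefficient, can be sketched for two indicators as follows. The (OUR, LFPR) pairs and the starting centroids are hypothetical, and the features are left unscaled for brevity; whether the study standardized its indicators is not stated.

```python
import math


def k_means(points, init_centroids, max_iter=100):
    """Lloyd's algorithm with Euclidean distance and fixed starting centroids."""
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        own = [
            math.dist(p, q)
            for k, q in enumerate(points)
            if labels[k] == labels[i] and k != i
        ]
        if not own:
            scores.append(0.0)  # convention for single-member clusters
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for k, q in enumerate(points) if labels[k] == other)
            / labels.count(other)
            for other in set(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical (OUR %, LFPR %) pairs for six regions, not BPS data.
regions = [(4.0, 72.0), (4.5, 71.0), (5.0, 70.0), (7.0, 64.0), (7.5, 63.0), (8.0, 65.0)]
labels, centroids = k_means(regions, [regions[0], regions[3]])
print(labels, centroids)
print(silhouette(regions, labels))
```

Running the same procedure separately for each year and comparing the labels is one way to observe the year-to-year changes in cluster membership that the abstract reports.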
K-Medoids Clustering Analysis of Regional Development in West Sumatra Based on Socioeconomic Indicators Kayla Faradina; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol4-iss2/496

Abstract

Regional development disparities among districts and cities in West Sumatra Province remain a persistent challenge, reflected in significant differences across economic, social, and employment indicators. This study aims to cluster 19 districts/cities in West Sumatra Province based on socioeconomic indicators using the K-Medoids clustering method. The variables include GRDP per capita, economic growth rate, GRDP percentage distribution, Human Development Index (HDI), poverty rate, and open unemployment rate, using 2024 data obtained from the Central Bureau of Statistics (BPS) of West Sumatra Province. The optimal number of clusters was determined using the Elbow method, resulting in three clusters. Cluster 1 consists of 12 districts characterized by the lowest average GRDP per capita and HDI, along with the highest poverty rate. Cluster 2 comprises only Kota Padang, which recorded the highest values across most indicators including GRDP per capita, economic growth rate, and HDI, yet also exhibited the highest open unemployment rate. Cluster 3 includes 6 cities with relatively high HDI and the lowest poverty rate among the three clusters. Cluster validation using the Davies-Bouldin Index (DBI) produced a value of 0.8341, indicating that the clustering results are optimal. The findings are expected to provide a reference for local governments and the Regional Development Planning Agency (Bappeda) of West Sumatra Province in formulating more targeted regional development policies based on the characteristics of each cluster.
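The validation measure cited in this abstract, the Davies-Bouldin Index, can be computed directly from a clustering, as sketched below. The points are invented, and the sketch uses cluster centroids as in the standard definition; whether the study computed it around centroids or medoids is not stated.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a cluster."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (lists of points).

    For each cluster, take the worst-case ratio of combined within-cluster
    scatter to between-centroid distance, then average over clusters.
    Lower values indicate more compact, better separated clusters.
    """
    cents = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, cents[i]) for p in c) / len(c)
        for i, c in enumerate(clusters)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical two-cluster example, not the West Sumatra data.
example = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 0.0), (10.0, 2.0)],
]
print(davies_bouldin(example))
```

Because the index has no fixed threshold, it is most informative when compared across candidate numbers of clusters, with the smallest value preferred.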