Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang,
Sumatera Barat
INDONESIA
UNP Journal of Statistics and Data Science
ISSN: -     EISSN: 2985-475X     DOI: 10.24036/ujsds
UNP Journal of Statistics and Data Science is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can be research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians from across universities.
Articles 202 Documents
Logit and Complementary Log-Log Modeling in the Case of Factors Affecting Heart Failure Disease MAWARNI, IGA; Asyifa Dwi Ayshah; Dhiyaa Fitri Yafe; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/421

Abstract

Heart failure is one of the leading causes of morbidity and mortality globally. Heart disease is caused by plaque that builds up in the coronary arteries, which supply oxygen to the heart muscle. This study aims to identify the factors that affect heart failure and the size of their influence. The analysis applies logistic regression with logit and complementary log-log modeling to data on 918 patients with heart failure disease, and also determines which of the two models fits better. The results indicate that Age, Gender, Blood Sugar, and Chest Pain have significant effects on the likelihood of Heart Failure. Specifically, higher blood sugar levels and the presence of chest pain were found to increase the probability of heart failure, while gender and age showed varying effects across different age groups. Based on the model comparison, the Logit model demonstrated better fit and predictive accuracy than the Complementary Log-Log model, as reflected by its lower AIC value of 897.43.
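As a rough illustration of the two link functions compared in this abstract, the sketch below maps a linear predictor to a probability under the logit and the complementary log-log links, and computes AIC from a log-likelihood. The function names and the example inputs are assumptions made for illustration; they are not taken from the article.

```python
import math


def inv_logit(eta: float) -> float:
    """Inverse logit link: symmetric around eta = 0."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_cloglog(eta: float) -> float:
    """Inverse complementary log-log link: asymmetric around eta = 0."""
    return 1.0 - math.exp(-math.exp(eta))


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion; the model with the lower value is preferred."""
    return 2 * n_params - 2 * log_likelihood


# Compare the two links on a few hypothetical linear predictor values.
for eta in (-2.0, 0.0, 2.0):
    print(eta, round(inv_logit(eta), 4), round(inv_cloglog(eta), 4))
```

Both links give probabilities in (0, 1); they differ mainly in the tails, which is why the two models can fit the same data differently and are compared by AIC.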
Evaluation of Prognosis and Duration of Survival in Breast Cancer Patients Using the Cox PH Model Meliza, Dela; Tessy Octavia Mukhti; Riza Sasmita; Celsy Aprotama; Rahmat Kurniawan
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/422

Abstract

Breast cancer is the leading cause of cancer-related deaths among women in Indonesia. Late detection and delayed treatment contribute significantly to this high mortality rate, as many patients seek medical care only after reaching advanced stages. Early detection through Breast Self-Examination (BSE) and timely intervention can improve survival rates and quality of life. This study aims to evaluate the survival duration and influencing factors for breast cancer patients using clinical and genomic data from the METABRIC dataset, encompassing 1,980 primary breast cancer cases. The study employs survival analysis using Kaplan-Meier curves, Log-rank tests, and Cox proportional hazards regression to analyze the data. Results indicate significant differences in survival rates based on type of surgery and chemotherapy, while age at diagnosis shows no significant effect. The Cox proportional hazards model reveals that patients undergoing mastectomy have a 0.725 lower risk of death compared to those not undergoing the procedure, and patients receiving chemotherapy have a 1.869 higher risk of death. The findings underscore the importance of early and appropriate treatment in improving survival outcomes. This study contributes to the understanding of factors influencing breast cancer survival, aiding in better clinical decision-making and patient management strategies.
Keywords: Breast Cancer, Cox Regression, Kaplan-Meier, Survival Analysis, Treatment Factors.
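The Kaplan-Meier estimator named in this abstract can be sketched in a few lines. The version below is a minimal illustration on invented data, not the article's code: `times` are follow-up durations, `events` are 1 for an observed death and 0 for censoring, and `hazard_ratio` simply exponentiates a Cox coefficient (a log hazard ratio). All names are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set.
        n_at_risk -= leaving
    return curve


def hazard_ratio(coefficient: float) -> float:
    """Convert a Cox regression coefficient into a hazard ratio."""
    return math.exp(coefficient)


# Invented toy data: the subject at time 2 is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

A hazard ratio below 1 indicates a lower hazard relative to the reference group, and a value above 1 indicates a higher hazard.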
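Metode DBSCAN dalam Pengelompokan Provinsi di Indonesia Berdasarkan Rasio Tenaga Kesehatan dan Tenaga Medis pada Tahun 2023 [The DBSCAN Method for Clustering Provinces in Indonesia Based on the Ratio of Health Workers and Medical Personnel in 2023] Maharani, Listia; Martha, Zamahsary; Permana, Dony; Zilrahmi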
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/423

Abstract

Health is a fundamental right of every citizen, and this right is realized through health services. Good health services require an adequate ratio of health and medical personnel. In practice, however, many provinces still have a shortage of health and medical personnel. Clustering is therefore carried out to help the government group provinces with similar ratios of health and medical personnel in Indonesia in 2023. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is one of the clustering methods that can be used for this purpose. Using the DBSCAN method, two clusters were obtained with a silhouette coefficient value of 0.49. Cluster 0 is called noise because the observation points in group 0 are outliers. Cluster 0 consists of provinces with a higher ratio of healthcare and medical personnel than cluster 1.
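To make the role of the noise group concrete, here is a minimal pure-Python sketch of DBSCAN. It follows the abstract's convention of labelling noise as 0 and numbering clusters from 1; the parameter names `eps` and `min_pts`, and the toy points, are assumptions for illustration and do not come from the article.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster number (1, 2, ...) or 0 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = 0  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == 0:
                labels[j] = cluster  # noise reached from a core point: border
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is a core point, so keep expanding
    return labels


# Three nearby points form one cluster; the distant point is noise.
print(dbscan([(0, 0), (0, 1), (1, 0), (10, 10)], eps=1.5, min_pts=3))
```

Points that end up with label 0 are not dense enough to join any cluster, which is why the abstract describes that group as outliers.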
Modeling Infant Mortality in West Pasaman Regency With Negative Binomial Regression to Overcome Overdispersion Vinna Sulvia; Fitri Mudia Sari; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/424

Abstract

Infant mortality serves as a vital indicator of public health and an essential benchmark of development progress. Although the general trend shows a decline, several sub-districts in West Pasaman Regency continue to report relatively high infant mortality rates, raising concerns about the effectiveness of current health services. This study seeks to examine the determinants of infant mortality using count data regression models. The data were obtained from the publication West Pasaman Regency in Figures 2025 by Statistics Indonesia (BPS), consisting of one response variable, the number of infant deaths, and five independent variables: the percentage of Low Birth Weight (LBW), the proportion of deliveries assisted by medical personnel, the proportion of pregnant women enrolled in the K4 program, the number of health workers, and the number of health facilities. The initial analysis employed a Poisson regression model, which assumes equidispersion, but the results revealed evidence of overdispersion. To address this issue, negative binomial regression was adopted as an alternative approach. Model evaluation using the Akaike Information Criterion (AIC) and the Likelihood Ratio Test confirmed that the negative binomial regression provided a better fit than Poisson regression. The results indicate that the percentage of LBW and the number of health facilities significantly influence infant mortality. Low birth weight (LBW) had a positive association with infant mortality, consistent with theory, while the positive effect of health facilities differed from expectations, possibly due to issues of quality, distribution, or reverse causality. 
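One common way to check for the overdispersion described in this abstract is the Pearson dispersion statistic: the Pearson chi-square of a fitted Poisson model divided by its residual degrees of freedom, with values well above 1 suggesting overdispersion. The sketch below assumes the fitted means are already available and uses invented counts; the abstract does not state which diagnostic the authors used.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / dof


def negbin_variance(mu, alpha):
    """NB2 variance function mu + alpha * mu^2; alpha = 0 recovers Poisson."""
    return mu + alpha * mu ** 2


# Invented counts with an intercept-only fit (every fitted mean is 4.0).
counts = [0, 1, 2, 3, 14]
print(pearson_dispersion(counts, [4.0] * len(counts), n_params=1))
```

The negative binomial model adds the extra term `alpha * mu^2` to the variance, which is what lets it absorb variability that a Poisson model cannot.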
Penalized Spline Regression Modeling on the Human and Cultural Development Index (IPMK) for 2022 Mila, Sarmilah; Fadhilah Fitri; Musthafa Imran
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/425

Abstract

Human and cultural development is a multidimensional phenomenon whose relationship with socioeconomic factors is often complex and nonlinear, making it challenging to model with conventional parametric approaches. This study aims to model the influence of socioeconomic variables on the Human and Cultural Development Index (IPMK) across 34 provinces in Indonesia in 2022 using the nonparametric Penalized Spline (P-spline) regression method within a Generalized Additive Model (GAM) framework. Secondary data from the Central Statistics Agency (BPS) were used, with predictor variables including School Participation Rate (APS), percentage of access to safe drinking water, Gini Ratio, per capita expenditure, average years of schooling (RLS), and open unemployment rate (TPT). Initial data exploration via scatterplots confirmed nonlinear relationship patterns between the predictor variables and IPMK. The best model was obtained using a first-order cubic spline with 10 knot points, selected based on the minimum Generalized Cross Validation (GCV) criterion. The modeling results demonstrated excellent performance, with an Adjusted R² value of 0.842 and a Deviance Explained of 92.3%. Significance analysis indicated that access to safe drinking water, per capita expenditure, average years of schooling, and the open unemployment rate significantly influence IPMK. Visual interpretation of the significant spline curves revealed informative relationship patterns, such as the diminishing returns effect of per capita expenditure. This study concludes that the P-spline approach is effective and interpretable for modeling complex nonlinear relationships in development data, providing a richer evidence base for policy formulation.
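For readers unfamiliar with the GCV criterion, one common form for Gaussian additive models is n * RSS / (n - edf)^2, where RSS is the residual sum of squares and edf the effective degrees of freedom of the smoother. The sketch below assumes that form; the abstract does not say which variant was used, and the candidate fits listed here are invented (only n = 34 provinces comes from the abstract).

```python
def gcv_score(rss: float, n: int, edf: float) -> float:
    """Generalized Cross Validation score, assuming the n*RSS/(n-edf)^2 form."""
    if edf >= n:
        raise ValueError("effective degrees of freedom must be below n")
    return n * rss / (n - edf) ** 2


def select_by_gcv(candidates, n):
    """Pick the (label, rss, edf) candidate with the smallest GCV score."""
    return min(candidates, key=lambda c: gcv_score(c[1], n, c[2]))


# Hypothetical fits: more knots lower the RSS but spend more degrees of freedom.
fits = [("5 knots", 14.0, 8.0), ("10 knots", 10.0, 14.0), ("15 knots", 9.5, 22.0)]
print(select_by_gcv(fits, n=34))
```

The penalty in the denominator is what stops the criterion from always preferring the most flexible spline.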
Application of K-Means Clustering for Grouping Plantation Production in West Pasaman Regency in 2024 Dini Andita Putri; Fitri Mudia Sari; Chairini Wirdiastuti
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/426

Abstract

The plantation sector plays a strategic role in supporting the economy of West Pasaman Regency, with major commodities including oil palm, coconut, rubber, cocoa, and patchouli. However, disparities in production across subdistricts require further analysis to identify regions with similar characteristics. This study applies the K-Means Clustering method, with the optimal number of clusters determined using the Elbow Method. The results show three clusters: the first with relatively balanced production, the second dominated by rubber and cocoa, and the third represented by Kinali District with high dominance of oil palm, coconut, and patchouli. These findings indicate that K-Means Clustering can effectively map regional plantation potentials and provide a useful basis for formulating targeted development strategies to optimize resource allocation and support sustainable agricultural planning in West Pasaman Regency.
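A minimal sketch of K-Means with the quantity the Elbow Method inspects, the within-cluster sum of squares (WCSS), is shown below. It uses a naive deterministic initialisation (the first k points) and invented data; the article's software, initialisation, and settings are not stated in the abstract.

```python
import math


def kmeans(points, k, max_iter=100):
    """Return (labels, centroids, wcss) using Lloyd's algorithm."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    for _ in range(max_iter):
        members = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            members[nearest].append(p)
        # Recompute each centroid as the mean of its members.
        updated = [
            [sum(col) / len(group) for col in zip(*group)] if group else centroids[c]
            for c, group in enumerate(members)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    wcss = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return labels, centroids, wcss


# Elbow inspection on toy data: look for the k where WCSS stops dropping sharply.
data = [(0, 0), (0, 1), (10, 10), (10, 11)]
for k in (1, 2, 3):
    print(k, round(kmeans(data, k)[2], 3))
```

In practice the variables would usually be standardised first, since commodities measured on larger scales would otherwise dominate the distances.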
Grouping Of Universities In Indonesia In 2025 Based On The QS World University Rankings Ranking Indicator Using The Kohonen Self-Organizing Maps Algorithm Raihan Athaya Wudd; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/427

Abstract

Increasing the competitiveness of higher education is one of the main focuses in facing global competition. One of the important indicators in assessing the quality of higher education institutions is the QS World University Rankings, which assesses universities based on indicators such as academic reputation, citations per lecturer, sustainability, and international collaboration. This study aims to group universities in Indonesia that are included in the QS World University Rankings in 2025 using the Kohonen Self-Organizing Maps (SOM) algorithm. The data used consisted of 10 QS assessment indicators for 26 universities in Indonesia. The data were normalized using the min-max method, and the optimal number of clusters was determined using internal validation indices such as Connectivity, Dunn, and Silhouette. The results of the analysis show that the best model forms three main clusters. Cluster 1 contains universities with superior performance in reputation and research, cluster 2 contains universities with a fairly balanced medium performance, and cluster 3 consists of universities with low performance in key indicators. The results of this study are expected to serve as a basis for policy makers and university managers to develop targeted strategies for improving the quality of higher education.
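Two of the steps mentioned in this abstract, min-max normalization and the Silhouette index, are simple enough to sketch directly. The code below is an illustration on invented values, not the article's implementation; the SOM training itself is not shown.

```python
import math


def min_max(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def silhouette(points, labels):
    """Mean silhouette width over all points (higher means better separation)."""
    scores = []
    for i, (p, own_label) in enumerate(zip(points, labels)):
        distances = {}
        for j, (q, other_label) in enumerate(zip(points, labels)):
            if i != j:
                distances.setdefault(other_label, []).append(math.dist(p, q))
        own = distances.pop(own_label, None)
        if not own or not distances:
            scores.append(0.0)  # singleton cluster, or only one cluster present
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d) for d in distances.values())  # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


print(min_max([10, 20, 30]))
print(round(silhouette([(0,), (1,), (10,), (11,)], [0, 0, 1, 1]), 4))
```

Normalizing first matters because indicators on larger scales would otherwise dominate the distance calculations used both by SOM and by the validation indices.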
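Peramalan Konsentrasi PM2.5 di Kota Medan Menggunakan Metode ARIMAX dengan Faktor Meteorologi sebagai Variabel Eksogen [Forecasting PM2.5 Concentrations in Medan City Using the ARIMAX Method with Meteorological Factors as Exogenous Variables] Fauzan Arrahman; Tessy Octavia Mukhti; Dony Permana; Fenni Kurnia Mutiya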
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/429

Abstract

Particulate Matter 2.5 (PM2.5) is a fine particle measuring less than 2.5 micrometers which is dangerous for human health because it can penetrate the respiratory system and cause cardiovascular disorders. High PM2.5 concentrations reflect a decline in air quality, so forecasting efforts are needed to support pollution control and environmental policies. This study aims to forecast daily PM2.5 concentrations in Medan City using the Autoregressive Integrated Moving Average with Exogenous Variables (ARIMAX) method by considering meteorological factors as exogenous variables. The data used consist of PM2.5 concentrations and average temperature, humidity, rainfall, and wind speed data for the period from June 1, 2024 to June 10, 2025. The analysis results show that the best model is ARIMAX (4,1,0) with exogenous variables of average temperature and rainfall, where temperature has a positive effect and rainfall has a negative effect on PM2.5. This model meets the assumptions of white noise and residual normality, with a MAPE value of 20.635%, indicating a fairly good level of forecasting accuracy. The forecasting results show PM2.5 concentrations in the range of 19–26 µg/m³ with a downward trend at the end of June 2025, indicating improved air quality in Medan City. Thus, the ARIMAX method with meteorological factors is considered effective in modeling and forecasting PM2.5 dynamics in urban areas.
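Two building blocks behind the reported model can be shown without a time-series library: first-order differencing (the d = 1 in ARIMAX (4,1,0)) and the MAPE accuracy measure. The sketch below uses invented numbers and hypothetical function names; it does not fit an ARIMAX model.

```python
def difference(series, lag=1):
    """Differenced series: x[t] - x[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Invented daily values, for illustration only.
print(difference([5, 7, 10]))
print(mape([100, 200], [110, 180]))
```

Because MAPE divides by the actual values, it becomes unstable when observed concentrations are close to zero, which is worth keeping in mind when reading a single MAPE figure.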
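Analisis Pengaruh Penggunaan ChatGPT Terhadap Prestasi Akademik Mahasiswa Dengan Motivasi Sebagai Variabel Intervening Menggunakan Metode SEM-PLS [Analysis of the Effect of ChatGPT Use on Student Academic Achievement with Motivation as an Intervening Variable Using the SEM-PLS Method] Salsabilla Khairani; Yenni Kurniawati; Dony Permana; Fadhilah Fitri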
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/430

Abstract

This study aims to analyze the factors that influence student academic achievement through the use of ChatGPT using the Structural Equation Modeling (SEM) method based on the Partial Least Squares (PLS) approach. Three main factors were identified as elements that can influence the use of ChatGPT, namely knowledge about ChatGPT (PTC), willingness to use the technology (KUMT), and concerns that may arise (KYDT), with learning motivation as an intervening variable. Total sampling was used, so the entire population that met the criteria was designated as respondents. The research population included students in the Statistics Study Program at Padang State University in semesters 4–8 who had used ChatGPT for at least six months, with a total of 216 student respondents. Data were collected through a survey using an online questionnaire. The results show that knowledge about ChatGPT (PTC) and willingness to use the technology (KUMT) have a significant positive effect on learning motivation, while concerns that may arise (KYDT) have no significant effect. Furthermore, only concerns that may arise (KYDT) had a significant direct effect on academic achievement, while the mediation effect test showed that only willingness to use the technology (KUMT) had a significant indirect effect on academic achievement through learning motivation.
Classification of Recipients of the Family Hope Program in West Sumatra Province Using the Random Forest Algorithm Nini Erdiani; Dwi Sulistiowati; Nonong Amalita; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 3 No. 4 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss4/431

Abstract

According to the Central Statistics Agency (BPS), the percentage of poor people in West Sumatra Province increased by 0.02% in 2024. One of the government's efforts to overcome poverty is a social assistance program to help people who are economically disadvantaged. The targeted distribution of social assistance is an important challenge in improving community welfare, especially for families receiving PKH benefits. This study aims to classify households receiving the Family Hope Program (PKH) in West Sumatra Province using a random forest algorithm with Synthetic Minority Oversampling Technique (SMOTE). This study uses data on PKH recipient households in West Sumatra Province in 2024, which has a significant class imbalance. Therefore, the SMOTE method was applied to balance the data. The data were divided into training and testing sets with a ratio of 80%:20%, then parameter tuning was performed to optimize mtry and ntree. The model was evaluated using a confusion matrix. The results show an accuracy of 76%, a precision of 72%, a recall of 84%, and an F1-score of 78%. Based on the Mean Decrease Gini value, the head of household's diploma was the main attribute in determining whether a household received PKH or not. This study concludes that random forest combined with SMOTE performed well and was quite reliable in identifying PKH recipients in West Sumatra Province.
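The evaluation measures quoted in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows those formulas; the counts in the example are invented, and only the 72% precision and 84% recall figures come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Invented counts, for illustration only.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
# F1 implied by the reported precision and recall.
print(round(f1_score(0.72, 0.84), 2))
```

Plugging the reported precision and recall into the F1 formula gives about 0.78, which agrees with the F1-score stated in the abstract.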