Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang,
Sumatera Barat
INDONESIA
UNP Journal of Statistics and Data Science
ISSN : -     EISSN : 2985475X     DOI : 10.24036/ujsds
UNP Journal of Statistics and Data Science is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can be in the form of research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians from across universities.
Articles 202 Documents
Metode Density Based Spatial Clustering of Applications with Noise (DBSCAN) dalam Mengelompokkan Provinsi di Indonesia Berdasarkan Kasus Kriminalitas Tahun 2022 Miftahurrahmi, Syifa; Zilrahmi; Amalita, Nonong; Mukhti, Tessy Octavia
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/203

Abstract

According to 2023 data from the Central Statistics Agency, the number of crime cases in Indonesia rose significantly in 2022 compared with 2021, from 239,481 cases to 372,965 cases. The increase occurred as community activities began to resume after the Covid-19 pandemic. The types of crime that occur in Indonesia vary and include murder, theft, drug-related crimes, and others. This research clusters the provinces of Indonesia based on the number of cases of selected types of crime in 2022 using the Density Based Spatial Clustering of Applications with Noise (DBSCAN) method. The results are expected to help the government and the police in their efforts to deal with crime in Indonesia. Clustering with DBSCAN produces 2 clusters with a silhouette coefficient of 0.68: cluster 0, the noise category, consists of 5 provinces with a high number of crime cases, while cluster 1 consists of 29 provinces with a low number of crime cases.
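As an illustration of how DBSCAN separates dense groups from noise, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state the software or parameters used, so `eps`, `min_pts`, and the toy points are hypothetical. Noise is labelled 0 to match the abstract's "cluster 0" convention.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0 = noise, 1..k = cluster id."""
    n = len(points)
    labels = [None] * n  # None means "not yet visited"

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = 0  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == 0:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nb)
    return labels

# Hypothetical toy data: four nearby points and one isolated point.
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
toy_labels = dbscan(toy, eps=1.5, min_pts=3)
```

In this toy run the four nearby points form one cluster and the isolated point is left as noise, which mirrors how a few provinces with unusually high counts can end up in the noise category.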
Application of Multivariate Adaptive Regression Splines for Modeling Stunting Toddler on The Island of Java Rahma, Dzakyyah; Nonong Amalita; Yenni Kurniawati; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/205

Abstract

Stunting is a chronic nutritional problem in toddlers, characterized by a shorter body height than other children of the same age. The aim of this research is to model and determine the factors that influence stunting on the island of Java using Multivariate Adaptive Regression Splines (MARS). MARS is a modeling method that can handle high-dimensional data. The results show that the best MARS model is the combination BF=24, MI=3, and MO=2, with a minimum GCV value of 0.9475. Based on the model, the factors that significantly influence stunting on the island of Java are babies receiving complete basic immunization (X4), babies receiving exclusive breastfeeding (X3), pregnant women receiving K4 (X1), and pregnant women receiving TTD (X2). The respective importance levels of these variables are 100%, 81.64%, 60.38%, and 43.90%. Based on these results, babies receiving complete basic immunization is the variable that most influences stunting on the island of Java in 2021.
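For reference, the generalized cross-validation (GCV) criterion commonly used to rank MARS models, following Friedman's formulation, can be written as below. The symbols are notation assumed here rather than taken from the abstract: N is the number of observations, \hat{f}_M is the fitted model with M basis functions, and C(M) is its effective number of parameters.

```latex
\mathrm{GCV}(M) \;=\;
\frac{\dfrac{1}{N}\sum_{i=1}^{N}\bigl(y_i - \hat{f}_M(x_i)\bigr)^{2}}
     {\bigl(1 - C(M)/N\bigr)^{2}}
```

The numerator is the ordinary mean squared residual, and the denominator penalizes model complexity, so the model with the smallest GCV balances fit against the number of basis functions.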
Comparison of Linear Discriminant Analysis with Robust Linear Discriminant Analysis Fitri, Fitri Hayati; Dodi Vionanda; Yenni Kurniawati; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/206

Abstract

Discriminant analysis is a multivariate method for dividing objects into discrete groups and assigning new objects to existing categories. Its result is a discriminant function, a linear combination of independent variables used to classify objects into two or more groups. In linear discriminant analysis, the independent variables must follow a multivariate normal distribution, and the covariance matrices of the groups must be equal. It is also essential to identify outliers, because their presence in the data set can violate the assumptions of the method and lead to incorrect classification results. Handling outliers with robust approaches is therefore required. One such robust method is the Minimum Covariance Determinant (MCD), which is highly effective in dealing with outliers and relatively easy to apply compared with other robust methods. The aim of this study is to compare the classification results of linear discriminant analysis with those of robust linear discriminant analysis on a dataset of diabetes patients at RSUD Padangsidimpuan in 2023. On this dataset, linear discriminant analysis achieved an accuracy of 85.71%, while robust linear discriminant analysis achieved an accuracy of 80.95%. These findings suggest that linear discriminant analysis and robust linear discriminant analysis can yield different results depending on the characteristics of the data and the number of outliers in the dataset.
Mixed Geographically Weighted Regression Modeling of Gender Development Index in Indonesia Nikma Hasanah; Dodi Vionanda; Syafriandi Syafriandi; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/207

Abstract

The Gender Development Index (GDI) is one of the primary measures of gender equality in the field of human development. Indonesia's GDI statistics for 2023 show a development gap between men and women. One approach to closing the gap is to identify the factors influencing GDI using Mixed Geographically Weighted Regression (MGWR), a blend of global regression and Geographically Weighted Regression (GWR) models. The results showed that the MGWR model outperformed the GWR model when the two were compared using the Akaike Information Criterion (AIC). With Adaptive Kernel Bisquare weights, the MGWR model identified population with health complaints and adjusted per capita expenditure as globally influential factors, while female participation in parliament, open unemployment rate, and labor force participation rate were identified as locally influential factors.
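The adaptive bisquare weighting mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name is hypothetical, and the bandwidth is taken to be the distance to the k-th nearest observation, which is one common definition of an adaptive kernel.

```python
def adaptive_bisquare_weights(distances, k):
    """Bisquare weights for one regression point.

    distances: distances from the regression point to every observation.
    k: number of nearest neighbours that defines the local bandwidth h
       (an assumed convention; definitions of "adaptive" vary by software).
    """
    h = sorted(distances)[k - 1]  # bandwidth adapts to local data density
    # Weight decays smoothly from 1 at d = 0 to 0 at d >= h.
    return [(1 - (d / h) ** 2) ** 2 if d < h else 0.0 for d in distances]

# Hypothetical distances from one province centroid to four observations.
weights = adaptive_bisquare_weights([0.0, 1.0, 2.0, 4.0], k=3)
```

Observations at or beyond the bandwidth receive zero weight, so each local regression is fitted only on nearby locations.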
Implementation of the Fuzzy C-Means Clustering Method in Grouping Provinces in Indonesia based on the Types of Goods Sold in E-commerce Businesses in 2022 Bimbim Oktaviandi; Tessy Octavia Mukhti; Yenni Kurniawati; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/210

Abstract

The internet facilitates e-commerce by enabling efficient transactions and building consumer trust. With internet users in Indonesia reaching 204 million in 2022, it is crucial to cluster provinces based on the types of goods and services sold online in order to design effective marketing strategies. The Fuzzy C-Means (FCM) method is used for the cluster analysis because it allows objects to have different membership degrees in multiple clusters and provides accurate cluster center placement. This study applies Fuzzy C-Means to cluster 34 provinces in Indonesia based on the sale of goods and services in e-commerce in 2022, aiming to provide insights into market preferences and assist companies in developing more effective strategies. The results show that the method forms two clusters. Judged by standard deviation values and ratios, Fuzzy C-Means proves effective in clustering provinces in Indonesia based on e-commerce sales data. Cluster validation gives a standard deviation ratio of 0.14, indicating clear and significant cluster separation.
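To make the idea of membership degrees concrete, the following sketch computes the standard Fuzzy C-Means membership of a single point given fixed cluster centers. The function name, the fuzzifier value m = 2, and the toy coordinates are assumptions for illustration; the abstract does not report the settings used in the study.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Membership degree of `point` in each cluster (degrees sum to 1)."""
    d = [math.dist(point, c) for c in centers]
    zero = [x == 0.0 for x in d]
    if any(zero):
        # The point coincides with one or more centers: share membership among them.
        share = 1.0 / sum(zero)
        return [share if z else 0.0 for z in zero]
    p = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
    return [1.0 / sum((di / dj) ** p for dj in d) for di in d]

# Hypothetical one-dimensional-style example with two centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

The point lies twice as far from the second center as from the first, so with m = 2 it belongs mostly, but not exclusively, to the first cluster.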
Comparison Of Extreme Learning Machine And Holt Winter’s Exponential Smoothing Methods In Railway Passenger Forecasting Azma, Meil Sri Dian; Dony Permana; Fadhilah Fitri; Atus Amadi Putra
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/211

Abstract

Forecasting the number of passengers on the Pariaman Express train has the potential to help PT KAI maximize passenger service facilities and comfort. The number of train passengers in Indonesia is expected to keep increasing along with the country's growing population. The high interest of users in this mode of transportation can be seen from historical data, which increases every year. PT KAI (Persero), as the sole rail transportation provider, needs several strategies for meeting passenger needs every day. This study forecasts the number of passengers on the Pariaman Express train using the Holt-Winters exponential smoothing method and one of the artificial neural network methods, namely the extreme learning machine (ELM). The purpose of the study is to compare the accuracy of the forecasts produced by the two methods and to find out which method is better suited to this forecast. The data used are the numbers of Pariaman Express train passengers from 2021 to 2023. The results show that both the Holt-Winters and ELM methods have error values above 10%, meaning that both methods are good at forecasting for 4 periods. Holt-Winters has a MAPE value of 17.10% and ELM has a MAPE value of 20%.
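The MAPE values quoted above follow the usual definition of mean absolute percentage error, which can be sketched as below. The passenger numbers in the example are made up for illustration and are not taken from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical actual and forecast counts for 4 periods.
example_mape = mape([100, 200, 400, 500], [110, 180, 400, 450])
```

A lower MAPE means the forecasts are closer to the observed values on average, which is the basis for comparing the two methods.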
Evaluasi Faktor-Faktor Yang Memengaruhi Indeks Pembangunan Manusia Tahun 2023 Menggunakan Metode SEM-PLS Putri, Sindy Amelia; Zilrahmi; Permana, Dony; Fitria, Dina
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/214

Abstract

The Human Development Index (HDI) is a measure of the success of development in a country. In 2022 Indonesia, as a developing country, ranked 112th out of 193 countries by HDI value. This indicates an urgent need to evaluate how to increase Indonesia's HDI value and thereby improve the quality of human development. The evaluation can be done using the Structural Equation Modeling-Partial Least Square (SEM-PLS) analysis method. With 34 Indonesian provinces as observations, three dimensions are analyzed as variables in this paper, namely economy, education, and health. Each variable is analyzed through its indicator variables. The results show that in the economic variable, the influential indicators are the Open Unemployment Rate, GRDP per Capita at Constant Prices, and Average Hourly Wage of Workers. In the education variable, the influential indicators are the School Participation Rate Age 7-12, the School Participation Rate Age 13-15, the Pure Enrollment Rate for Elementary/Middle School/Package A, the Pure Enrollment Rate for Junior High School/MTs/Package B, and the Pure Enrollment Rate for Senior High School/SMK/MA/Package C. In the health variable, the indicators that affect the value of HDI in Indonesia in 2023 are the Percentage of Households by Province and Source of Adequate Drinking Water and the Percentage of Ever-Married Women Aged 15-49 Years whose Last Childbirth Took Place in a Health Facility.
Pemodelan Tingkat Partisipasi Angkatan Kerja Terhadap Persentase Penduduk Miskin di Jawa Timur Tahun 2023 Menggunakan Metode B-Spline Ibnul farizi, Gilang; Zilrahmi; Dony Permana; Admi Salma
UNP Journal of Statistics and Data Science Vol. 2 No. 4 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss4/215

Abstract

Poverty is a common issue in Indonesia. Data on the Percentage of Poor Population and the Labor Force Participation Rate (LFPR) for the 38 districts/cities of East Java Province in 2023 indicate that the highest percentage of poverty in the province was 21.760. Employment is considered the most effective solution to alleviate poverty. The data in this study do not follow any specific pattern, making them difficult to analyze using parametric methods. The appropriate approach is therefore nonparametric regression, and the model used in this study is B-Spline regression. The suitability of the model is assessed by its Mean Squared Error (MSE). The analysis results indicate that the B-Spline regression model achieves an MSE value of 20.11447, the optimal value, obtained from B-Spline estimation with order 2. This suggests that the B-Spline method provides a good explanation in addressing the issue.
Estimation of Poverty in North Sumatera in 2022 using Truncated and Penalized Spline Regression Kurnia Andrea Diva; Fadhilah Fitri; Dony Permana; Admi Salma
UNP Journal of Statistics and Data Science Vol. 2 No. 4 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss4/217

Abstract

The main goal of the Sustainable Development Goals (SDGs) is to reduce poverty. Low human capital is a cause of poverty, and the Human Development Index (HDI) is one indicator that can be used to assess human capital. Despite having the largest population on the island of Sumatra, North Sumatra continues to have the fifth highest poverty rate. Because previous research gives inconsistent results, the pattern of the relationship between poverty and HDI is still unclear. Nonparametric regression modeling was therefore used in this study, because it is flexible in following the pattern of the data and can avoid model specification errors. This study aims to compare the Truncated Spline and Penalized Spline (P-Spline) regression methods. Comparing the two models by their smallest MSE values showed that the better estimator for modeling the Human Development Index in North Sumatera in 2022 is nonparametric regression using the truncated spline estimator, where the best truncated spline model is of order 2 with one knot point located at X = 66.93 and a GCV value of 6.0543.
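To show what a truncated spline with a single knot looks like, the sketch below builds one row of a truncated power basis. It assumes that "order 2" means polynomial degree 2, which is a common but not universal convention (some texts define order as degree plus one); the function name and the evaluation points are hypothetical.

```python
def truncated_power_row(x, degree, knots):
    """One design-matrix row: 1, x, ..., x^degree, then (x - knot)_+^degree per knot."""
    row = [x ** p for p in range(degree + 1)]
    # The truncated term is zero to the left of the knot and polynomial to the right.
    row += [max(0.0, x - k) ** degree for k in knots]
    return row

# Rows on either side of the knot reported in the abstract (X = 66.93).
left = truncated_power_row(60.0, 2, [66.93])
right = truncated_power_row(70.0, 2, [66.93])
```

Fitting ordinary least squares on such rows lets the curve change shape at the knot while remaining continuous, which is what gives the method its flexibility.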
Optimization of Sentiment Analysis for MBKM Program using Naïve Bayes with Particle Swarm Optimization Diva Aliyah; Zilrahmi; Yenni Kurniawati; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 4 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss4/220

Abstract

In early 2020, Kemendikbudristek launched the MBKM program with the aim of improving the quality of higher education through a student-focused learning approach. The launch of this program triggered various reactions on social media, especially on Twitter, both positive and negative. This study aims to analyze the sentiment of Twitter users towards the MBKM program using the Naive Bayes algorithm optimized with Particle Swarm Optimization (PSO). The data used are Indonesian tweets containing the keywords "MBKM" and "Merdeka Campus" from the period July to December 2022. The research stages include data collection through crawling, manual labeling of data into positive and negative sentiments, data preprocessing, application of the Naive Bayes algorithm, and feature selection with PSO. On tweets labeled with positive and negative sentiment towards the implementation of the MBKM program in Indonesia in 2022, the NB-PSO experiment achieved an accuracy of 90.87%, an increase of 7.12% compared to the Naive Bayes algorithm alone. Thus, using the Particle Swarm Optimization algorithm with the Naive Bayes classification algorithm improved classification performance in this sentiment analysis case. Keywords: Sentiment Analysis, Merdeka Belajar Kampus Merdeka, Twitter, Naive Bayes, Particle Swarm Optimization.