Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang,
Sumatera Barat
INDONESIA
UNP Journal of Statistics and Data Science
ISSN: - | EISSN: 2985-475X | DOI: 10.24036/ujsds
UNP Journal of Statistics and Data Science (UJSDS) is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can take the form of research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians from across universities.
Articles 202 Documents
Comparison of Distance Function in K-Nearest Neighbor Algorithm to Predict Prospective Customers in Term Deposit Subscriptions Muhammad Tibri Syofyan; Nonong Amalita; Dodi Vionanda; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/47

Abstract

Data mining is often used to analyze big data in order to obtain new, useful information for future use. One of the best algorithms in data mining is K-Nearest Neighbor (K-NN). The K-NN classifier is a distance-based classification algorithm. The distance function is a core component in measuring the distance or similarity between the test data and the training data. Because many distance functions exist, determining which one gives the best K-NN performance is a recurring problem in the literature. This study compares distance functions to find the one that produces the best K-NN performance. The distance functions compared are the Manhattan distance and the Minkowski distance. The K-NN classifier is applied to a bank dataset to predict prospective customers for term deposit subscriptions. The study shows that the Minkowski distance achieved better results than the Manhattan distance. The Minkowski distance with power p = 1.5 produces an accuracy of 88.40% when K is 7. Thus, the K-NN algorithm with the Minkowski distance (p = 1.5, K = 7) performs best in predicting prospective customers for term deposit subscriptions.
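As a rough illustration of the distance functions this abstract compares, the sketch below implements the Minkowski distance (which reduces to the Manhattan distance at p = 1) and a majority-vote K-NN classifier. It is not the authors' code: the function names, the toy points, and the labels are hypothetical, and the bank dataset is not reproduced here.

```python
from collections import Counter

def minkowski(a, b, p):
    """Minkowski distance of order p; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train_X, train_y, query, k, p):
    """Classify `query` by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)),
                   key=lambda i: minkowski(train_X[i], query, p))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data (hypothetical, not the bank dataset used in the study).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
y = ["no", "no", "no", "yes", "yes", "yes"]

manhattan = minkowski((0, 0), (3, 4), p=1)   # |3| + |4|
euclidean = minkowski((0, 0), (3, 4), p=2)   # sqrt(9 + 16)
label = knn_predict(X, y, (4.5, 5.0), k=3, p=1.5)
```

Changing `p` changes only how coordinate differences are combined, which is why a study like this can vary it independently of K.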
Rainfall Forcasting in Medan City Using Singular Spectrum Analysis (SSA) Silvia Agustina; Fadhilah Fitri; Dodi Vionanda; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/52

Abstract

Singular spectrum analysis (SSA) is a time series analysis method that can be used for data with seasonal effects. Rainfall is one example of such data. High rainfall contributes to natural disasters such as floods. Medan, the capital of North Sumatra province, has fairly high rainfall and lies in a lowland area, so it is prone to flooding. Rainfall forecasting can support disaster mitigation. The forecasting method used is SSA. The forecast has a MAPE of 15.5% and the tracking signal is within tolerance limits, so it can be concluded that the forecast performs well.
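The two accuracy measures named in this abstract can be sketched as follows. This assumes the common definitions (MAPE as the mean of absolute percentage errors, tracking signal as the cumulative forecast error divided by the mean absolute deviation); the abstract does not state which variant or tolerance limits the authors used, and the numbers below are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n

def tracking_signal(actual, forecast):
    """Cumulative forecast error divided by the mean absolute deviation (MAD).

    Undefined when every forecast is exact, because MAD is then zero.
    """
    errors = [a - f for a, f in zip(actual, forecast)]
    mad = sum(abs(e) for e in errors) / len(errors)
    return sum(errors) / mad

# Hypothetical values, not the Medan rainfall data.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

example_mape = mape(actual, forecast)            # mean of 10%, 10%, 0%
example_ts = tracking_signal(actual, forecast)   # (-10 + 20 + 0) / 10
```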
Classification for Covid-19 Affected Family Cash Aid Recipients Using Naïve Bayes Algorithm Mutiara Amazona Sosiawati; Syafriandi Syafriandi; Dony Permana; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/53

Abstract

The COVID-19 pandemic in Indonesia had a huge impact on the country's economy. One of the government's responses was to use APBD funds for social assistance in the form of cash, namely "Village Direct Cash Assistance" (BLT DD), in the hope that it would help people affected by COVID-19. There are several problems in the distribution of social assistance, one of which is aid that goes to recipients who are not the intended targets. Therefore, a method is needed to classify recipients correctly. This study uses the Naïve Bayes method to classify people who receive and do not receive aid. According to the confusion matrix, 33 people/households (KK) received BLT DD and were predicted to receive it, 34 did not receive BLT DD and were predicted not to receive it, 2 received BLT DD but were predicted not to receive it, and 6 did not receive BLT DD but were predicted to receive it. The classification accuracy obtained using the Naïve Bayes method is 89%, and the error rate is 11%.
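The reported accuracy and error rate follow directly from the four confusion-matrix counts in this abstract. The short check below treats "received BLT DD" as the positive class, which is a labeling assumption made here for illustration.

```python
# Counts taken from the abstract; "received BLT DD" is treated as positive.
tp = 33  # received and predicted to receive
tn = 34  # did not receive and predicted not to receive
fn = 2   # received but predicted not to receive
fp = 6   # did not receive but predicted to receive

total = tp + tn + fn + fp
accuracy = (tp + tn) / total      # correct predictions / all predictions
error_rate = (fp + fn) / total    # wrong predictions / all predictions

accuracy_pct = round(accuracy * 100)
error_pct = round(error_rate * 100)
```

With 75 cases in total, 67 correct predictions round to the 89% accuracy and 8 wrong ones to the 11% error rate stated above.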
Modeling Human Development Index in Papua and West Sumatera with Multivariate Adaptive Regression Spline Yulia Pertiwi; Dony Permana; Nonong Amalita; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/54

Abstract

The Human Development Index (HDI) is an indicator of successful development in the quality of human life. A higher HDI value indicates better development of a region. The purpose of this study is to model and determine the factors that affect HDI in Papua Province and West Sumatera Province using Multivariate Adaptive Regression Spline (MARS). MARS is a modeling method that can handle high-dimensional data. The results show that the best MARS model for Papua Province is the combination BF=24, MI=2, and MO=0, with a minimum GCV value of 0.55953, while the best MARS model for West Sumatera Province is the combination BF=24, MI=2, and MO=0, with a minimum GCV value of 0.02697. Based on the model, the factors that significantly affect HDI in both provinces are average years of schooling (X2), adjusted per-capita income (X6), life expectancy (X1), percentage of poor people (X4), and gross regional domestic product (X3). The importance levels of these variables for Papua Province are 100%, 45.26%, 29.24%, 6.55%, and 6.27%, respectively; for West Sumatera Province they are 100%, 96.73%, 57.54%, 34.13%, and 29.6%. Thus, average years of schooling (X2) is the variable that most influences HDI in the two regions, with an importance level of 100%.
Comparison Fuzzy Time Series Cheng and Ruey Chyn Tsaur Model for Forecasting Sales at Empat Saudara Store Muhammad Alif Yustin; Zilrahmi; Atus Amadi Putra; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/56

Abstract

A trading business focuses on buying goods and reselling them for a profit without changing the condition of the goods sold. A recurring problem at the Empat Saudara Store is excess or insufficient stock: goods run short when consumer demand is high and remain available when demand is low. One way to address this is to forecast future sales so that sales can be kept stable. Forecasting aims to estimate what will happen in the future using historical data. The method used is Fuzzy Time Series (FTS), which captures patterns in past data and uses them to project future data based on linguistic values. The FTS models used are FTS Cheng and FTS Ruey Chyn Tsaur. The five-period forecasts for FTS Cheng are 200,668.2, 171,761.5, 222,412.6, 214,507.4, and 216,294.3, and for the FTS Ruey Chyn Tsaur model are 198,600, 229,094.2, 202,203.05, 230,804.80 ,6. The MAPE of the FTS Cheng model is 9.904% and that of the FTS Ruey Chyn Tsaur model is 14.01%. From these results it can be concluded that the FTS Cheng model is better than the FTS Ruey Chyn Tsaur model at predicting sales at the Empat Saudara Store.
Application of singular spectrum analysis method to forecast rice production in west sumatra: Artikel nazifatul azizah Nazifatul Azizah; Fadhilah Fitri; Dodi Vionanda; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/58

Abstract

An imbalance between population and rice production causes negative impacts such as food crises and increasing poverty, so forecasting is needed to maintain future food availability. This study aims to forecast rice production in West Sumatra Province for 12 periods in 2023 using the SSA method. Based on the analysis, rice production over the 12 periods of 2023 tends to decrease compared to the previous year. The SSA forecast with window length L=21 has a MAPE of 17.69%, which the study considers accurate.
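To show what the parameter L controls in SSA, the sketch below builds the L-row trajectory (Hankel) matrix from a series and maps a matrix back to a series by diagonal averaging. It deliberately omits the singular value decomposition and component grouping that full SSA performs between these two steps, and it uses a short hypothetical series with L=3 rather than the study's data and L=21.

```python
def trajectory_matrix(series, L):
    """Embed a series of length N into an L x K Hankel matrix, K = N - L + 1."""
    K = len(series) - L + 1
    return [[series[i + j] for j in range(K)] for i in range(L)]

def diagonal_average(matrix):
    """Average each anti-diagonal (i + j constant) to turn a matrix into a series."""
    L, K = len(matrix), len(matrix[0])
    sums = [0.0] * (L + K - 1)
    counts = [0] * (L + K - 1)
    for i in range(L):
        for j in range(K):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

# Hypothetical series of length N = 8 (not the rice production data).
series = [float(v) for v in range(1, 9)]
X = trajectory_matrix(series, L=3)   # 3 rows, 6 columns
recovered = diagonal_average(X)      # undoes the embedding exactly
```

Because no components are dropped here, diagonal averaging returns the original series; in a real SSA run it would be applied to the matrix rebuilt from the selected components.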
Analysis of Factors Influencing the Population Growth Rate in West Sumatra Using Geographically Weighted Logistic Regression Rizqia Salsabila; Atus Amadi Putra; Nonong Amalita; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/59

Abstract

Geographically Weighted Logistic Regression (GWLR) is a development of the logistic regression model that is applied to spatial data. GWLR parameters are estimated at each observation location using spatial weighting. The purpose of this research is to build a GWLR model for the dichotomous Population Growth Rate (PGR) indicator in each of the 19 districts/cities in West Sumatra in 2020 and to identify the factors that influence the probability that the population growth rate will increase. The parameters of the GWLR model are estimated using the Maximum Likelihood Estimation (MLE) method. The spatial weights are determined using the Fixed Gaussian Kernel weighting function, and the optimal bandwidth is chosen using Akaike's Information Criterion (AIC). The categorical response variable is the population growth rate in each district/city in West Sumatra in 2020, and the predictor variables are the number of couples of childbearing age, the number of live births, the number of in-migrations, and the number of out-migrations. The results show that the GWLR model is better than the logistic regression model and that 4 groups of districts/cities are formed based on the factors that affect the increase in population growth rate.
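The spatial weighting step can be sketched with one common form of the fixed Gaussian kernel, w_ij = exp(-0.5 (d_ij / h)^2), where d_ij is the distance between locations i and j and h is a single bandwidth shared by all locations. The exact kernel form, the coordinates, and the bandwidth below are assumptions for illustration; the abstract does not give them.

```python
import math

def gaussian_kernel_weights(coords, i, bandwidth):
    """Weights of every location for the local model fitted at location i."""
    xi, yi = coords[i]
    weights = []
    for xj, yj in coords:
        d = math.hypot(xi - xj, yi - yj)                  # Euclidean distance
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights

# Hypothetical planar coordinates, not the West Sumatra districts/cities.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
w = gaussian_kernel_weights(coords, i=0, bandwidth=2.0)
```

The location being fitted always gets weight 1, and weights shrink smoothly with distance, so nearby observations dominate each local estimate.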
Grouping Level of Poverty Based on District/City in Indonesia Using K-Harmonic Means nabillah putri; Nonong Amalita; Dodi Vionanda; Dony Permana
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/60

Abstract

Indonesia still has a relatively high poverty rate, although nationally it has declined in recent years, and some areas are still experiencing increasing poverty rates. Poverty alleviation plans therefore can no longer be uniform; they need to take into account the dimensions that cause poverty in each area, so districts/cities in Indonesia need to be grouped by poverty. Grouping was performed using K-Harmonic Means analysis. K-Harmonic Means is a non-hierarchical clustering method that takes the harmonic average of the distances between each data point and the cluster centers. The data are secondary data from BPS publications on poverty and inequality in 2022. The analysis consists of standardizing the data, conducting cluster analysis, and validating the clusters. Based on the K-Harmonic Means analysis, the optimal number of clusters is two: the first cluster has 54 districts/cities and the second has 460, and the Dunn Index value for cluster validation is 0.03492. A better grouping of poverty levels by district/city in Indonesia is thus obtained using the K-Harmonic Means method with p = 2.25.
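The quantity that K-Harmonic Means minimizes can be sketched as below: for each point, the harmonic average of its distances (raised to the power p) to all cluster centers, summed over points. This shows only the objective, not the iterative center updates, and the points and centers are hypothetical; only the value p = 2.25 comes from the abstract.

```python
def khm_objective(points, centers, p):
    """Sum over points of the harmonic average of d**p to all centers."""
    k = len(centers)
    total = 0.0
    for x in points:
        inv_sum = 0.0
        for c in centers:
            d = sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5
            d = max(d, 1e-12)          # guard against division by zero
            inv_sum += 1.0 / d ** p
        total += k / inv_sum
    return total

# Hypothetical data: two well-separated points.
points = [(0.0, 0.0), (10.0, 0.0)]
near_centers = [(0.0, 1.0), (10.0, 1.0)]   # one center close to each point
far_centers = [(5.0, 1.0), (5.0, 2.0)]     # both centers far from both points

good = khm_objective(points, near_centers, p=2.25)
bad = khm_objective(points, far_centers, p=2.25)
```

Because the harmonic average is dominated by the smallest distance, a point contributes little to the objective as soon as one center is close to it, so `good` comes out smaller than `bad`.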
Grouping The Regencies/Cities in Indonesia Based on Expenditure Groups Inflation Value Using DBSCAN Method Meliani Putri; Dony Permana; Syafriandi Syafriandi; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/61

Abstract

The different characteristics of each regency/city in Indonesia can lead to differences in expenditure group inflation values, and these differences affect Indonesia's national inflation. The purpose of this research is to group regencies/cities based on expenditure group inflation values and to identify the characteristics of the resulting groups. DBSCAN is a density-based, non-hierarchical clustering method that can be used on data containing noise. The data are secondary data obtained from the publication of the Badan Pusat Statistik Republic of Indonesia (BPS RI) on expenditure group inflation values. The analysis includes outlier detection, grouping using the DBSCAN method, cluster validation with the silhouette coefficient, and identification of the characteristics of the clusters formed. The grouping produces two clusters with a silhouette coefficient of 0.65. Cluster 0 is a noise cluster of 3 regencies/cities with high expenditure group inflation values. Cluster 1, consisting of 87 regencies/cities, has low expenditure group inflation values.
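A minimal DBSCAN can be sketched in plain Python to show how a dense cluster and a separate noise group arise. This is an illustrative implementation, not the authors' code: the label 0 for noise mirrors the abstract's "cluster 0", and the points, `eps`, and `min_pts` values are hypothetical.

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point: 0 for noise, 1..n for clusters."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) ** 0.5 <= eps]

    labels = [None] * len(points)   # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = 0           # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == 0:
                labels[j] = cluster  # noise reached from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)   # j is a core point, so keep expanding
    return labels

# Hypothetical data: four close points and one isolated point.
pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
labels = dbscan(pts, eps=1.5, min_pts=3)
```

The isolated point has too few neighbors within `eps` and is never reached from a core point, so it stays in the noise group while the four close points form cluster 1.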
Geographically Weighted Panel Regression Modeling on Human Development Index in West Sumatra Amelia Fadila Rahman; Syafriandi Syafriandi; Nonong Amalita; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/63

Abstract

The Human Development Index (HDI) is an important issue that affects human development and people's welfare in West Sumatra Province. Efforts to address it begin with identifying the contributing factors. Geographically Weighted Panel Regression (GWPR) is a technique that can be used to find influencing factors and to explain the influence of the characteristics of each observation area. GWPR combines panel data regression with GWR and is used when the data exhibit spatial heterogeneity. The purpose of this study is to build a GWPR model for the HDI in the regencies/cities of West Sumatera from 2019 to 2022. The modeling uses the GWPR Fixed Effect Model. The weighting function used is a fixed exponential kernel, with a minimum CV of 0.000208. The findings show that the model obtained has an R² of 99.9%, meaning that the predictor variables account for this percentage of the variation in HDI. The variables that have a significant effect on HDI are Life Expectancy, Expected Years of Schooling, Mean Years of Schooling, and Purchasing Power Parity.
