Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang, Sumatera Barat, Indonesia
UNP Journal of Statistics and Data Science
ISSN: - | EISSN: 2985-475X | DOI: 10.24036/ujsds
UNP Journal of Statistics and Data Science (UJSDS) is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can take the form of research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians from across universities.
Articles in Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science (8 documents)
Comparison of Forecasting Using Fuzzy Time Series Chen Model and Lee Model to Closing Price of Composite Stock Price Index
Mohammad Reza Febrino; Dony Permana; Syafriandi; Nonong Amalita
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/22

Abstract

Investment is the activity of committing funds in the hope of obtaining benefits from the investment in the future. In investing, analysis is important for assessing the current situation and condition of a stock. Investors can forecast stock prices by looking at trends in past stock price movements. Fuzzy Time Series (FTS) was used in this study for forecasting. FTS is a forecasting technique that uses patterns from past data to project future data in settings where the data are expressed as linguistic values. This study compares forecasts of the closing price of the Composite Stock Price Index (JCI) produced by the Chen and Lee fuzzy time series models. According to the Chen fuzzy time series method, the JCI closing price for the following period is 6,904, with a Mean Absolute Percentage Error (MAPE) of 4.03%. In contrast, using Lee's fuzzy time series method, the predicted JCI closing price for the following period is 7,046, with a MAPE of 3.10%. It can be concluded from these forecasting results that the Lee FTS model is superior to the Chen FTS model in predicting the JCI closing price.
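The comparison above rests on MAPE, so a minimal sketch of how that metric is typically computed may help. The function name `mape` and the sample values are hypothetical and are not taken from the study.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of |actual - forecast| / |actual| over all periods.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical closing prices and forecasts (not data from the study).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(round(mape(actual, forecast), 2))
```

A lower MAPE indicates forecasts that are closer to the observed values, which is the basis on which the abstract ranks the Lee model above the Chen model.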
Multivariate Adaptive Regression Spline Method for Study Timeliness of the 2017 FMIPA UNP Student
Rahmadani Iswat; Fadhilah Fitri; Atus Amadi Putra; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/23

Abstract

Punctuality of study refers to the time taken to complete an education, which for undergraduate students is 4 years. One determinant of the quality of higher education is students' ability to complete their education on time. The purpose of this study is to find the best model, and its accuracy, for the punctuality of study of class of 2017 FMIPA UNP undergraduate students using Multivariate Adaptive Regression Splines (MARS). MARS is a multivariate nonparametric regression method relating response variables to predictor variables. This is applied research. The predictor variables used in this study are Grade Point Average (GPA), gender, university entrance, major, school origin status, and place of origin, while the response variable is punctuality of study. Trial and error showed that the best model was obtained from the combination BF = 18, MI = 3, and MO = 2, with a minimum GCV value of 0.23182 and an R² value of 0.10045. The model shows that the factors that significantly affect punctuality of study for class of 2017 FMIPA UNP students are X4 (major) with an importance level of 100%, X1 (GPA) with an importance level of 96.61%, X3 (university entrance), and X5 (school origin status) with an importance level of 16.78%. The classification accuracy for the study timeliness of 2017 students, in terms of graduating on time or not on time, is 64%, with a classification error rate of 36%.
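MARS builds its model from hinge (truncated linear) basis functions placed at knots. The sketch below illustrates that idea for a single predictor only; it is not the fitted model from the study, and the names `hinge_pair` and `mars_predict`, as well as the coefficients and knot, are hypothetical. In the MARS literature, BF, MI, and MO usually denote the maximum number of basis functions, the maximum interaction order, and the minimum number of observations between knots.

```python
def hinge_pair(x, knot):
    """Return the mirrored hinge functions max(0, x - knot) and max(0, knot - x)."""
    return max(0.0, x - knot), max(0.0, knot - x)


def mars_predict(x, intercept, terms):
    """Evaluate an additive, single-predictor MARS-style model.

    terms: list of (coefficient, knot, direction), where direction is
    +1 for max(0, x - knot) and -1 for max(0, knot - x).
    """
    total = intercept
    for coef, knot, direction in terms:
        total += coef * max(0.0, direction * (x - knot))
    return total


# Hypothetical model: 1 + 2*max(0, x - 3) - 0.5*max(0, 3 - x)
terms = [(2.0, 3.0, +1), (-0.5, 3.0, -1)]
print(mars_predict(5.0, 1.0, terms), mars_predict(1.0, 1.0, terms))
```

A full MARS fit also multiplies hinge functions across predictors (up to the interaction limit) and prunes terms by GCV; neither step is shown here.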
Comparison of Naive Bayes Method and Binary Logistics Regression on Classification of Social Assistance Recipients Program Keluarga Harapan (PKH)
Fanni Rahma Sari; Fadhilah Fitri; Atus Amadi Putra; Dony Permana
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/24

Abstract

Population density is one of the causes of economic inequality in society. One solution provided by the government is to distribute social assistance. In 2007 the government created a social assistance program called the “Program Keluarga Harapan” (PKH) with the aim of alleviating poverty. There are several problems in the distribution of social assistance, one of which is aid that does not reach its intended targets. Therefore, an appropriate method is needed to classify the recipients of social assistance properly. This study uses two methods, Naive Bayes and Binary Logistic Regression, and compares which performs better on the data used. The data are the DTKS data for PKH assistance recipients in Anduring Village in 2020. Based on the results, the accuracy of the Naive Bayes method is 70% and that of Binary Logistic Regression is 73%. The better classification method for these data is therefore Binary Logistic Regression.
Comparison of the Performance of the K-Means and K-Medoids Algorithms in Grouping Regencies/Cities in Sumatera Based on Poverty Indicators
Mardhiatul Azmi; Atus Amadi Putra; Dodi Vionanda; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/25

Abstract

K-Means is a non-hierarchical clustering approach that separates data into a number of groups according to how far each object is from the closest centroid. K-Medoids is a non-hierarchical clustering technique that separates data into a number of groups according to how far each object is from the closest medoid. The two approaches were tested using data on poverty in Sumatra in 2021, when the Covid-19 outbreak had caused the poverty rate to increase from the year before. This is applied research that begins by studying relevant theories. The data used are secondary data on poverty indicators from the BPS website. The study aims to determine regional groups and to compare the grouping results of the k-means and k-medoids methods. The better-performing method is identified as the one with the lower Davies-Bouldin Index (DBI). The k-means algorithm placed 34 districts/cities in cluster 1, 52 in cluster 2, 23 in cluster 3, and 45 in cluster 4, while k-medoids placed 53, 40, 37, and 24 districts/cities in clusters 1, 2, 3, and 4, respectively. The resulting DBI was 1.584 for k-means and 2.359 for k-medoids. The k-means algorithm therefore performs better than k-medoids, because its DBI is smaller.
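Since the two algorithms are ranked by the Davies-Bouldin Index, a minimal sketch of that index may be useful. It assumes Euclidean distance and uses cluster means as representatives (for k-medoids, the medoids could be used instead); the function names and toy points are hypothetical and unrelated to the poverty data.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of points)."""
    cents = [centroid(c) for c in clusters]
    # Scatter: mean distance of each cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    worst = []
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        worst.append(max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k) if j != i
        ))
    return sum(worst) / k


# Toy example: the same two clusters placed far apart, then close together.
far_apart = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 0.0], [10.0, 2.0]]]
close = [[[0.0, 0.0], [0.0, 2.0]], [[2.0, 0.0], [2.0, 2.0]]]
print(davies_bouldin(far_apart), davies_bouldin(close))
```

The well-separated configuration yields the smaller index, which matches the abstract's rule that the method with the lower DBI is preferred.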
Comparison of Naïve Bayes and K-Nearest Neighbor for DKI Jakarta Air Pollution Standard Index Classification
Nurdalia; Zilrahmi; Dony Permana; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/29

Abstract

Data mining is the process of extracting and searching for useful knowledge and information using certain algorithms or methods. The data mining classification methods used in this study are Naïve Bayes and K-Nearest Neighbor. With these methods, it is possible to classify the DKI Jakarta air pollution standard index in 2021 based on six air pollutants: dust particles (PM10), dust particles (PM2.5), sulfur dioxide (SO2), carbon monoxide (CO), ozone (O3), and nitrogen dioxide (NO2). Testing was carried out to determine the accuracy of predicting the DKI Jakarta air pollution standard index in 2021, using evaluation values from the confusion matrix. The better performance of the two methods was found with the Naïve Bayes algorithm, which had high sensitivity values for all categories even though some categories are in the minority. That is, although the data are imbalanced, the Naïve Bayes algorithm shows good performance in accuracy, sensitivity, and specificity.
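The evaluation above relies on accuracy, sensitivity, and specificity derived from a confusion matrix. The sketch below shows one common one-vs-rest way to compute them for a multiclass problem; the function name and the 3-class matrix are hypothetical and do not reproduce the study's results.

```python
def per_class_metrics(cm):
    """Compute overall accuracy and per-class sensitivity/specificity.

    cm[i][j] is the number of items whose true class is i and whose
    predicted class is j. Every class is assumed to have at least one
    true item and at least one item belonging to another class.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true c, predicted other
        fp = sum(cm[r][c] for r in range(k)) - tp  # true other, predicted c
        tn = total - tp - fn - fp
        metrics.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    accuracy = sum(cm[c][c] for c in range(k)) / total
    return accuracy, metrics


# Hypothetical 3-class confusion matrix (rows: true, columns: predicted).
cm = [[8, 2, 0],
      [1, 7, 2],
      [0, 1, 9]]
accuracy, metrics = per_class_metrics(cm)
print(accuracy, metrics[0])
```

Reporting sensitivity per class, rather than accuracy alone, is what makes performance on minority categories visible when the classes are imbalanced.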
Application of Random Forest for The Classification Diabetes Mellitus Disease in RSUP Dr. M. Jamil Padang
Fazhira Anisha; Dodi Vionanda; Nonong Amalita; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/30

Abstract

Diabetes Mellitus is a disease in which blood sugar levels exceed normal (GDS > 200 mg/dl). Diabetes Mellitus may be defined as a disorder of insulin function in the pancreas. It is a world health problem, as the incidence of this disease is increasing in every part of the world, including Indonesia. Prevention and control of the disease are needed so that it does not cause complications in other organs or even death. For this reason, a method is needed to predict the occurrence of this disease and to identify the variables that most affect whether a person suffers from it. This can be accomplished using a classification method, one of which is Random Forest. This case study uses the randomForest package in RStudio. In general, the results of this study are the smallest OOB error rate (%) and the Variable Importance Measure (VIM) based on Mean Decrease Accuracy (MDA) and Mean Decrease Gini (MDG) values. Classification by Random Forest of the incidence of Diabetes Mellitus in RSUP Dr. M. Jamil Padang yields an OOB error rate of 1.2%, or an accuracy rate of 98.8%. The optimal model was produced using mtry = 4 and ntree = 1000. Based on MDA, the most influential variables are Age, Polyphagia, Polyuria, HB, and BMI, while based on MDG they are Age, Polyphagia, BMI, HB, and Delayed Healing.
Application of Random Forest to Identify for Poor Households in West Sumatera Province
Febri Ramayanti; Dodi Vionanda; Dony Permana; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/31

Abstract

Poverty is a socioeconomic problem in Indonesia. The number of people living in poverty in West Sumatera increased by 26.44 thousand from 2020 to 2021. The government has created programs to cope with poverty by taking into account the criteria for poor households. These criteria have been developed using data obtained through the National Socioeconomic Survey (Susenas). However, instead of showing the actual location of poor households, the existing data only indicate the number of poor households, which makes the programs less effective. This could be overcome by classification analysis using random forest (RF). RF is a collection of many decision trees. Before fitting RF, one has to determine the values of three tuning parameters: mtry, ntree, and node size. The results are the smallest OOB error rate (%) and the Variable Importance Measure (VIM). Classification by RF in this research yields an OOB error rate of 5.65%, or an accuracy rate of 94.35%, with tuning parameters mtry = 5 and ntree = 500. Based on the VIM, the criteria for poor households include sources of drinking water such as protected or unprotected spring water and surface water, lighting such as non-PLN electricity or no electricity, cooking fuel such as charcoal and firewood, and the head of the household being self-employed, a family worker, or unpaid with at least a junior high degree.
Nonparametric Regression Modeling with Fourier Series Approach on Poverty Cases in West Sumatra Province
Melin Wanike Ketrin; Fadhilah Fitri; Atus Amadi Putra; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss2/32

Abstract

Poverty is a complex problem that affects various social issues such as education, unemployment, health, and economic growth. Therefore, poverty is an important problem to overcome in order to improve the welfare of the population. One analysis that can be used to model the percentage of poverty is regression analysis, which is divided into two approaches, parametric and nonparametric. Parametric regression relies on several assumptions, whereas the only assumption of nonparametric regression is that the shape of the curve does not follow a particular pattern. There are several approaches to nonparametric regression, one of which is the Fourier series. The purpose of this study is to model the percentage of poverty in West Sumatra Province. The unclear shape of the curve in the data is the reason for using nonparametric regression. The data used are regional data, which tend to fluctuate, so the Fourier series approach is suitable. In this research, nonparametric regression models with one, two, and three oscillation parameters were tried. The best model consisted of two oscillation parameters, with a Generalized Cross Validation (GCV) value of 2.110 and an R² of 92.44%.
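For orientation, one form of the Fourier series estimator that is often used in this literature, together with the usual GCV criterion for a linear smoother, can be written as follows. This is an assumption for illustration: the abstract does not state the exact model. Here K is the number of oscillation parameters, n is the sample size, and A(K) is the hat matrix satisfying \hat{y} = A(K) y.

```latex
\hat{f}_K(x) = \frac{a_0}{2} + \gamma x + \sum_{k=1}^{K} a_k \cos(kx),
\qquad
\mathrm{GCV}(K) =
\frac{n^{-1} \sum_{i=1}^{n} \bigl( y_i - \hat{f}_K(x_i) \bigr)^2}
     {\bigl( n^{-1} \, \mathrm{tr}\bigl( I - A(K) \bigr) \bigr)^2}
```

Under this criterion, the value of K with the smallest GCV is selected, which is consistent with the abstract's choice among one, two, and three oscillation parameters.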
