Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang,
Sumatera Barat
INDONESIA
UNP Journal of Statistics and Data Science
ISSN: -     E-ISSN: 2985-475X     DOI: 10.24036/ujsds
UNP Journal of Statistics and Data Science (UJSDS) is an open access e-journal launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can be research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians from various universities.
Articles 213 Documents
Geographically Weighted Panel Regression for Modeling The Percentage of Poor Population in West Sumatra Jimmi Darma putra; Dina Fitria; Dodi Vionanda; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/64

Abstract

The Geographically Weighted Panel Regression (GWPR) model applies panel regression to spatial data, with parameters estimated using spatial weights at each observation point. The purpose of this study is to determine the GWPR model and the factors that influence the percentage of poor people in each district/city in West Sumatra Province from 2015 to 2021. The adaptive bisquare kernel function was used to provide spatial weighting, and the Cross-Validation (CV) criterion was used to identify the optimal bandwidth. The research used secondary data sourced from the official website and from the Sumatera Barat Dalam Angka books published from 2015 to 2021. The GWPR model combines the GWR model and the FEM panel data regression model. The results show that the models and the factors affecting the percentage of poor people differ across the 19 districts/cities of West Sumatra.
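As a rough illustration of the adaptive bisquare weighting mentioned in this abstract, the sketch below computes kernel weights for one regression point. It is not the authors' code: the distances are hypothetical, and the adaptive bandwidth is assumed to be the distance to the k-th nearest location (counting the point itself), which is one common convention.

```python
def bisquare_weights(distances, bandwidth):
    """Bisquare kernel: w = (1 - (d/h)^2)^2 when d < h, otherwise 0."""
    return [(1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0
            for d in distances]


def adaptive_bandwidth(distances, k):
    """Assumed convention: bandwidth = distance to the k-th nearest location."""
    return sorted(distances)[k - 1]


# Hypothetical distances (km) from one district to itself and four others.
distances = [0.0, 10.0, 20.0, 30.0, 40.0]
h = adaptive_bandwidth(distances, k=4)
weights = bisquare_weights(distances, h)
print(h, weights)
```

Because the bandwidth is recomputed per location, points in sparse areas get a wider kernel than points in dense areas. Choosing k by cross-validation, as the abstract describes, would mean repeating the fit for several k values and keeping the one with the lowest CV score.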
Comparison of Queen Contiguity and Customized Weighting Matrices on Spatial Regression to Identify Factors Impacting Poverty in East Java Gezi Fajri; Syafriandi Syafriandi; Nonong Amalita; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/67

Abstract

Poverty is a crucial problem that has a negative impact on all sectors, including economic, social, and cultural development, in East Java Province. Poverty can also increase unemployment and crime, trigger social disasters, and hinder the progress of East Java Province. One effort to overcome the problem of poverty in East Java Province is to detect the factors that influence it, which can be done through statistical modeling. A statistical model that can identify the factors that influence poverty and explain the relationship between a region and its surrounding area is spatial regression analysis. In spatial regression analysis, a spatial weighting matrix is needed to determine the spatial influence between regions, where one region influences its neighboring regions. A spatial weighting matrix that is often used is queen contiguity. According to Anselin (1988:20), spatial weighting can also consider initial information, the purpose of the case studied, and the theory underlying the research. Such a weighting, which uses the social and economic variables of the case under study, is called a customized weighting matrix. The results of this study show that the best combination of spatial regression model and spatial weighting is the General Spatial Model (GSM) with customized weighting, because it produces better estimation results than the SAR, SEM, and GSM models with queen contiguity weighting in modeling district and city poverty in East Java Province, with an Akaike Information Criterion (AIC) value of 188.77 and a coefficient of determination (R2) of 84.95%. School's Expected Time, Life Expectancy Score, and Employment Participation Rate are the factors that have a substantial impact on the percentage of the population living in poverty in East Java's districts and cities in 2021.
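To make the queen contiguity idea concrete, here is a minimal sketch that builds a queen contiguity matrix for a hypothetical regular grid of regions and row-standardizes it. Real districts are irregular polygons, so the grid layout and the function names are assumptions for illustration only.

```python
def queen_weights(n_rows, n_cols):
    """Queen contiguity on a grid: cells sharing an edge or a corner are neighbours."""
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Skip the cell itself and anything outside the grid.
                    if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i][rr * n_cols + cc] = 1.0
    return w


def row_standardize(w):
    """Scale each row to sum to 1 (rows with no neighbours stay all zero)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


w = queen_weights(3, 3)          # hypothetical 3 x 3 layout of regions
w_std = row_standardize(w)
print(sum(w[4]), sum(w[0]))      # neighbour counts of the centre and a corner cell
```

A customized weighting matrix, as described in the abstract, would replace the 0/1 adjacency entries with values derived from social and economic variables before standardizing.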
Comparison of the Chen and Sinsgh’s Fuzzy Time Series Methods in Forecasting Farmer Exchange Rates in Indonesia Okia Dinda Kelana; Atus Amadi Putra; Nonong Amalita; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/36

Abstract

The Chen and Singh fuzzy time series models are forecasting methods based on fuzzy logic. The models differ in their fuzzy logical relations: Chen's model uses Fuzzy Logical Relationship Groups, whereas Singh's model uses only Fuzzy Logical Relationships in the forecasting process. To find out which of the two models is better, both were used to forecast the Farmer's Exchange Rate. Farmers' exchange rates are an option for observers of agricultural development in assessing the level of welfare of farmers in Indonesia. Because farmer exchange rates change every month, forecasting is needed to obtain an overview for the following month. This is applied research, in which the initial step is to study and analyze the theories related to the research and then collect the necessary data. The data used are secondary data obtained online from the official website of Badan Pusat Statistik (BPS). The forecasting results of the two models were compared using MAPE. The comparison of prediction accuracy on farmers' exchange rates gave a MAPE of 0.679% for Chen's fuzzy time series model and 0.354% for Singh's fuzzy time series model. This means that the better forecasting model for farmer exchange rates in Indonesia is the Singh model.
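The MAPE comparison described above can be sketched as follows. The series and forecasts are made-up numbers, not the paper's data, and the function name is an assumption.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical monthly values and forecasts from two competing models.
actual = [100.0, 200.0]
model_a = [99.0, 202.0]
model_b = [98.0, 204.0]
print(mape(actual, model_a), mape(actual, model_b))
```

The model with the smaller MAPE is preferred, which is the criterion the abstract applies to the Chen and Singh models.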
Pemodelan Waktu Survival Pasien Tuberkulosis menggunakan Regresi Cox Proportional Hazard dengan Data Tersensor Elsa Oktaviani; Nonong Amalita; Atus Amadi Putra; Dony Permana
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/65

Abstract

Tuberculosis is an infectious disease that needs to be watched out for in West Sumatra Province. In 2021, West Sumatra had the 12th highest number of TB cases in Indonesia, with a total of 8,216 cases and a TB treatment cure rate that is still far from the target of the Indonesian Ministry of Health. The purpose of this study is to determine the Cox proportional hazard regression model and the factors that affect the survival time of tuberculosis patients at Dr. M. Djamil Padang Hospital. The survival time used is the period from when the patient underwent TB treatment at RSUP Dr. M. Djamil Padang in 2021 until the patient was declared dead. The parameters of the Cox proportional hazard regression model were estimated using the Maximum Partial Likelihood Estimation method. Based on the Cox proportional hazard regression model, the factors that influence the survival time of tuberculosis patients at RSUP Dr. M. Djamil Padang are BMI, leukocytes, fever, shortness of breath, and decreased appetite.
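For readers unfamiliar with partial likelihood estimation, the sketch below evaluates the Cox log partial likelihood for a single covariate on a tiny, hypothetical dataset. It assumes no tied event times and is only meant to show what quantity is maximized; it is not the study's analysis.

```python
import math


def cox_log_partial_likelihood(times, events, x, beta):
    """Log partial likelihood for one covariate, assuming no tied event times.

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    x      : covariate value per patient
    beta   : regression coefficient being evaluated
    """
    ll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i:
            # Risk set: everyone still under observation at time t_i.
            risk = sum(math.exp(beta * x[j])
                       for j, t_j in enumerate(times) if t_j >= t_i)
            ll += beta * x[i] - math.log(risk)
    return ll


# Hypothetical data: three patients, the second one censored.
times = [1.0, 2.0, 3.0]
events = [1, 0, 1]
x = [1.0, 0.0, 0.0]
print(cox_log_partial_likelihood(times, events, x, beta=0.0))
```

Censored patients contribute only through the risk sets, never through their own term. An estimate of beta would be the value that maximizes this function, typically found numerically.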
Sentiment Analysis of Electric Cars Using Naive Bayes Classifier Method NURUL AFIFAH; Dony Permana; Dodi Vionanda; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/68

Abstract

In recent years, electric cars have become increasingly popular as an environmentally friendly alternative in the automotive industry. These vehicles use electric power as an energy source, which can reduce reliance on fossil fuels and contribute to efforts to minimize greenhouse gas emissions and air pollution. However, the presence of electric cars raises pro and con opinions from the public, and the conversation about electric cars has become one of the hot topics on social media. Twitter is a microblogging social media platform that permits its users to create short messages and share them easily and quickly. These opinions require sentiment analysis. The purpose of sentiment analysis is to find out whether people's perceptions and opinions of electric cars lean in a favorable or unfavorable direction. Sentiment analysis can thus help companies with marketing strategies and better business decisions. The opinions are classified into positive and negative categories. This study employs the Naive Bayes classifier method to classify positive and negative sentiment towards electric cars on Twitter. The accuracy of Naive Bayes, obtained using a confusion matrix, is 77.8% with a dataset split of 70%:30%.
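The accuracy figure in this abstract is derived from a confusion matrix. A minimal sketch of that calculation, using hypothetical labels rather than the study's tweets, might look like this:

```python
def confusion_matrix(actual, predicted, labels=("positive", "negative")):
    """Count (actual, predicted) label pairs."""
    counts = {(a, p): 0 for a in labels for p in labels}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts


def accuracy(counts):
    """Share of cases on the diagonal of the confusion matrix."""
    correct = sum(n for (a, p), n in counts.items() if a == p)
    return correct / sum(counts.values())


# Hypothetical test-set labels and classifier output.
actual = ["positive", "positive", "negative", "negative", "positive"]
predicted = ["positive", "negative", "negative", "negative", "positive"]
cm = confusion_matrix(actual, predicted)
print(cm, accuracy(cm))
```

In a 70%:30% split, only the 30% held-out portion would be passed to these functions, so the accuracy reflects performance on data the classifier did not see during training.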
Comparison of Error Rate Prediction Methods in Classification Modeling with Classification and Regression Tree (CART) Methods for Balanced Data Fitria Panca Ramadhani; Dodi Vionanda; Syafriandi Syafriandi; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/73

Abstract

CART (Classification and Regression Tree) is one of the classification algorithms in the decision tree method. The model formed in CART is a tree consisting of root nodes, internal nodes, and terminal nodes. After the model is formed, its accuracy must be calculated in order to assess the performance of the model. The accuracy can be determined by estimating the model's error rate. Error rate prediction methods work by dividing the data into training data and testing data. There are three such methods: Leave One Out Cross Validation (LOOCV), Hold Out (HO), and K-Fold Cross Validation. These methods divide the data into training and testing data differently, so each has advantages and disadvantages. Therefore, the three error rate prediction methods were compared with the aim of determining the appropriate method for the CART algorithm. The comparison considered several factors, for instance, variations in the mean, the number of variables, and correlations in normally distributed random data. The results were compared using boxplots, looking for the lowest median error rate and the lowest variance. The results of this study indicate that the K-Fold Cross Validation method has the lowest median error rate and the lowest variance, so it is the most suitable error rate prediction method for CART.
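The three data-splitting schemes compared in this abstract differ only in how indices are assigned to training and testing sets. The sketch below shows one plausible way to generate those index splits; the function names, the 30% hold-out fraction, and the seed are assumptions, not the study's settings.

```python
import random


def hold_out(n, test_fraction=0.3, seed=0):
    """Single random split into (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return [(idx[n_test:], idx[:n_test])]


def k_fold(n, k=5, seed=0):
    """k splits; each fold serves as the test set exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [([j for m, fold in enumerate(folds) if m != i for j in fold], folds[i])
            for i in range(k)]


def leave_one_out(n):
    """LOOCV is k-fold with k equal to the number of observations."""
    return k_fold(n, k=n)


for name, splits in [("HO", hold_out(10)), ("5-fold", k_fold(10)), ("LOOCV", leave_one_out(10))]:
    print(name, len(splits), "split(s), test size", len(splits[0][1]))
```

An error rate estimate would be obtained by fitting the tree on each training set, counting misclassifications on the matching test set, and averaging over the splits.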
Comparasion of Error Rate Prediction Methods of C4.5 Algorithm for Balanced Data Ichlas Djuazva; Dodi Vionanda; Nonong Amalita; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/74

Abstract

C4.5 is a highly effective decision tree algorithm for classification purposes. Compared to CHAID, CART, and ID3, C4.5 generates the decision tree faster, and the tree is easier to understand. However, the C4.5 algorithm is not exempt from classification errors, which can affect the accuracy of the resulting model. Model accuracy can be measured by predicting the error rate. One commonly used method for error rate prediction is cross-validation, which divides the data into two parts: a training set to build the model and a testing set to test it. Several cross-validation techniques are commonly used to predict the error rate, such as Leave One Out Cross Validation (LOOCV), Hold Out (HO), and k-fold cross-validation. LOOCV gives an unbiased estimate but takes a long time and depends on the data size; HO can avoid overfitting and works faster; and k-fold cross-validation gives a smaller predicted error rate. This study uses artificially generated data with a normal distribution, including univariate, bivariate, and multivariate datasets with various combinations of mean differences and correlations. Different correlation structures are applied to see their impact on the error rate prediction methods. Considering these factors, this research compares the three cross-validation methods for predicting the error rate of the decision tree model generated by the C4.5 algorithm. The research found that k-fold cross-validation is the most suitable cross-validation method for testing the model generated by the C4.5 algorithm with balanced data.
Comparison of Fuzzy Time Series Markov Chain and Fuzzy Time Series Cheng to Predict Inflation in Indonesia Ihsanul Fikri; Admi Salma; Dodi Vionanda; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/76

Abstract

Inflation is one of the main macroeconomic problems and a very important economic indicator. Unstable inflation has a negative impact on people's welfare, so controlling inflation is important for a country. Forecasting is needed to monitor future movements in the inflation rate. In this study, the Fuzzy Time Series Markov Chain and Fuzzy Time Series Cheng methods are compared in forecasting inflation. The advantage of the fuzzy time series method is that it has no special assumptions that must be met. The purpose of this study is to determine the forecasting results based on the comparison of the two methods. Based on the MAPE value, Fuzzy Time Series Markov Chain has the smallest value, 6.97%. The inflation forecasts for the next 5 periods using the Fuzzy Time Series Markov Chain method are 5.42, 5.71, 5.95, 5.82, and 6.10.
Step Function Intervention Analysis Model to Estimate Number of Aircraft Passengers in Minangkabau International Airport Velya Rahma Putri; Zilrahmi; Syafriandi Syafriandi; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/77

Abstract

The Covid-19 pandemic had a considerable impact on air transportation. Minangkabau International Airport (BIM) also felt the impact of this pandemic, namely a drastic decrease in the number of airplane passengers, which constitutes an intervention event. A stable number of airplane passengers is needed to indicate a stable economy in the transportation sector. If there are no passengers or flight activity in an area, there is no inflow and outflow of the economic, industrial, tourism, and trade activities that help economic development. For this reason, forecasting is necessary so that the problems arising from the drastic decline can be addressed by making new policies. Forecasting was carried out in this study to obtain an intervention model that will be used to forecast the next 12 months and to predict how long the effect of the intervention will last, in order to avoid further losses due to the continued decline in the number of passengers. The intervention model is considered better than SARIMA models for data that contain an intervention variable. The forecasting results showed that the SARIMA (0,1,1)(1,1,1)12 model with intervention orders b = 0, s = 8, r = 1 is the best model for forecasting data containing interventions. This is evidenced by the small MAPE of 36.34%, so the model is feasible to use because the accuracy is quite high and close to the actual value.
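A step function intervention is usually encoded as a 0/1 series that switches on when the event starts and stays on. The sketch below builds such a series and a simplified first-order response to it. The single omega coefficient, the delta value, and the series length are assumptions for illustration; the paper's model uses s = 8, which would involve more omega terms.

```python
def step_indicator(n, t0):
    """Step function: 0 before the intervention time t0, 1 from t0 onward."""
    return [1 if t >= t0 else 0 for t in range(n)]


def intervention_effect(step, omega, delta, b=0):
    """Simplified response with r = 1: z_t = delta * z_{t-1} + omega * S_{t-b}."""
    effect, prev = [], 0.0
    for t in range(len(step)):
        s = step[t - b] if t - b >= 0 else 0   # apply the delay b
        prev = delta * prev + omega * s
        effect.append(prev)
    return effect


# Hypothetical example: intervention starts at t = 1, immediate drop of 10,
# with the effect accumulating at rate delta = 0.5.
step = step_indicator(4, t0=1)
print(step, intervention_effect(step, omega=-10.0, delta=0.5))
```

In a full intervention analysis this effect series would be added to a SARIMA model fitted to the pre-intervention data; with 0 < delta < 1 the effect of a step levels off at omega / (1 - delta).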
Analysis of the Poverty Level Model for West Sumatra Province Using Geographically Weighted Binary Logistic Regression april leniati; Dony Permana; Nonong Amalita; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/80

Abstract

West Sumatra Province (West Sumatra) ranks third lowest in terms of the poverty rate on the island of Sumatra in 2022, with a figure of 5.92%. Although this figure is lower than the national average, the Province of West Sumatra is targeting a reduction in the poverty rate to 5.62% in 2024 in the vision of the 2021–2026 Regional Development Plan. The purpose of this study is to analyze the factors that contribute to the poverty rate in West Sumatra Province based on geography in 2022. The method used to address poverty problems is Geographically Weighted Binary Logistic Regression (GWBLR), which takes geographical influences into account in the analysis. This study uses data on the percentage of poor people (Y) and the influencing factors, namely life expectancy (X1), literacy rate (X2), labor force participation (X3), and economic growth (X4). The results showed that based on the lowest Akaike Information Criterion Corrected (AICc) value, the GWBLR model with a Fixed Gaussian Kernel weight is the best at modeling the problem of poverty in West Sumatra in 2022. According to the model, the life expectancy variable will have a significant impact on the level of poverty in 13 districts and cities in West Sumatra Province in 2022.
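For the fixed Gaussian kernel named in this abstract, a minimal sketch of the weight calculation is given below. It assumes the common form w = exp(-0.5 (d/h)^2) with one bandwidth h shared by all locations; the distances and bandwidth are hypothetical.

```python
import math


def gaussian_weights(distances, bandwidth):
    """Fixed Gaussian kernel (assumed form): w = exp(-0.5 * (d / h)^2)."""
    return [math.exp(-0.5 * (d / bandwidth) ** 2) for d in distances]


# Hypothetical distances (km) from one district to itself and three others.
distances = [0.0, 10.0, 20.0, 50.0]
weights = gaussian_weights(distances, bandwidth=10.0)
print(weights)
```

With a fixed kernel every location uses the same bandwidth, and the weights decay smoothly with distance without ever reaching exactly zero.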
