Articles

Comparison Fuzzy Time Series Cheng and Ruey Chyn Tsaur Model for Forecasting Sales at Empat Saudara Store Muhammad Alif Yustin; Zilrahmi; Atus Amadi Putra; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/56

Abstract

Trading is a type of business that focuses on buying goods and reselling them for a profit without changing the condition of the goods. A problem that often occurs at the Empat Saudara Store is a surplus or shortage of stock: goods run short when consumer demand is high and go unsold when demand is low. One way to address this is to stabilize sales by forecasting future sales. Forecasting aims to estimate what will happen in the future using historical data. The research method used is Fuzzy Time Series (FTS), which captures patterns in past data and projects them forward based on linguistic values. The FTS models used are FTS Cheng and FTS Ruey Chyn Tsaur. The five-period forecasts for FTS Cheng are 200,668.2, 171,761.5, 222,412.6, 214,507.4, and 216,294.3, and those for the FTS Ruey Chyn Tsaur model are 198,600, 229,094.2, 202,203.05, 230,804.80 ,6. The MAPE is 9.904% for the FTS Cheng model and 14.01% for the FTS Ruey Chyn Tsaur model. From these results it can be concluded that the FTS Cheng model predicts sales at the Empat Saudara Store better than the FTS Ruey Chyn Tsaur model.
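The two models above are ranked by MAPE (Mean Absolute Percentage Error). As a minimal sketch of how that metric is usually computed, the function below averages the absolute percentage errors. The function name `mape` and the sample sales figures are illustrative assumptions, not data from the study.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Assumes both sequences have the same length and that no
    actual value is zero (the ratio is undefined otherwise).
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical sales figures, for illustration only.
actual_sales = [100.0, 200.0]
model_forecast = [110.0, 180.0]
print(mape(actual_sales, model_forecast))
```

A lower MAPE means the forecasts deviate less, in relative terms, from the observed values, which is the basis on which the abstract prefers the FTS Cheng model.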
Application of singular spectrum analysis method to forecast rice production in west sumatra: Artikel nazifatul azizah Nazifatul Azizah; Fadhilah Fitri; Dodi Vionanda; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/58

Abstract

An imbalance between population and rice production causes negative impacts such as food crises and increasing poverty, so forecasting is needed to maintain future food availability. This study aims to forecast rice production in West Sumatra Province for 12 periods in 2023 using the Singular Spectrum Analysis (SSA) method. Based on the analysis, rice production over the 12 periods of 2023 tends to decrease compared to the previous year. Forecasting rice production using SSA with window length L=21 yields a MAPE of 17.69%, which the study regards as accurate.
Analysis of Factors Influencing the Population Growth Rate in West Sumatra Using Geographically Weighted Logistic Regression Rizqia Salsabila; Atus Amadi Putra; Nonong Amalita; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/59

Abstract

The Geographically Weighted Logistic Regression (GWLR) model is an extension of the logistic regression model applied to spatial data. GWLR parameters are estimated at each observation location using spatial weighting. The purpose of this research was to fit a GWLR model to the dichotomous Population Growth Rate (PGR) indicator for each of the 19 districts/cities in West Sumatra in 2020 and to identify the factors that influence the probability that the population growth rate will increase. The parameters of the GWLR model are estimated using the Maximum Likelihood Estimation (MLE) method. Spatial weights for parameter estimation are determined using the Fixed Gaussian Kernel weighting function, with the optimal bandwidth chosen by Akaike's Information Criterion (AIC). The categorical response variable is the population growth rate in each district/city in West Sumatra in 2020, and the predictor variables are the number of couples of childbearing age, the number of live births, the number of in-migrants, and the number of out-migrants. The results show that the GWLR model is better than the logistic regression model and that 4 groups of districts/cities are formed based on the factors that affect the increase in population growth rate.
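The fixed Gaussian kernel mentioned above assigns each observation a weight that decays with its distance from the location being estimated. A minimal sketch, assuming Euclidean distance on planar coordinates and the common form w = exp(-0.5 (d/h)^2); the function name, the coordinates, and the bandwidth value are hypothetical and not taken from the study.

```python
import math


def gaussian_kernel_weights(coords, target, bandwidth):
    """Fixed Gaussian kernel weights of every location relative to `target`.

    w_j = exp(-0.5 * (d_j / bandwidth) ** 2), where d_j is the Euclidean
    distance from location j to the target location. The same bandwidth
    is used at every target location (hence "fixed").
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return [
        math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2)
        for c in coords
    ]


# Hypothetical planar coordinates for three locations.
locations = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
weights = gaussian_kernel_weights(locations, target=(0.0, 0.0), bandwidth=5.0)
print(weights)
```

In a full GWLR fit these weights would enter the local likelihood at each location, and the bandwidth would be tuned (here, per the abstract, by AIC) rather than fixed by hand.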
Comparing Classification and Regression Tree and Logistic Regression Algorithms Using 5×2cv Combined F-Test on Diabetes Mellitus Dataset Fashihullisan; Dodi Vionanda; Yenni Kurniawati; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/84

Abstract

Classification is the process of finding a model that describes and distinguishes data classes so that it can be used to predict the class of objects whose labels are unknown. There are several classification algorithms, such as Classification and Regression Tree (CART) and logistic regression. The k-fold cross validation method has a weakness for algorithm comparison: different folds can produce different prediction errors, so the results of comparing algorithm performance can also differ. Therefore, for the algorithm comparison problem, the researcher applies the 5×2cv t test and the 5×2cv combined F test. Out of 100 iterations, the 10-fold cross validation method was consistent only three times, which shows that k-fold cross validation has poor consistency in comparing CART and logistic regression on the diabetes mellitus data. In addition, the results show that the 5×2cv combined F test is better suited for drawing conclusions from a comparison of the two algorithms because it produces only one decision, in contrast to the 5×2cv t test, which can yield different decisions from its 10 test statistics and so makes it difficult for researchers to draw conclusions when comparing CART and logistic regression.
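To illustrate why the t test can give up to 10 different answers while the combined F test gives one, the sketch below computes both from the same inputs, using the standard textbook definitions of the 5×2cv statistics. The input is assumed to be five pairs of error-rate differences (algorithm A minus algorithm B), one pair per replication of 2-fold cross validation; the function name and the sample differences are hypothetical.

```python
import math


def five_by_two_cv_statistics(diffs):
    """Return (t_statistics, f_statistic) for 5x2cv.

    diffs: five (p1, p2) pairs, where p1 and p2 are the differences in
    error rate between the two algorithms on fold 1 and fold 2 of one
    replication of 2-fold cross validation.
    """
    if len(diffs) != 5:
        raise ValueError("expected exactly five replications")
    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)
    total_variance = sum(variances)
    if total_variance == 0:
        raise ValueError("statistics are undefined when all variances are zero")
    # One t statistic per choice of numerator: 10 candidates in total.
    # Each is compared against a t distribution with 5 degrees of freedom.
    t_statistics = [
        p / math.sqrt(total_variance / 5) for pair in diffs for p in pair
    ]
    # The combined F statistic pools all 10 differences into one value,
    # compared against an F distribution with (10, 5) degrees of freedom.
    f_statistic = sum(p ** 2 for pair in diffs for p in pair) / (2 * total_variance)
    return t_statistics, f_statistic


# Hypothetical error-rate differences, for illustration only.
sample = [(0.02, 0.04), (0.01, 0.05), (0.03, 0.02), (0.00, 0.04), (0.02, 0.03)]
t_stats, f_stat = five_by_two_cv_statistics(sample)
T_CRITICAL = 2.571  # two-sided 5% critical value, t with 5 df (table value)
F_CRITICAL = 4.74   # 5% critical value, F with (10, 5) df (table value)
print([abs(t) > T_CRITICAL for t in t_stats])  # may mix True and False
print(f_stat > F_CRITICAL)                     # a single decision
```

Because each of the 10 t statistics uses a different numerator, their reject/fail-to-reject decisions need not agree, whereas the combined F statistic always yields exactly one decision.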
Emprical Study for Algorithms Comparison of Classification and Regression Tree and Logistic Regression Using Combined 5×2cv F Test Fayza Annisa Febrianti; Dodi Vionanda; Yenni Kurniawati; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/85

Abstract

Classification is a method to estimate the class of an object based on its characteristics. Several learning algorithms can be applied in classification, such as Classification and Regression Tree (CART) and logistic regression. The main goal of classification is to find the learning algorithm that produces the best classifier. When comparing two learning algorithms, a direct comparison of their prediction error rates may be sufficient when the difference is very clear. Otherwise, a direct comparison can be misleading and lead to inadequate conclusions. Therefore, a statistical test is needed to determine whether the difference is real or random. The results of the 5×2cv paired t-test sometimes reject and sometimes fail to reject the hypothesis. This is troubling because changing which error rate difference is used should not affect the test result. Meanwhile, the overall results of the combined 5×2cv F test show that the tests fail to reject the hypothesis. This indicates that the performance of CART and logistic regression does not differ significantly in this case.
Sentiment Analysis of Prabowo Subianto as 2024 Presidential Candidate on Twitter Using K-Nearest Neighbor Algorithm Aurumnisva Faturrahmi; Zamahsary Martha; Yenni Kurniawati; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/101

Abstract

The presidential election is one of the most talked-about topics at the moment. Based on many surveys, Prabowo Subianto is one of the strongest candidates for the upcoming 2024 presidential election. This research aims to determine whether public sentiment towards Prabowo Subianto as a presidential candidate tends to be positive or negative. Sentiment classification was conducted using the K-Nearest Neighbor (KNN) algorithm, which classifies sentiment based on the k nearest neighbors. The analysis was conducted in several stages: data collection, text preprocessing, data labelling, data classification using the KNN algorithm, and evaluation of the model's accuracy in classifying sentiment. The sentiment classification yielded 2731 positive sentiments and 76 negative sentiments. The accuracy of the model using k = 3 with an 80:20 split of training and testing data is 97.33%.
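The KNN step described above can be sketched as a majority vote among the k closest training examples. This is a minimal illustration under stated assumptions: tweets are assumed to have already been turned into numeric feature vectors (the abstract does not say how), Euclidean distance is assumed, and the toy vectors and labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    if k < 1 or k > len(train_vectors):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's distance to the query with its label,
    # then keep the k closest.
    neighbours = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors standing in for preprocessed tweets.
vectors = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3), (0.1, 0.9), (0.2, 0.8)]
labels = ["positive", "positive", "positive", "negative", "negative"]
print(knn_predict(vectors, labels, query=(0.75, 0.25), k=3))
```

With a class balance as skewed as the one reported (2731 positive versus 76 negative), overall accuracy can be high even if few negative examples are classified correctly, so per-class metrics are worth checking alongside it.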
Fuzzy Geographically Weighted Clustering Analysis for Sectoral Potential Gross Regional Domestic Product in West Sumatera Syifa Nabilah Wandira; Zilrahmi; Syafriandi Syafriandi; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/109

Abstract

Gross Regional Domestic Product (GRDP) is the sum of the added value of all goods and services produced in an area as a result of various economic activities in a certain period. Each region has its own advantages and potential, for example in particular sectors or business fields. GRDP inequality occurs due to differences in geographical conditions and natural resources in each region. One method that can be used to examine this inequality is cluster analysis. Cluster analysis groups data objects that have similar characteristics, so that the inequality can be seen from the clusters formed. Fuzzy Geographically Weighted Clustering is a clustering method based on fuzzy logic that gives a geographic effect to each cluster so that it can better describe the actual cluster situation. The research obtained 3 optimum clusters with different characteristics. Cluster 1 has high potential, cluster 2 has low potential, and cluster 3 has medium potential in forming GRDP.
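The fuzzy part of the method means that each region belongs to every cluster to some degree rather than to exactly one. The sketch below shows only the standard fuzzy c-means membership formula for a single point; the geographic weighting step that distinguishes Fuzzy Geographically Weighted Clustering is deliberately omitted, and the function name, fuzzifier value, and sample data are assumptions for illustration.

```python
import math


def fuzzy_memberships(point, centers, fuzzifier=2.0):
    """Fuzzy c-means membership degrees of `point` in each cluster.

    u_k = 1 / sum_j (d_k / d_j) ** (2 / (m - 1)), where d_k is the
    distance to center k and m > 1 is the fuzzifier. Memberships sum to 1.
    """
    if fuzzifier <= 1:
        raise ValueError("fuzzifier must be greater than 1")
    distances = [math.dist(point, center) for center in centers]
    # A point that coincides with a center belongs fully to that cluster
    # (assumes the centers themselves are distinct).
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (fuzzifier - 1.0)
    return [
        1.0 / sum((d_k / d_j) ** exponent for d_j in distances)
        for d_k in distances
    ]


# Hypothetical one-feature example with two cluster centers.
print(fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)]))
```

A region is typically assigned to the cluster in which its membership degree is highest, which is how a fuzzy result is reported as a fixed number of clusters.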
Forecasting the Exchange Rate of Yen to Rupiah Using the Long Short-Term Memory Method Anggi Adrian Danis; Yenni Kurniawati; Nonong Amalita; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/114

Abstract

Long Short-Term Memory (LSTM) is a modification of the Recurrent Neural Network (RNN) that addresses the problems of exploding and vanishing gradients and makes it possible to manage long-term information. To tackle these problems, the RNN is modified with memory cells that can store information for long periods. This study aimed to forecast the exchange rate of Yen to Rupiah using the LSTM method. The data used in this research is daily purchasing rate data from January 2020 to May 2023, which consists of 848 observations. The data was divided into two sets: 80% for training and 20% for testing. For the forecasting process, experiments were conducted to identify the best model by adjusting several hyperparameters. The performance of each model was evaluated using the Mean Absolute Percentage Error (MAPE). According to the experimental results, the best model was the LSTM model with a batch size of 20, 150 epochs, and 50 neurons per layer, which yielded a MAPE value of 1.5399.
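Before an LSTM can be trained, a daily series like the one above is usually split and reshaped into input windows with a next-step target. The sketch below shows one common way to do this. It assumes a chronological 80:20 split (the abstract gives the ratio but not the ordering) and a hypothetical window length; neither the function names nor the window size come from the study.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a series in time order: earliest part for training, rest for testing."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def make_windows(series, window):
    """Turn a series into (inputs, targets) pairs for one-step-ahead forecasting.

    Each input is `window` consecutive values; its target is the next value.
    """
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Placeholder values standing in for 848 daily exchange rates.
rates = [float(day) for day in range(848)]
train, test = chronological_split(rates)
print(len(train), len(test))

# Hypothetical window of 30 days; the study's window length is not stated.
train_inputs, train_targets = make_windows(train, window=30)
print(len(train_inputs), len(train_inputs[0]))
```

Splitting in time order keeps future observations out of the training set, which matters for forecasting because a shuffled split would let the model see values from after the period it is asked to predict.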
Comparison of Error Prediction Methods in Claassification Modeling with CHAID Methods for Balanced Data Findri Wara Putri; Dodi Vionanda; Atus Amadi Putra; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/116

Abstract

Chi-Squared Automatic Interaction Detection (CHAID) is an exploratory method for classifying data by building classification trees. The classification results are displayed as a tree diagram. After the model is formed, its accuracy must be calculated in order to assess its performance. The accuracy can be determined by estimating the model's prediction error rate. Error rate prediction methods work by dividing the data into training data and testing data. Three such methods are considered: leave-one-out cross validation (LOOCV), hold out, and k-fold cross validation. These methods divide the data into training and testing data in different ways, so each has advantages and disadvantages. Therefore, the three error rate prediction methods were compared to determine which is most appropriate for CHAID. This is experimental research that uses simulated data generated in RStudio. The comparison considers several factors, namely the marginal probability matrix and different correlations. The results are examined using boxplots, looking at the median error rate and the lowest variance. This research found that k-fold cross validation is the most suitable error rate prediction method for the CHAID method on balanced data.
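The three error rate prediction methods differ only in how they partition observation indices into training and testing sets. The sketch below generates those partitions with the standard library; the function names, the seed, and the test fraction are illustrative assumptions, and the study itself worked in RStudio rather than Python.

```python
import random


def holdout_split(n, test_fraction=0.2, seed=0):
    """One random split of indices 0..n-1 into (train, test)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n, k, seed=0):
    """Yield k (train, test) splits; each index is tested exactly once."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def loocv_splits(n):
    """Leave-one-out is k-fold with k equal to the number of observations."""
    return k_fold_splits(n, n)


for train, test in k_fold_splits(10, 5):
    print(sorted(test))
```

Hold out evaluates the model once on a single test set, k-fold averages over k test sets, and LOOCV averages over n test sets of size one, which is why the three can differ in both the median and the variance of the estimated error rate.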
Implementation Self Organizing Maps Method In Cluster Analysis Based on Achievement Suistainable Development Goal/SDG’s West Sumatera Province AL Rezki Ivansyah; Fadhilah Fitri; Yenni Kurniawati; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/118

Abstract

The Indonesian government is committed to implementing the Sustainable Development Goals (SDGs) agenda, including in West Sumatra. The government of West Sumatra supports the objectives and targets of the SDGs by optimizing the implementation of SDG indicators in the Rencana Aksi Daerah (RAD) for SDGs of West Sumatra Province for the years 2022-2026. However, its execution requires annual monitoring and evaluation of the RAD for SDGs in West Sumatra Province. Clustering is employed as an input for evaluating the implementation of the RAD for SDGs in West Sumatra Province for the years 2022-2026. The clustering method used is the Self Organizing Map (SOM), an effective tool for visualizing high-dimensional data that maps it into one, two, or three dimensions represented by connected units or neurons. The data consist of 14 SDG indicator variables across 19 regencies/cities in West Sumatra in 2022, sourced from the official website and publications of the Badan Pusat Statistik (BPS) of West Sumatra Province. The analysis results in 3 clusters with different characteristics, which can serve as references for policy decisions and effective strategies to enhance the implementation of SDG programs in West Sumatra Province.
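A SOM learns by repeatedly finding the neuron closest to an input and pulling that neuron and its neighbours towards the input. The sketch below shows a single such update on a one-dimensional map. It is a minimal illustration: the Gaussian neighbourhood, learning rate, radius, and toy weights are assumptions, and the study's actual map size and training settings are not given in the abstract.

```python
import math


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to input x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def som_update(weights, x, learning_rate=0.5, radius=1.0):
    """One SOM training step on a 1-D map; returns (bmu_index, new_weights).

    Every neuron moves towards x, scaled by a Gaussian neighbourhood
    function of its grid distance to the best matching unit (BMU).
    """
    bmu = best_matching_unit(weights, x)
    new_weights = []
    for i, w in enumerate(weights):
        grid_distance = abs(i - bmu)
        influence = math.exp(-(grid_distance ** 2) / (2 * radius ** 2))
        new_weights.append(
            [wi + learning_rate * influence * (xi - wi) for wi, xi in zip(w, x)]
        )
    return bmu, new_weights


# Hypothetical map of three neurons with two features each.
neurons = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
bmu, neurons = som_update(neurons, x=[1.2, 0.8])
print(bmu, neurons)
```

After many such steps over all observations, each regency/city can be assigned to its best matching unit, and nearby units can then be grouped into clusters.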