Articles

Grouping The Regencies/Cities in Indonesia Based on Expenditure Groups Inflation Value Using DBSCAN Method Meliani Putri; Dony Permana; Syafriandi Syafriandi; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/61

Abstract

Differences in the characteristics of regencies/cities in Indonesia can lead to differences in inflation values across expenditure groups, and these differences in turn affect Indonesia's national inflation. The purpose of this research is to group regencies/cities based on expenditure-group inflation values and to identify the characteristics of the resulting groups. DBSCAN is a density-based, non-hierarchical clustering method that can be used on data containing noise. The data used in this study are secondary data obtained from publications of Badan Pusat Statistik Republic of Indonesia (BPS RI) on expenditure-group inflation values. The analysis includes outlier detection, grouping with the DBSCAN method, cluster validation with the silhouette coefficient, and identification of the characteristics of the clusters formed. The grouping produces two clusters with a silhouette coefficient of 0.65. Cluster 0 is a noise cluster consisting of 3 regencies/cities with high expenditure-group inflation values. Cluster 1 consists of 87 regencies/cities with low expenditure-group inflation values.
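As an illustration of how DBSCAN separates dense groups from noise, the following is a minimal pure-Python sketch, not the authors' implementation. The function name `dbscan`, the parameters `eps` and `min_pts`, and the toy coordinates are assumptions made for this example; label 0 is used for noise to mirror the "cluster 0" convention in the abstract above.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dbscan(points, eps, min_pts):
    """Return one label per point: 1, 2, ... for clusters, 0 for noise."""
    NOISE, UNSEEN = 0, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Toy data (invented): one dense group and two isolated points.
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (-8, 7)]
print(dbscan(toy, eps=1.5, min_pts=3))
```

On the toy data, the five nearby points form one cluster and the two isolated points are labeled as noise, which is the same kind of split the abstract reports (a small noise group and one large cluster).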
Geographically Weighted Panel Regression Modeling on Human Development Index in West Sumatra Amelia Fadila Rahman; Syafriandi Syafriandi; Nonong Amalita; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/63

Abstract

The Human Development Index (HDI) is an important issue for human development and people's welfare in West Sumatra Province. One way to address it is to identify the components that contribute to the HDI. Geographically Weighted Panel Regression (GWPR) is a technique that can be used to find influencing factors while accounting for the characteristics of each observation area. GWPR combines panel data regression with Geographically Weighted Regression (GWR) and is used when the data exhibit spatial heterogeneity. The purpose of this study is to build a GWPR model of the HDI in the regencies/cities of West Sumatra from 2019 to 2022. The modeling uses a GWPR Fixed Effect Model. The weighting function used is a fixed exponential kernel, with a minimum cross-validation (CV) score of 0.000208. The resulting model has an R² of 99.9%, meaning that the predictor variables explain this percentage of the variation in HDI. The variables with a significant effect on HDI are Life Expectancy, Expected Years of Schooling, Mean Years of Schooling, and Purchasing Power Parity.
Data Sharing Technique for Electronic Health Record (EHR) Classification using Support Vector Machine Algorithm Moh. Erkamim; Said Thaufik Rizaldi; Sepriano Sepriano; Khoirun Nisa; Sulhatun Sulhatun; Zilrahmi Zilrahmi; Winalia Agwil
Indonesian Journal of Artificial Intelligence and Data Mining Vol 6, No 1 (2023): Maret 2023
Publisher : Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI: 10.24014/ijaidm.v6i1.24794

Abstract

The Electronic Health Record (EHR) integrates information about a patient's medical history, complications, and history of drug use. Fast, efficient, and effective service depends on accurate patient history data, especially when deciding between outpatient and inpatient care. To improve accuracy, this study combined the c, γ, and degree parameters of the Linear, Polynomial, and Radial Basis Function (RBF) kernels with three data sharing (splitting) techniques: 10-fold cross-validation, K-Medoids, and hold-out (70%/30%). The K-Medoids data sharing technique performed best, reaching an accuracy of 75.76% with the Polynomial kernel and 75.56% with the RBF kernel. The combination of K-Medoids and the Polynomial kernel in the Support Vector Machine (SVM) algorithm can therefore be used in this research case.
Comparasion of Error Rate Prediction Methods of C4.5 Algorithm for Balanced Data Ichlas Djuazva; Dodi Vionanda; Nonong Amalita; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/74

Abstract

C4.5 is a highly effective decision tree algorithm for classification. Compared to CHAID, CART, and ID3, C4.5 generates the decision tree faster and is easier to understand. However, the C4.5 algorithm is not exempt from classification errors, which affect the accuracy of the resulting model. Model accuracy can be assessed by predicting the error rate. One commonly used approach to error rate prediction is cross-validation, which divides the data into two parts: a training set to build the model and a testing set to test it. Several cross-validation techniques are commonly used to predict the error rate, such as Leave One Out Cross Validation (LOOCV), Hold Out (HO), and k-fold cross-validation. LOOCV gives an unbiased estimate but takes a long time, depending on the data size; HO can avoid overfitting and runs faster; and k-fold cross-validation gives a smaller predicted error rate. This study uses artificially generated, normally distributed data, including univariate, bivariate, and multivariate datasets with various combinations of mean differences and correlations. Different correlation structures are applied to see their impact on the error rate prediction methods. With these factors in mind, this research compares the three cross-validation methods for predicting the error rate of the decision tree model generated by the C4.5 algorithm. It finds that k-fold cross-validation is the most suitable cross-validation method for testing a model generated by the C4.5 algorithm on balanced data.
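To make the difference between the three splitting schemes in the abstract above concrete, here is a small pure-Python sketch under stated assumptions. The function names (`hold_out`, `k_fold`, `leave_one_out`, `cv_error_rate`) are hypothetical, and a nearest-class-mean classifier stands in for C4.5, which is not implemented here; the toy data are invented.

```python
import random


def hold_out(n, test_fraction=0.3, seed=0):
    """One random train/test split of the indices 0..n-1."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * test_fraction)
    return [(idx[cut:], idx[:cut])]


def k_fold(n, k, seed=0):
    """k splits; each index appears in exactly one test fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def leave_one_out(n):
    """LOOCV is k-fold with k equal to the number of observations."""
    return k_fold(n, n)


def cv_error_rate(splits, fit, predict, X, y):
    """Share of misclassified test observations, pooled over all splits."""
    errors = total = 0
    for train, test in splits:
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            errors += predict(model, X[i]) != y[i]
            total += 1
    return errors / total


# Stand-in classifier (NOT C4.5): assign the class with the nearest mean.
def fit_nearest_mean(X, y):
    return {c: sum(x for x, t in zip(X, y) if t == c) / y.count(c) for c in set(y)}


def predict_nearest_mean(model, x):
    return min(model, key=lambda c: abs(x - model[c]))


# Invented, balanced, well-separated univariate data.
X = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0, 14.0]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
for name, splits in [("HO", hold_out(10)), ("5-fold", k_fold(10, 5)), ("LOOCV", leave_one_out(10))]:
    print(name, cv_error_rate(splits, fit_nearest_mean, predict_nearest_mean, X, y))
```

The sketch also shows why LOOCV cost grows with the data size: it fits one model per observation, whereas hold-out fits a single model and k-fold fits k.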
Comparison of Fuzzy Time Series Markov Chain and Fuzzy Time Series Cheng to Predict Inflation in Indonesia Ihsanul Fikri; Admi Salma; Dodi Vionanda; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/76

Abstract

Inflation is one of the main macroeconomic problems and a very important economic indicator. Unstable inflation has a negative impact on people's welfare, so controlling inflation is important for a country. Forecasting is needed to monitor future movements in the inflation rate. In this study, the Fuzzy Time Series Markov Chain and Fuzzy Time Series Cheng methods are compared for forecasting inflation. An advantage of fuzzy time series methods is that they have no special assumptions that must be met. The purpose of this study is to determine the forecasting results based on a comparison of the two methods. Based on the MAPE value, Fuzzy Time Series Markov Chain has the smaller value, at 6.97%. The inflation forecasts for the next 5 periods using the Fuzzy Time Series Markov Chain method are 5.42, 5.71, 5.95, 5.82, and 6.10.
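The MAPE criterion used to compare the two methods is the mean of the absolute percentage errors, $\text{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$. A minimal sketch of that calculation follows; the function name `mape` and the sample numbers are invented for illustration and are not the inflation data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Invented example: each forecast is off by 10% of the actual value.
print(mape([100, 200], [110, 180]))
```

A lower MAPE means the forecasts are, on average, closer to the actual values in relative terms, which is why the method with the smaller MAPE is preferred.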
Step Function Intervention Analysis Model to Estimate Number of Aircraft Passengers in Minangkabau International Airport Velya Rahma Putri; Zilrahmi; Syafriandi Syafriandi; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 1 No. 4 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss4/77

Abstract

The Covid-19 pandemic had a considerable impact on air transportation. Minangkabau International Airport (BIM) also felt this impact in the form of a drastic decrease in the number of airplane passengers, which constitutes an intervention event. A stable number of airplane passengers is needed to indicate a stable economy in the transportation sector. If there are no passengers or flight activity in an area, there is no inflow or outflow of the economic, industrial, tourism, and trade activities that support economic development. For this reason, forecasting is needed so that the problems arising from the drastic decline can be addressed with new policies. Forecasting was carried out in this study to obtain an intervention model that is used to forecast the next 12 months and to predict how long the effect of the intervention will last, in order to avoid further losses from a continued decline in the number of passengers. The intervention model is considered better than SARIMA models for data that contain an intervention variable. The forecasting results showed that the intervention model based on SARIMA (0,1,1)(1,1,1)12 with orders b = 0, s = 8, r = 1 is the best model for forecasting data containing interventions. This is evidenced by a MAPE of 36.34%, so the model is considered feasible to use because its accuracy is quite high and its forecasts are close to the actual values.
Naive Bayes Classifier Method on Sentiment Analysis of Bibit Application Users in Play Store Afifa Lufti Insani; Zamahsary Martha; Yenni Kurniawati; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/102

Abstract

The Bibit app is one of the most widely used investment apps today. It is popular among novice investors because of the convenience of opening accounts, disbursing funds, and purchasing mutual funds, and because of its easy-to-understand design. However, many people still doubt the quality of the Bibit application because they lack an understanding of its advantages and disadvantages. Review data available in the Play Store are therefore used to learn what users think of the application and to give prospective users something to consider before using it. Because the reviews are numerous and can be positive or negative, sentiment analysis is needed to classify them quickly. Classification is then carried out with the Naive Bayes Classifier method to obtain a model that can be used to predict user sentiment. The results show that Bibit application users tend to have positive sentiments, with an accuracy of 79.45%.
Fuzzy Geographically Weighted Clustering Analysis for Sectoral Potential Gross Regional Domestic Product in West Sumatera Syifa Nabilah Wandira; Zilrahmi; Syafriandi Syafriandi; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/109

Abstract

Gross Regional Domestic Product (GRDP) is the sum of the added value of all goods and services produced in an area as a result of various economic activities in a certain period. Each region has its own advantages and potential, for example in particular sectors or business fields. GRDP inequality occurs because of differences in geographical conditions and natural resources between regions. One method that can be used to examine this inequality is cluster analysis. Cluster analysis groups data objects that have the same characteristics, so that the inequality can be seen from the clusters formed. Fuzzy Geographically Weighted Clustering is a clustering method based on fuzzy logic that gives a geographic effect to each cluster so that it can better describe the actual cluster situation. The research obtained 3 optimum clusters with different characteristics. Cluster 1 has high potential, cluster 2 has low potential, and cluster 3 has medium potential in forming GRDP.
Categorical Data Clustering with K-Modes Method on Fire Cases in DKI Jakarta Province Widia Handa Riska; Dony Permana; Atus Amadi Putra; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/115

Abstract

In DKI Jakarta Province, the number of fires rises and falls from year to year. For this reason, efforts need to be made to prevent fires and reduce fire risk. BPBD DKI Jakarta is responsible for this matter. However, for these efforts to be effective, information is needed about the fire patterns that occur most frequently. Fire patterns can be identified using K-Modes categorical clustering analysis. The data used are fire data for DKI Jakarta in 2018. The optimal number of clusters is 6, based on the smallest Davies-Bouldin Index (DBI) value of 6.22. Of the six clusters, cluster 3 has the highest number of fire cases. The centroid of cluster 3 describes fire cases that occurred on a Friday, in November, in Cakung District, caused by an electrical short circuit, burning down residential houses, and rarely causing minor injuries, serious injuries, or deaths.
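The idea behind K-Modes can be sketched briefly: distances count mismatching categories, and each cluster centre is the per-attribute mode rather than a mean. The code below is a minimal illustration only, with hypothetical function names (`mismatch`, `column_modes`, `k_modes`), caller-supplied initial modes, and invented records that merely resemble the kind of attributes mentioned in the abstract above.

```python
from collections import Counter


def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each attribute (first seen wins ties)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(rows, modes, max_iter=20):
    """Alternate assignment and mode update until the modes stop changing."""
    modes = list(modes)
    assign = []
    for _ in range(max_iter):
        assign = [min(range(len(modes)), key=lambda c: mismatch(r, modes[c])) for r in rows]
        new_modes = []
        for c in range(len(modes)):
            members = [r for r, a in zip(rows, assign) if a == c]
            # Keep the old mode if a cluster ends up empty.
            new_modes.append(column_modes(members) if members else modes[c])
        if new_modes == modes:
            break
        modes = new_modes
    return assign, modes


# Invented records: (day, cause, object burned).
records = [
    ("Fri", "short circuit", "house"),
    ("Fri", "short circuit", "house"),
    ("Sat", "short circuit", "house"),
    ("Mon", "stove", "shop"),
    ("Mon", "stove", "shop"),
    ("Tue", "stove", "shop"),
]
assignment, centres = k_modes(records, modes=[records[0], records[3]])
print(assignment)
print(centres)
```

Each resulting centre reads as a "typical case" for its cluster, which is how the abstract describes the centroid of cluster 3.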
Implementation of Backpropagation Artificial Neural Network on Forecasting Export of Palm Oil in Indonesia Adinda Dwi Putri; Dina Fitria; Nonong Amalita; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 5 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss5/123

Abstract

Export activities are one of the largest sources of revenue in Indonesia, and palm oil is the largest contributor to exports. An increase in the volume of palm oil exports can spur economic growth in Indonesia. In this research, palm oil exports from Indonesia are forecast by main destination country using the Artificial Neural Network (ANN) method with the Backpropagation algorithm. The data used are palm oil export data for 2012-2022 obtained from the Central Statistics Agency (BPS) website. For these data, the optimal architecture is 10-1-3-3-1 with a MAPE of 9.68%; that is, the network uses 10 inputs, 3 hidden layers with 1, 3, and 3 neurons respectively, and 1 output. From this study, it is estimated that 90% of the palm oil export forecasts produced with the ANN method are close to the actual values.
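To show what a 10-1-3-3-1 architecture means in practice, the sketch below builds a network with those layer sizes and runs one forward pass. It is an illustration under assumptions, not the authors' model: the sigmoid hidden activation, linear output, random initial weights, and the names `init_params`, `forward`, and `count_params` are all chosen for this example, and training by backpropagation is omitted.

```python
import math
import random

LAYERS = [10, 1, 3, 3, 1]  # 10 inputs, hidden layers of 1, 3, 3 neurons, 1 output


def init_params(layers, seed=0):
    """One (weights, biases) pair per layer transition, with random weights."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)], [0.0] * n_out)
        for n_in, n_out in zip(layers, layers[1:])
    ]


def forward(params, x):
    """Propagate one input vector through the network."""
    for depth, (weights, biases) in enumerate(params):
        z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_output = depth == len(params) - 1
        # Assumed activations: sigmoid in hidden layers, identity at the output.
        x = z if is_output else [1 / (1 + math.exp(-v)) for v in z]
    return x


def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


params = init_params(LAYERS)
print(count_params(params))          # weights plus biases across all layers
print(forward(params, [0.5] * 10))   # a single forecast value in a list
```

With these layer sizes the network has 25 weights and 8 biases, so it is a very small model; the first hidden layer with a single neuron compresses all 10 inputs into one value before the later layers expand it again.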