Contact Name
Tessy Octavia Mukhti
Contact Email
tessyoctaviam@fmipa.unp.ac.id
Phone
+6282283838641
Journal Mail Official
tessyoctaviam@fmipa.unp.ac.id
Editorial Address
LPPM Universitas Negeri Padang, Jalan Prof. Dr. Hamka, Air Tawar Barat, Kota Padang, Sumatera Barat 25131
Location
Kota Padang,
Sumatera Barat
INDONESIA
UNP Journal of Statistics and Data Science
ISSN: -     EISSN: 2985-475X     DOI: 10.24036/ujsds
UNP Journal of Statistics and Data Science (UJSDS) is an open access journal (e-journal) launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects of Statistics, Data Science, and their applications. Articles can be research results, case studies, or literature reviews. All papers are reviewed by peer reviewers consisting of experts and academicians from across universities.
Articles 18 Documents
Issue "Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science": 18 Documents
The SMOTE Application of CART Methods for Coping Imbalanced Data in Classifying Status Work on Labor Force in the City of Padang Andini Yulianti; Fadhilah Fitri; Nonong Amalita; Dodi Vionanda
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/12

Abstract

Employment is a main concern in every country, especially in developing countries such as Indonesia. The employment problems Indonesia faces are a lack of job opportunities, excess labor, and an uneven distribution of labor. Because the labor force grows faster than the number of available jobs, many workers cannot find work, which causes unemployment. The city of Padang had the highest unemployment rate in West Sumatra from 2013 to 2021. Developing a smart city and identifying the factors that influence unemployment are among the efforts to reduce it. This study uses the CART method to determine the factors that affect the working status of the labor force in the city of Padang. The advantage of CART is that its results are easy to interpret, but the accuracy of the classification tree is low when the data are imbalanced. Therefore, this study applies the SMOTE method to overcome this problem. The optimal classification tree has 8 terminal nodes and involves 4 explanatory variables: marital status (X3), education level (X4), gender (X2), and age (X1). Of the terminal nodes, 5 classify the labor force into the working category and 3 into the unemployed category.
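As a rough illustration of the oversampling idea behind SMOTE, the sketch below creates one synthetic minority-class point by interpolating between a minority sample and one of its nearest minority neighbors. The function name `smote_sample`, the parameter `k`, and the toy data are hypothetical, and the sketch assumes numeric features; how the study encoded its categorical variables is not stated in the abstract.

```python
import random

def smote_sample(minority, k=2, rng=random):
    """Return one synthetic point between a minority sample and a near neighbor.

    Assumes `minority` is a list of numeric tuples with at least two entries.
    """
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to `base`.
    others = [p for p in minority if p is not base]
    neighbors = sorted(
        others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p))
    )[:k]
    neighbor = rng.choice(neighbors)
    gap = rng.random()  # interpolation weight in [0, 1)
    # The synthetic point lies on the segment from `base` toward `neighbor`.
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbor))

# Toy minority class (illustrative values, not data from the study).
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(smote_sample(minority_points))
```

Repeating the call until the classes are balanced, and then fitting the tree on the enlarged training set, is one common way to combine oversampling with a classifier such as CART.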
Self Organizing Maps Method for Grouping Provinces in Indonesia Based on the Landslide Impact Suwanda Risky; Syafriandi; Dony Permana; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/15

Abstract

Indonesia is a disaster-prone country because of its climatic, soil, hydrological, geological, and geomorphological conditions. A disaster is an event or chain of events that threatens and disrupts people's lives and livelihoods. A natural disaster is a disaster caused by a natural event or series of events, such as a landslide. The number of landslides in Indonesia varies from province to province because the provinces differ in their characteristics, so the impact of landslides also differs. Grouping and profiling are therefore needed to identify which provinces suffer the largest impact from landslides. This study uses the Self Organizing Maps method for the grouping. The number of clusters formed is 3, based on the optimal values of internal cluster validation (Dunn, Connectivity, and Silhouette Index). Cluster 1 consists of 31 provinces, where the average impact of landslides is small. Cluster 2 consists of 2 provinces, where 4 impacts are dominantly larger. Cluster 3 consists of 1 province, where 1 impact is dominantly larger. It can be concluded that most provinces in Indonesia experience a relatively small impact from landslides. However, some provinces experience a very large impact, namely West Java, Central Java, and East Java.
Comparison of Haversine and Euclidean Distance Formula for Calculating Distance Between Regencies in West Sumatra Vinka Haura Nabilla; Indonesia; Dony Permana; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/39

Abstract

A distance is a number that indicates how far apart two places are. Distance is widely used in research, for example in constructing spatial weighting matrices. A spatial weight matrix is built from proximity information between regions. There are two types of spatial weights: contiguity-based and distance-based. For West Sumatra, distance-based spatial weighting is the better choice because islands and mountains separate the regions. Two distance formulas that can be used are the Haversine and Euclidean distances. The difference between them is that the Haversine formula takes the earth's curvature into account when connecting two points, whereas the Euclidean distance connects them with a straight line. The purpose of this research is to determine whether the Haversine and Euclidean formulas produce significantly different distances. The distances were calculated from latitude and longitude coordinates obtained from Google Maps. The distances from both formulas were expressed in kilometers (km) and then compared using the z test. The findings showed that the Haversine and Euclidean distance formulas did not differ significantly.
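The two formulas compared in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the Earth radius, the kilometers-per-degree factor used to scale the flat Euclidean distance, and the example coordinates are all assumptions, since the abstract does not say how the Euclidean result was converted to kilometers.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def euclidean_km(lat1, lon1, lat2, lon2, km_per_degree=111.32):
    """Straight-line distance on a flat lat/lon grid, scaled by an assumed factor."""
    return km_per_degree * math.hypot(lat2 - lat1, lon2 - lon1)

# Approximate, illustrative coordinates (not taken from the study).
padang = (-0.95, 100.35)
bukittinggi = (-0.31, 100.37)
print(haversine_km(*padang, *bukittinggi))
print(euclidean_km(*padang, *bukittinggi))
```

Near the equator one degree of longitude is almost as long as one degree of latitude, so a flat-grid approximation stays close to the great-circle distance over short ranges. That is consistent with the abstract's finding for West Sumatra, though it does not replace the study's z test.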
Vector Error Correction Model for Cointegration Analysis of Factors Affecting Indonesia's Economic Growth during the Pandemic Period Rizqa Fajriaty Fitri MY; Dina Fitria; Syafriandi Syafriandi; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/40

Abstract

Stable economic growth is the ultimate goal of monetary policy, as seen from the stability of the rupiah. Economic conditions declined because of the spread of Covid-19. In an effort to stabilize the economy, the relationships between the factors supporting Indonesia's economic growth are analyzed using the VECM approach, which can determine both the long-term and short-term relationships in time series data. After passing several tests, the model yields three significant equations. The model shows a short-term effect of the inflation and BI-rate variables on inflation, as well as an inverse effect of the BI-rate one period earlier on the exchange rate. The cointegration coefficient is negative, which indicates a short-term to long-term adjustment mechanism in the inflation variable. The two long-term cointegration equations show that inflation can be positively influenced by the visa variable, and that the BI-rate is influenced by the exchange rate and visa variables. The VECM model can explain more than 50% of the variables.
Sentiment Analysis of Goride Services on Twitter Social Media Using Naive Bayes Algorithm Puti Utari Maharani; Nonong Amalita; Atus Amadi Putra; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/41

Abstract

Online motorcycle taxis are an application-based innovation in transportation technology. They offer relatively low prices and discount features. However, their presence creates congestion problems and conflicts with conventional transport. One such online motorcycle taxi service is GoRide, a feature of the Gojek application. The emergence of GoRide has prompted members of the public to voice their opinions openly on social media, including Twitter. These assessments take the form of textual opinions that can be analyzed. Sentiment analysis is used to detect opinions in the form of a person's judgment, evaluation, attitude, and emotion. The text classification algorithm used in this study is Naive Bayes. This research aims to determine public sentiment toward GoRide's service in positive and negative categories, and to determine the accuracy of the Naive Bayes algorithm on GoRide service data. The research data were obtained using the API provided by Twitter developers. The analysis consists of text preprocessing, data labelling, word weighting, classification, and evaluation of classification performance. The classification produced 698 positive-sentiment records and 517 negative-sentiment records. The performance evaluation of the Naive Bayes algorithm gave an accuracy of 77.78%. Overall, GoRide can be categorized as a good service.
Prediksi Harga Saham PT Bank Syariah Indonesia Tbk Menggunakan Support Vector Regression [Stock Price Prediction for PT Bank Syariah Indonesia Tbk Using Support Vector Regression] Isra Miraltamirus; Fadhilah Fitri; Dodi Vionanda; Dony Permana
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/43

Abstract

A company needs outside funding so that all of its development needs can be met. Companies that need capital can make public offerings and sell securities on a stock exchange. Stock prices tend to fluctuate, which affects the income received by companies and investors. PT BSI Tbk currently faces this problem, so stock price modeling is needed to predict the company's stock price in the coming days. Support vector regression (SVR) is a machine learning method that can handle fluctuating data and produce good predictive models. SVR aims to find the optimal hyperplane. It uses a kernel function to handle non-linear data by mapping the data from the input space to a higher-dimensional feature space, where an optimal hyperplane is easier to form. The kernel function used in this study is the radial basis function. The best parameters obtained are C = 100, ϵ = 0.01, and γ = 0.001, which produce a model error of 0.87%.
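To make the roles of γ and ϵ concrete, the sketch below writes out the radial basis function kernel and the ϵ-insensitive loss that SVR is built on, using the parameter values reported in the abstract as defaults. The function names are hypothetical, and this is not a full SVR solver; it only shows what the two parameters control.

```python
import math

def rbf_kernel(x, z, gamma=0.001):
    """RBF kernel exp(-gamma * ||x - z||^2); gamma default from the abstract."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.01):
    """Errors smaller than epsilon cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Identical points have similarity 1; similarity decays with distance.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
print(rbf_kernel([0.0], [10.0]))
# A prediction inside the epsilon tube is not penalized.
print(epsilon_insensitive_loss(1.0, 1.005))
```

A small γ such as 0.001 makes the kernel decay slowly, so distant observations still influence each prediction, while C (not shown here) sets how strongly errors outside the ϵ tube are penalized.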
Comparison of Distance Function in K-Nearest Neighbor Algorithm to Predict Prospective Customers in Term Deposit Subscriptions Muhammad Tibri Syofyan; Nonong Amalita; Dodi Vionanda; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/47

Abstract

Data mining is often used to analyze big data to obtain new, useful information for future use. One of the best algorithms in data mining is K-Nearest Neighbor (K-NN). The K-NN classifier is a distance-based classification algorithm. The distance function is a core component that measures the distance or similarity between the test data and the training data. Many distance functions exist, so determining which one gives the best K-NN classifier performance is a recurring problem in the literature. This study compares distance functions to find which produces the best K-NN performance. The distance functions compared are the Manhattan distance and the Minkowski distance. The K-NN classifier is applied to a bank dataset to predict prospective customers for term deposit subscriptions. The study shows that the Minkowski distance achieved better results than the Manhattan distance. The Minkowski distance with power p = 1.5 produces an accuracy of 88.40% when K is 7. Thus, the K-NN algorithm using the Minkowski distance (p = 1.5, K = 7) is the best of the compared configurations for predicting prospective customers for term deposit subscriptions.
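The distance functions compared here can be sketched in a few lines. The Manhattan distance is the Minkowski distance with p = 1, so a single function covers both. The helper names and the toy data below are hypothetical and are not the bank dataset used in the study.

```python
from collections import Counter

def minkowski(a, b, p):
    """Minkowski distance of order p; p = 1 gives the Manhattan distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def knn_predict(train_X, train_y, query, k, p):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(
        zip(train_X, train_y), key=lambda pair: minkowski(pair[0], query, p)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data (illustrative only).
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["no", "no", "no", "yes", "yes", "yes"]
print(knn_predict(X, y, (0.5, 0.5), k=3, p=1.5))
print(minkowski((0, 0), (3, 4), 1), minkowski((0, 0), (3, 4), 1.5))
```

For the same pair of points the distance shrinks as p grows, which changes the neighbor ranking and is why the choice of p can affect K-NN accuracy.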
Rainfall Forecasting in Medan City Using Singular Spectrum Analysis (SSA) Silvia Agustina; Fadhilah Fitri; Dodi Vionanda; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/52

Abstract

Singular spectrum analysis (SSA) is a time series method that can be used for data with seasonal effects. Rainfall is one example of such data. High rainfall contributes to natural disasters such as floods. Medan, the capital of North Sumatra province, has fairly high rainfall and lies in a lowland area, so it is prone to flooding. Rainfall forecasting can support disaster mitigation. The forecasting method used is SSA. The resulting MAPE is 15.5% and the tracking signal is within tolerance limits, so it can be concluded that the forecast performs well.
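The two accuracy measures named in this abstract can be computed as in the sketch below, using their common textbook definitions. The function names and the toy numbers are hypothetical, and the tolerance limits the study applied to the tracking signal are not given in the abstract.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero; a zero would cause division by zero.
    """
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(ratios)

def tracking_signal(actual, forecast):
    """Cumulative forecast error divided by the mean absolute deviation."""
    errors = [a - f for a, f in zip(actual, forecast)]
    mad = sum(abs(e) for e in errors) / len(errors)
    return sum(errors) / mad

# Toy series (illustrative values, not rainfall data from the study).
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mape(observed, predicted))
print(tracking_signal(observed, predicted))
```

A tracking signal that stays near zero suggests the forecast errors are not systematically biased in one direction, which complements MAPE as a measure of error size.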
Classification for Covid-19 Affected Family Cash Aid Recipients Using Naïve Bayes Algorithm Mutiara Amazona Sosiawati; Syafriandi Syafriandi; Dony Permana; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/53

Abstract

The COVID-19 pandemic in Indonesia had a huge impact on the country's economy. One of the government's responses is to use APBD funds for social assistance in the form of cash, namely "Village Direct Cash Assistance" (BLT DD), in the hope that it helps the people affected by COVID-19. There are several problems in the distribution of social assistance, one of which is assistance that does not reach the intended recipients. Therefore, methods are needed to classify recipients correctly. This study uses the Naïve Bayes method to classify people who receive and do not receive aid. According to the confusion matrix, 33 people/families received BLT DD and were predicted to receive it, 34 did not receive BLT DD and were predicted not to receive it, 2 received BLT DD but were predicted not to receive it, and 6 did not receive BLT DD but were predicted to receive it. The classification accuracy of the Naïve Bayes method is 89%, and the error rate is 11%.
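The reported accuracy and error rate follow directly from the four confusion matrix counts in the abstract, as the short check below shows. Treating "received BLT DD" as the positive class is an assumption made here for naming the cells; the variable names are hypothetical.

```python
# Confusion matrix counts from the abstract ("received" taken as positive).
tp = 33  # received aid, predicted to receive
tn = 34  # did not receive aid, predicted not to receive
fn = 2   # received aid, predicted not to receive
fp = 6   # did not receive aid, predicted to receive

total = tp + tn + fn + fp
accuracy = (tp + tn) / total   # share of correct predictions
error_rate = 1 - accuracy      # share of wrong predictions

print(f"n = {total}")
print(f"accuracy   = {accuracy:.0%}")
print(f"error rate = {error_rate:.0%}")
```

With 67 of 75 cases classified correctly, the rounded figures agree with the 89% accuracy and 11% error rate stated in the abstract.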
Modeling Human Development Index in Papua and West Sumatera with Multivariate Adaptive Regression Spline Yulia Pertiwi; Dony Permana; Nonong Amalita; Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol1-iss3/54

Abstract

The Human Development Index (HDI) is an indicator of how successfully the quality of human life has been developed. A higher HDI value indicates better development of a region. The purpose of this study is to model and determine the factors that affect HDI in Papua Province and West Sumatera Province using Multivariate Adaptive Regression Spline (MARS). MARS is a modeling method that can handle high-dimensional data. The results show that the best MARS model for Papua Province is the combination BF=24, MI=2, and MO=0, with a minimum GCV value of 0.55953, while the best MARS model for West Sumatera Province is the combination BF=24, MI=2, and MO=0, with a minimum GCV value of 0.02697. Based on the models, the factors that significantly affect HDI in both provinces are average years of schooling (X2), adjusted per-capita income (X6), life expectancy (X1), percentage of poor people (X4), and gross regional domestic product (X3). The importance levels of these variables for Papua Province are 100%, 45.26%, 29.24%, 6.55%, and 6.27%, respectively. For West Sumatera Province they are 100%, 96.73%, 57.54%, 34.13%, and 29.6%, respectively. Based on these results, average years of schooling (X2) is the variable that most influences HDI in the two regions, with an importance level of 100%.
