Articles
Modeling Open Unemployment Rate in West Sumatera Province Using Truncated Spline Regression
Aprilla Suhada;
Syafriandi;
Dodi Vionanda;
Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 1 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss1/3
The Open Unemployment Rate (TPT) is an indicator of unemployment in the labor force, expressed as the percentage of job seekers relative to the total workforce. In 2020, West Sumatra Province ranked eighth among the largest contributors to unemployment in Indonesia, which poses a problem for the provincial government. To address this problem, the factors thought to affect the open unemployment rate in West Sumatra Province are analyzed using truncated spline regression, chosen because the relationship between the response variable and each predictor variable does not follow any particular pattern. The factors considered are population, labor force participation rate, average length of schooling, and dependency ratio. Based on the analysis, the best model for the open unemployment rate in West Sumatra Province is the truncated spline regression with three knot points, which has a GCV value of 0.061762. Population, labor force participation rate, average length of schooling, and dependency ratio all have a significant effect, and the coefficient of determination is 99.97%.
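As a rough illustration of the technique named in this abstract, the sketch below builds a truncated power basis for a single predictor and computes a GCV score. It assumes a linear spline (degree 1), an ordinary least-squares fit (so the trace of the hat matrix equals the number of fitted parameters), and hypothetical knot values; none of these details are taken from the paper.

```python
def truncated_basis(x, knots, degree=1):
    """Return one design-matrix row [1, x, ..., (x - k1)_+^d, (x - k2)_+^d, ...].

    Intended for degree >= 1. The knot values are supplied by the caller.
    """
    row = [x ** p for p in range(degree + 1)]
    # Each truncated term is zero to the left of its knot.
    row += [max(0.0, x - k) ** degree for k in knots]
    return row


def gcv(y, y_hat, n_params):
    """Generalized cross-validation score: MSE / (1 - tr(H)/n)^2.

    Assumes an OLS fit, for which tr(H) equals the number of parameters.
    """
    n = len(y)
    mse = sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n
    return mse / (1.0 - n_params / n) ** 2


# Hypothetical knots at 2, 4 and 6 for a predictor value of 5.
print(truncated_basis(5.0, [2.0, 4.0, 6.0]))
```

In a knot search, one would typically fit the model for each candidate knot set and keep the set with the smallest GCV score.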
Forecasting Shallot Prices in West Sumatra Province Using the Fuzzy Time Series Method of the Singh Model and the Cheng Model
Huriati Khaira;
Fadhilah Fitri;
Nonong Amalita;
Dony Permana
UNP Journal of Statistics and Data Science Vol. 1 No. 1 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss1/7
Shallots are one of the leading spices, widely used as a food seasoning and in traditional medicine. Shallot prices fluctuate constantly, which affects buying and selling by consumers and producers. Forecasting therefore serves as a reference for predicting future shallot prices and informs the public about price conditions in the next period. The forecasting method used is fuzzy time series (FTS), which forecasts from fuzzy sets defined over a universe of discourse built from the actual real-valued data. The forecasting models used in this study are Singh's FTS model and Cheng's FTS model. The data are monthly shallot prices in West Sumatra Province from January 2018 to March 2022. The results show that Singh's FTS model has the smaller MAPE, 4.41%, which corresponds to a forecasting accuracy of 95.59%. Singh's FTS model is therefore the better of the two at predicting shallot prices in West Sumatra Province.
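The accuracy figure in this abstract is consistent with reading accuracy as 100% minus MAPE (100 - 4.41 = 95.59). A minimal sketch of that calculation follows; the price values in the usage line are hypothetical and are not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def accuracy_from_mape(mape_percent):
    """Forecast accuracy read as 100% minus MAPE (an assumed convention)."""
    return 100.0 - mape_percent


# Hypothetical prices, used only to show the call pattern.
error = mape([100.0, 200.0], [110.0, 190.0])
print(error, accuracy_from_mape(error))
```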
Multivariate Adaptive Regression Spline Method for Study Timeliness of the 2017 FMIPA UNP Student
Rahmadani Iswat;
Fadhilah Fitri;
Atus Amadi Putra;
Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss2/23
Punctuality of study refers to completing an education within the expected period, which for undergraduate students is 4 years. One determinant of the quality of higher education is students' ability to complete their education on time. The purpose of this study is to find the best model, and its accuracy, for the study punctuality of class of 2017 FMIPA UNP undergraduate students using MARS. MARS is a multivariate nonparametric regression method that relates a response variable to predictor variables. The research is applied research. The predictor variables are Grade Point Average (GPA), gender, university entrance, major, school origin status, and place of origin; the response variable is punctuality of study. Trial and error showed that the best model was obtained from the combination BF = 18, MI = 3, and MO = 2, with a minimum GCV value of 0.23182 and an R² value of 0.10045. According to the model, the factors that significantly affect study punctuality for class of 2017 FMIPA UNP students are X4 (major) with an importance level of 100%, X1 (GPA) with an importance level of 96.61%, X3 (university entrance), and X5 (school origin status) with an importance level of 16.78%. The classification accuracy for graduating on time versus not on time is 64%, with a classification error rate of 36%.
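MARS models are built from hinge (truncated linear) basis functions and products of them. The sketch below shows only that building block, with hypothetical knots and variable names; it does not reproduce the forward and backward selection that chooses the basis functions, nor the model reported in the abstract.

```python
def hinge(x, knot, direction=1):
    """Hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def interaction(values, terms):
    """Product of hinge functions, one per (variable name, knot, direction)."""
    product = 1.0
    for name, knot, direction in terms:
        product *= hinge(values[name], knot, direction)
    return product


# Hypothetical two-way interaction between predictors named x1 and x4.
row = {"x1": 3.5, "x4": 2.0}
print(interaction(row, [("x1", 3.0, 1), ("x4", 3.0, -1)]))
```

The number of factors in such a product is what a maximum-interaction setting would limit.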
Comparison of Naive Bayes Method and Binary Logistics Regression on Classification of Social Assistance Recipients Program Keluarga Harapan (PKH)
Fanni Rahma Sari;
Fadhilah Fitri;
Atus Amadi Putra;
Dony Permana
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss2/24
Population density is one of the causes of economic inequality in society. One solution offered by the government is to distribute social assistance. In 2007, the government created a social assistance program called the “Program Keluarga Harapan” (PKH) with the aim of alleviating poverty. The distribution of social assistance has several problems, one of which is aid that does not reach its intended target. An appropriate method is therefore needed to classify recipients of social assistance correctly. This study compares two methods, Naive Bayes and Binary Logistic Regression, to determine which performs better on the data used. The data are the DTKS data for PKH assistance recipients in Anduring Village in 2020. The accuracy of the Naive Bayes method is 70% and that of Binary Logistic Regression is 73%, so Binary Logistic Regression is the better classifier for these data.
Nonparametric Regression Modeling with Fourier Series Approach on Poverty Cases in West Sumatra Province
Melin Wanike Ketrin;
Fadhilah Fitri;
Atus Amadi Putra;
Zilrahmi
UNP Journal of Statistics and Data Science Vol. 1 No. 2 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss2/32
Poverty is a complex problem that affects various social issues such as education, unemployment, health, and economic growth. Overcoming poverty is therefore important for the welfare of the population. One analysis that can be used to model the percentage of poverty is regression analysis, which is divided into two approaches: parametric and nonparametric. Parametric regression relies on several assumptions, whereas nonparametric regression assumes only that the shape of the regression curve does not follow a particular pattern. There are several approaches to nonparametric regression, one of which is the Fourier series. The purpose of this study is to model the percentage of poverty in West Sumatra Province. The unclear shape of the curve in the data motivates the use of nonparametric regression. In addition, the data are regional data, which tend to fluctuate, so the Fourier series approach is suitable. Nonparametric regression models with one, two, and three oscillation parameters were fitted. The best model has two oscillation parameters, with a Generalized Cross Validation (GCV) value of 2.110 and an R² of 92.44%.
The SMOTE Application of CART Methods for Coping Imbalanced Data in Classifying Status Work on Labor Force in the City of Padang
Andini Yulianti;
Fadhilah Fitri;
Nonong Amalita;
Dodi Vionanda
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss3/12
Employment is one of the main concerns in every country, especially in developing countries such as Indonesia. The employment problems Indonesia faces are a lack of job opportunities, excess labor, and an uneven distribution of labor. The labor force grows faster than the number of job opportunities, so many workers cannot find jobs, which causes unemployment. The city of Padang had the highest unemployment rate in West Sumatra from 2013 to 2021. Developing a smart city and identifying the factors that influence unemployment are among the efforts to reduce unemployment. This study uses the CART method to determine the factors that affect the labor force in the city of Padang. The advantage of the CART method is that its results are easy to interpret, but the accuracy of the classification tree is low because the data are imbalanced. This study therefore uses the SMOTE method to overcome that problem. The optimal classification tree has 8 terminal nodes and involves 4 explanatory variables: marital status (X3), education level (X4), gender (X2), and age (X1). Of the terminal nodes, 5 classify the labor force into the working category and 3 classify it into the unemployed category.
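The core step of SMOTE is to create a synthetic minority-class point by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below illustrates that step under stated assumptions: features are numeric tuples, distance is Euclidean, and the data points, `k`, and `seed` are hypothetical. Categorical predictors such as gender or marital status would need a variant of the method (for example SMOTE-NC), which is not shown here.

```python
import math
import random


def k_nearest(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda q: math.dist(point, q))[:k]


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbor = rng.choice(k_nearest(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class points with two numeric features.
minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0), (2.5, 2.5)]
print(smote(minority, n_new=3, k=2))
```

Because each synthetic point lies on a segment between two existing minority points, the oversampled class stays inside the region the minority data already occupy.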
Comparison of Haversine and Euclidean Distance Formula for Calculating Distance Between Regencies in West Sumatra
Vinka Haura Nabilla;
Indonesia;
Dony Permana;
Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss3/39
A distance is a number that indicates how far apart two places are. Distance is widely used in research, for example in constructing spatial weighting matrices. A spatial weight matrix is obtained from proximity information between regions. There are two types of spatial weights: those based on contiguity and those based on distance. For regions in West Sumatra, distance-based spatial weighting is preferable because islands and mountains separate the regions. Two distance formulas that can be used are the Haversine formula and the Euclidean distance. The difference between them is that the Haversine formula takes the earth's curvature into account when connecting two points, whereas the Euclidean distance connects them with a straight line. The purpose of this research is to determine whether the Haversine and Euclidean distance formulas produce significantly different distances. The distances are calculated from latitude and longitude coordinates obtained from Google Maps. The distances from both formulas were expressed in kilometers (km) and then compared using the z test. The findings show that the distances calculated with the Haversine formula and the Euclidean distance formula do not differ significantly.
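The two formulas compared in this abstract can be sketched as follows. The Haversine function is the standard great-circle formula; the Euclidean function is one plausible reading, in which the straight-line distance in degrees is converted to kilometers with a fixed factor. The Earth radius of 6371 km, the conversion factor, and the example coordinates are assumptions and are not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def euclidean_km(lat1, lon1, lat2, lon2):
    """Straight-line distance in degrees, scaled to km (assumed conversion)."""
    km_per_degree = EARTH_RADIUS_KM * math.pi / 180
    return km_per_degree * math.hypot(lat2 - lat1, lon2 - lon1)


# Two hypothetical points close to the equator.
print(haversine_km(-1.0, 100.0, 0.0, 101.0))
print(euclidean_km(-1.0, 100.0, 0.0, 101.0))
```

Close to the equator the cosine-of-latitude factor in the Haversine formula is nearly 1, so under these assumptions the two results are expected to be close for short distances.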
Sentiment Analysis of GoRide Services on Twitter Social Media Using Naive Bayes Algorithm
Puti Utari Maharani;
Nonong Amalita;
Atus Amadi Putra;
Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss3/41
The online motorcycle taxi is an application-based innovation in transportation technology. Online motorcycle taxis offer relatively low prices and discount features. However, their presence creates congestion problems and conflicts with conventional transport. One such online motorcycle taxi service is GoRide, a feature of the Gojek application. The emergence of GoRide has prompted the public to voice opinions openly through social media, including Twitter. These assessments take the form of textual opinions that can be analyzed. Sentiment analysis is used to detect opinions in the form of a person's judgment, evaluation, attitude, and emotion. The text classification algorithm used in this study is Naive Bayes. This research aims to determine public sentiment toward GoRide's service as an online motorcycle taxi, in positive and negative categories, and to determine the accuracy of the Naive Bayes algorithm on this task. The research data were obtained using the API provided by Twitter for developers. The analysis consists of text preprocessing, data labelling, word weighting, classification, and performance evaluation of the classification. The classification yields 698 items with positive sentiment and 517 items with negative sentiment. The performance evaluation of the Naive Bayes algorithm gives an accuracy of 77.78%. Overall, GoRide can be categorized as a good service.
Stock Price Prediction for PT Bank Syariah Indonesia Tbk Using Support Vector Regression
Isra Miraltamirus;
Fadhilah Fitri;
Dodi Vionanda;
Dony Permana
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss3/43
A company needs external funding so that all the development it requires can be carried out. Companies that need capital can make public offerings and sell securities on a stock exchange. Stock prices tend to fluctuate, which affects the income received by companies and investors. PT BSI Tbk currently faces this problem, so stock price modeling is needed to predict the company's stock price in the coming days. Support vector regression (SVR) is a machine learning method that can handle fluctuating data and produce good predictive models. SVR aims to find the optimal hyperplane for prediction. It uses a kernel function to handle non-linear data by mapping the data from the input space to a higher-dimensional feature space, where an optimal hyperplane is easier to form. The kernel function used in this study is the radial basis function. The best parameters obtained are C = 100, ϵ = 0.01, and γ = 0.001, which give a model error of 0.87%.
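Two ingredients mentioned in this abstract can be written out directly: the radial basis function kernel, which depends on γ, and the ϵ-insensitive loss that SVR minimizes, which ignores errors smaller than ϵ. The sketch below uses the reported γ = 0.001 and ϵ = 0.01 in its usage lines; the input vectors and target values are hypothetical, and no SVR model is actually trained here.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical inputs, evaluated with the gamma and epsilon from the abstract.
print(rbf_kernel([0.0], [10.0], gamma=0.001))
print(epsilon_insensitive_loss(1.0, 1.005, epsilon=0.01))
```

A small γ such as 0.001 makes the kernel decay slowly with distance, so distant observations still influence one another.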
Rainfall Forecasting in Medan City Using Singular Spectrum Analysis (SSA)
Silvia Agustina;
Fadhilah Fitri;
Dodi Vionanda;
Admi Salma
UNP Journal of Statistics and Data Science Vol. 1 No. 3 (2023): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang
DOI: 10.24036/ujsds/vol1-iss3/52
Singular spectrum analysis (SSA) is a time series method that can be used for data with seasonal effects. Rainfall is one example of such data. High rainfall contributes to natural disasters such as floods. Medan, the capital of North Sumatra Province, has fairly high rainfall and lies in a lowland area, so it is prone to flooding. Rainfall forecasting can support disaster mitigation. The forecasting method used is SSA. The resulting MAPE is 15.5% and the tracking signal stays within tolerance limits, so the forecast can be considered good.
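SSA starts by embedding the series in a trajectory (Hankel) matrix and ends by averaging a reconstructed matrix along its anti-diagonals to recover a series. The sketch below shows only those two steps, with a hypothetical series and window length; the singular value decomposition and the grouping of components that sit between them are omitted.

```python
def trajectory_matrix(series, window):
    """Embed a series into a window x K Hankel matrix, K = len(series) - window + 1."""
    k = len(series) - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]


def diagonal_average(matrix):
    """Average each anti-diagonal to turn a matrix back into a series."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


# Hypothetical series; without any decomposition the round trip returns it unchanged.
x = trajectory_matrix([1, 2, 3, 4, 5], window=3)
print(x)
print(diagonal_average(x))
```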