Articles

Comparison Of Extreme Learning Machine And Holt Winter’s Exponential Smoothing Methods In Railway Passenger Forecasting Azma, Meil Sri Dian; Dony Permana; Fadhilah Fitri; Atus Amadi Putra
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/211

Abstract

Forecasting the number of passengers on the Pariaman Express train could help PT KAI maximize passenger service facilities and comfort. The number of train passengers in Indonesia is expected to keep increasing along with the country's growing population. The high interest in this mode of transportation can be seen from historical data, which increase every year. PT KAI (Persero), as the sole train transportation provider, needs several strategies for meeting passenger needs every day. This study forecasts the number of passengers on the Pariaman Express train using the Holt-Winters exponential smoothing method and one artificial neural network method, the extreme learning machine (ELM). The purpose of the study was to compare the accuracy of the forecasts produced by the two methods and to find out which method is better suited to this forecast. The data used are the numbers of Pariaman Express train passengers from 2021 to 2023. The results show that both the Holt-Winters and ELM methods have error values above 10%, meaning that both methods are good at forecasting for 4 periods. Holt-Winters has a MAPE value of 17.10% and ELM has a MAPE value of 20%.
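The abstract above compares the two methods by MAPE (mean absolute percentage error). As a minimal sketch of how that metric is computed, the Python below uses a hypothetical helper `mape` and made-up numbers for 4 periods; neither comes from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of |actual - forecast| / |actual| over all periods.
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values for 4 forecast periods (not data from the study).
actual = [100.0, 200.0, 400.0, 500.0]
forecast = [110.0, 180.0, 440.0, 450.0]
score = mape(actual, forecast)  # every period is off by 10% of the actual value
```

MAPE is undefined when an actual value is zero, so this sketch assumes strictly non-zero actuals.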
Estimation of Poverty in North Sumatera in 2022 using Truncated and Penalized Spline Regression Kurnia Andrea Diva; Fadhilah Fitri; Dony Permana; Admi Salma
UNP Journal of Statistics and Data Science Vol. 2 No. 4 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss4/217

Abstract

The main goal of the Sustainable Development Goals (SDGs) is to reduce poverty. Low human capital is a cause of poverty, and the Human Development Index (HDI) is one indicator that can be used to assess human capital. Despite having the largest population on the island of Sumatra, North Sumatra continues to have the fifth highest poverty rate. Because previous research on the relationship between poverty and HDI has produced inconsistent results, the pattern of that relationship remains unclear. This study therefore used nonparametric regression modeling, which is flexible in following the pattern of the data and can avoid model misspecification errors. The study aims to compare the truncated spline and penalized spline (P-spline) regression methods. Comparing the two models by smallest MSE showed that the better estimator for modeling the Human Development Index in North Sumatra in 2022 is nonparametric regression using the truncated spline estimator. The best truncated spline model is of order 2 with one knot point located at X = 66.93, with a GCV value of 6.0543.
Implementation of CART Method with SMOTE for Household Poverty Classification in Mentawai Islands 2023 Dewi Adiningtiyas, Rheizma; Admi Salma; Syafriandi Syafriandi; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 2 No. 4 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss4/232

Abstract

Poverty is a condition in which individuals or groups are unable to fulfill their basic needs due to economic pressure or limited resources. The Classification and Regression Trees (CART) method is a classification technique that produces a classification tree describing the relationship between independent and dependent variables. Data imbalance can lead to low sensitivity and low area under the curve (AUC) values. One way to handle imbalanced data is the Synthetic Minority Oversampling Technique (SMOTE), which adds artificial data to the minority class before the data are analyzed. The purpose of this research is to compare CART models built without and with SMOTE. SMOTE is applied to balance the amount of data across the household poverty classes. Accuracy is 89% without SMOTE and 79% with SMOTE. However, the sensitivity value increased by 80%, and the AUC value of the CART method with SMOTE increased by 31%. The study therefore concludes that CART classification with SMOTE performs better than CART classification without SMOTE.
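The abstract above describes SMOTE as adding artificial data to the minority class. A minimal sketch of the core interpolation step is given below, assuming numeric features stored as tuples. The function name `smote_sample`, the parameter `k`, and the sample points are hypothetical and are not taken from the study.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority point by interpolating toward a near neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to the base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation weight in [0, 1)
    # The new point lies on the segment between the base and its neighbour.
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbour))


# Hypothetical minority-class observations with two features each.
minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0)]
synthetic = smote_sample(minority, k=2, rng=random.Random(42))
```

Repeating this step until the classes are balanced gives the oversampled training set. Oversampling is normally applied to the training data only.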
Early Marriage Factors Indonesian Using Spatial Regression Analysis Permana, Yazid; Dina Fitria; Yenni Kurniawati; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 2 No. 4 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss4/239

Abstract

Marriage is a sacred union recognized socially and religiously to form a family, as regulated by Law No. 16 of 2019. The percentage of early marriages in Indonesia continues to rise, reaching 21.5% in 2022 and placing Indonesia 8th in the world according to UNICEF 2023 data. The increase in early marriages has significant impacts on maternal and child health and often leads to high divorce rates, with 516,334 cases in 2022. The aim of this research is to provide information and knowledge for students about early marriage and spatial regression. The main factors influencing early marriage are low education levels, economic difficulties, and environmental factors. Research shows that early marriages are highest in Kalimantan and Sulawesi, with spatial effects influencing the percentage of early marriages between regions. Spatial regression analysis, such as the Spatial Autoregressive (SAR) model, is used to examine the interactions between regions that affect early marriage. Spatial autocorrelation and spatial dependency tests indicate a spatial dependency effect, making the SAR model with queen contiguity weights the most suitable. The resulting model is considered quite good, with an R-squared value of 40.97%. The best model shows that the youth Open Unemployment Rate (TPT) is a significant variable that greatly affects the percentage of early marriages. Therefore, the central and provincial governments are expected to pay more attention to youth open unemployment in order to control and reduce the rate of early marriages in Indonesia.
Error Correction Model Approach for Analysis of Original Regional Income in West Sumatra Herlena Purnama Sari; Fadhilah Fitri; Nonong Amalita; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 3 No. 1 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss1/332

Abstract

This research uses an error correction model approach, which examines long-term and short-term relationships. Original Regional Income (PAD) is all regional income originating from the region's own economic sources. According to Law Number 33 of 2004, Chapter V, Article 6, the sources of Original Regional Income consist of Regional Taxes, Regional Levies, Separated Regional Wealth Management Results, and Other Legal PAD. The analysis finds that only variables x1 and x3 have a long-term relationship and that variables x1 and x3 have a short-term relationship. It can therefore be concluded that not all independent variables are related to the dependent variable.
K-Means Cluster Analysis for Grouping Regencies/Cities in West Sumatra Province by Type of Violence Against Women in 2023 Febiola, Latifah Jayatri; Fadhilah Fitri; Fenni Kunia Mutiya
UNP Journal of Statistics and Data Science Vol. 3 No. 1 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss1/344

Abstract

Violence against women is a serious social issue and a violation of human rights. Women are often vulnerable to violence, whether physical, psychological, or sexual, which negatively impacts their physical and mental health. To understand the distribution of cases of violence against women in West Sumatra Province, an analytical method is needed to group regions based on the number of reported cases. K-Means Clustering is one of the cluster analysis methods used to group districts/cities based on similarities in the number of violence cases. This study aims to group districts/cities in West Sumatra based on the number of female victims of violence using the K-Means Clustering algorithm. The optimal number of clusters was determined using the silhouette method, resulting in three clusters. Cluster 3 has the highest average number of physical and sexual violence cases and consists of four districts/cities: Solok Regency, Lima Puluh Kota, Solok City, and Payakumbuh City. Cluster 2 represents areas with a moderate level of violence, dominated by psychological abuse, and consists of five districts/cities. Meanwhile, Cluster 1 comprises ten districts/cities with the lowest recorded violence cases. This grouping provides insight into the regional distribution of violence against women in West Sumatra, identifying areas that require more attention. The findings suggest that the government should prioritize regions with high levels of violence through stricter law enforcement, the provision of support services for victims, gender equality campaigns, and increased awareness of women's rights.
Comparison of the Double Moving Average (DMA) and Double Exponential Smoothing (Brown) Methods for the Open Unemployment Rate (TPT) in Padang Panjang City. Fishuri, Nufhika; Fadhilah Fitri; Dony Permana
UNP Journal of Statistics and Data Science Vol. 3 No. 2 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss2/366

Abstract

The Open Unemployment Rate (TPT) is the percentage of unemployed people in the total labor force. The labor force includes the population aged 15 years and over who have a job but are temporarily not working. Unemployment occurs because of a mismatch between the demand for employment and the qualifications of job seekers. Many job vacancies require graduates with a diploma or degree, so unemployment is one of the problems faced by Padang Panjang City. To address unemployment in Padang Panjang City, forecasting is needed to see how the TPT will develop in the coming year. This research compares the forecasts of the Double Moving Average (DMA) and Double Exponential Smoothing (DES) methods for the Open Unemployment Rate in Padang Panjang City from 2006 to 2023. The forecasting is done to provide insight into the future condition of the workforce in Padang Panjang City. The results indicate that in 2024 there will be an increase of 0.42%, and for the next 2 years there will be a decrease.
Nonparametric Regression with Local Polynomial Kernel on Relationship Between Schooling Years and Unemployment Rate in Banten Miftahul Barokah, Bunga; Fadhilah Fitri; Chairina Wirdiastuti
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss3/372

Abstract

The Open Unemployment Rate (TPT) is a key indicator in assessing the economic performance of Banten Province. One of the factors suspected to influence TPT is education, which is measured by the average years of schooling. This study aims to analyze the relationship between the average years of schooling and TPT using the Local Polynomial Kernel Nonparametric Regression method for the period 2017–2024. This method was chosen for its flexibility in modeling nonlinear relationships without requiring strict assumptions about the data. The optimal bandwidth parameter for smoothing was determined using the Direct Plug-In (DPI) method through the dpill function in the R software. The results show that the nonparametric model has a coefficient of determination (R²) of 0.2841, which is higher than that of the Ordinary Least Squares (OLS) linear regression model, which only reached 0.1710. This indicates that the nonparametric approach is better at capturing the complex relationship between education and unemployment. However, the low R² values in both models indicate the presence of other factors that influence the unemployment rate, such as economic conditions, labor market structure, and education policy. Therefore, increasing the average years of schooling alone may not be sufficient to significantly reduce the unemployment rate. More comprehensive policies are needed, such as job skill enhancement, vocational training, and economic strategies focused on job creation. The findings of this study are expected to provide useful insights for policymakers in formulating more effective strategies to address unemployment in Banten Province.
Forecasting Consumer Price Index in Personal Care Sector in Bukittinggi Using SVR with Grid Search and Radial Basis Function Kernel Pane, Khairunnisa; Fadhilah Fitri; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss3/373

Abstract

Inflation, measured by the Consumer Price Index (CPI), is vital for economic stability and policy making. In Bukittinggi, the Personal Care and Other Services sector shows notable CPI fluctuations, complicating accurate forecasting. This study uses Support Vector Regression (SVR) to predict monthly CPI data for this sector from 2020 to 2024. Data from Statistics Indonesia was normalized with Min-Max normalization to improve model accuracy and avoid scale distortion. Lag features were added to capture time dependencies, and data was split into training (80%) and testing (20%) sets. A linear SVR model was first applied but showed limited success due to the data’s non-linear nature. Therefore, the Radial Basis Function (RBF) kernel was used, with hyperparameters (C, sigma, epsilon, folds) optimized via Grid Search and cross-validation. The optimal settings (C=32, sigma=2, epsilon=0.1, k=10) yielded the lowest RMSE of 0.1099 in cross-validation and 0.0767 on testing. Results demonstrate that the RBF-SVR model effectively captures non-linear CPI patterns and outperforms the linear model. Evaluation metrics included RMSE, MSE, and MAE. The study concludes that SVR combined with Grid Search offers a robust forecasting method for sectors with complex CPI behavior, supporting local economic planning in Bukittinggi. Future research could investigate hybrid models and larger datasets to enhance prediction accuracy and adaptability to market changes.
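The preprocessing steps named in the abstract above (Min-Max normalization, lag features, and an 80/20 training/testing split) can be sketched in plain Python as follows. The helper names, the choice of two lags, and the sample series are assumptions for illustration only. They are not the study's code or data.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def make_lagged(series, n_lags):
    """Build rows of the n_lags previous values (X) and the current value (y)."""
    X = [series[i - n_lags:i] for i in range(n_lags, len(series))]
    y = [series[i] for i in range(n_lags, len(series))]
    return X, y


def chronological_split(X, y, train_frac=0.8):
    """Split without shuffling so the test set is the most recent part."""
    cut = int(len(X) * train_frac)
    return X[:cut], y[:cut], X[cut:], y[cut:]


# Hypothetical monthly index values (not CPI data from the study).
series = [100, 102, 101, 105, 107, 106, 110, 112, 111, 115, 118, 120]
scaled = min_max_scale(series)
X, y = make_lagged(scaled, n_lags=2)  # n_lags=2 is an assumed setting
X_train, y_train, X_test, y_test = chronological_split(X, y)
```

This sketch scales the whole series before splitting, following the order in which the abstract lists the steps. Fitting the scaler on the training portion only is a common alternative that keeps test-period information out of training.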
Comparison of Nadaraya-Watson Method with Local Polynomial in Modeling HDI and Poverty Relationship in Java Island Novi, Yoli Marda; Fadhilah Fitri; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 3 No. 3 (2025): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol3-iss3/380

Abstract

Poverty remains a critical issue in Indonesia, with the number of poor people reaching 24.06 million in September 2024. The Human Development Index (HDI), which indicates the level of human resource quality, is one of the factors that influence poverty. This analysis focuses on the relationship between HDI and the number of poor people in districts/cities on Java Island by comparing two kernel regression methods, namely the Nadaraya-Watson estimator and the local polynomial estimator. Nonparametric regression was chosen because it does not require assuming a particular form of relationship among variables, so it is more flexible in capturing complex relationship patterns. Secondary data from Statistics Indonesia (BPS) for 2024 were used in this study. Initial exploration shows that the data distribution does not have a clear pattern, so nonparametric methods are more suitable. Modeling is done using the optimal bandwidth obtained through the dpill function in R. The results show that the local polynomial estimator produces smoother regression curves and lower MSE values. In addition, a comparison of polynomial degrees shows that higher degrees tended to improve model performance. Among the tested degrees, the local polynomial with degree five (p=5) produced the lowest MSE value and the highest coefficient of determination. Therefore, the local polynomial estimator with degree 5 is the best method for modeling the relationship between the HDI and poverty levels in Java in 2024.
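For readers unfamiliar with the Nadaraya-Watson estimator compared in the abstract above, the sketch below shows its basic form: a kernel-weighted average of the observed responses. The Gaussian kernel, the bandwidth value, and the sample points are assumptions for illustration. The study itself selected the bandwidth with dpill in R.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Estimate the regression function at x0 as a kernel-weighted mean of ys."""
    # Gaussian kernel weights: observations closer to x0 get larger weights.
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; try a larger bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical predictor/response pairs (not HDI or poverty data from the study).
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 2.0]
estimate = nadaraya_watson(1.0, xs, ys, bandwidth=0.5)
```

The Nadaraya-Watson estimator corresponds to a local polynomial fit of degree 0. The local polynomial estimator generalizes it by fitting a weighted polynomial of degree p around each evaluation point.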