Articles

Classification the Characteristics of Traffic Accident Victims in Pariaman Using the Chi-square Automatic Interaction Detection Algorithm
Manja Danova Putri; Dina Fitria; Yenni Kurniawati; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/127

Abstract

Traffic accidents are incidents in which motor vehicles collide on the road, damaging vehicles and road infrastructure and potentially causing material losses, injuries, and even death for those involved. Data from the Indonesian National Police show that the number of traffic accident victims between 2010 and 2020 ranged from 147,798 to 197,560 people, with fatalities predominantly occurring among individuals aged 15-34. The high number of victims has negative impacts on various aspects of life, ranging from material losses to physical harm to the victims. Classification is a technique used to group objects or data into predefined classes or categories based on their attributes or features. One classification method is Chi-Square Automatic Interaction Detection (CHAID). The classification results obtained with this method indicate that the age of the victims and the type of accident are the most significant variables influencing the condition of traffic accident victims. Evaluating the model with a confusion matrix yielded an accuracy of 92%, which indicates that the model performs well in overall data classification.
Prediction of Palm Oil Production Results PT.KSI South Solok Using Ensemble k-Nearest Neighbor
Nilda Yanti; Atus Amadi Putra; Dony Permana; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/136

Abstract

PT. KSI experienced a decrease in production caused by the replanting carried out in 2022. In managing palm oil production, PT. KSI faces the problem that production results do not reach their targets, which can affect the Company's Work Plan and Budget. It is therefore necessary to predict palm oil production results so that all production and processing activities can run according to plan. The ensemble technique is capable of making accurate predictions and works very effectively with the kNN method, so there is no need to search for the best k value. Based on the analysis, the ensemble's accuracy measure is 9.36%, which is considered high accuracy compared with the 10.84% obtained using a single kNN with k = 1. It can be concluded that the model works well with the data.
Implementation of an Artificial Neural Network Based on the Backpropagation Algorithm in Forecasting the Closing Price of the Jakarta Composite Index (IHSG)
Aditya, Muhammad Fadhil Aditya; Zilrahmi; Yenni Kurniawati; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/137

Abstract

Investing is very common in Indonesia. Continuous investment activity by the community increases economic activity and employment opportunities, raises national income, and improves the community's level of prosperity. Shares are bought and sold on the capital market, the means by which companies obtain funds from official financiers or investors. One of the indices issued by the IDX is the Jakarta Composite Index (IHSG). Statistics can help investors, the government, and related institutions predict the value of the IHSG. One method that can be used for prediction is the Artificial Neural Network (ANN). Backpropagation is a multi-layer ANN method that works through supervised learning. The idea of the Backpropagation algorithm is that the network's output for a given input is evaluated against the desired output. The purpose of this research is to produce high-accuracy forecasts that describe the movement of the IHSG closing price using an ANN based on the Backpropagation algorithm. The research showed that the BP (4,6,1) model produced an RMSE of 28.24024 and a MAPE of 0.00342%. Based on these results, an Artificial Neural Network model based on the Backpropagation algorithm can be applied to predict the IHSG closing price.
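The two error measures reported in this abstract, RMSE and MAPE, can be illustrated with a minimal sketch. The function names and the three-point series below are hypothetical and chosen only for illustration; they are not the IHSG data or the authors' code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical toy series, not taken from the paper.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

rmse_value = rmse(actual, predicted)
mape_value = mape(actual, predicted)
print(f"RMSE = {rmse_value:.4f}, MAPE = {mape_value:.2f}%")
```

RMSE is expressed in the units of the series, while MAPE is a scale-free percentage, which is why the two values in an abstract like this one can differ by orders of magnitude.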
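Klasifikasi Karies Gigi di Rumah Sakit Gigi dan Mulut Baiturrahmah Menggunakan Metode Random Forest [Classification of Dental Caries at the Baiturrahmah Dental and Oral Hospital Using the Random Forest Method]
Martia Rosada; Zilrahmi; Syafriandi Syafriandi; Tessy Octavia Mukhti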
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/155

Abstract

The oral cavity is the main gateway through which germs and bacteria enter the body, so it is important to maintain oral hygiene. Poor dental and oral hygiene leads to problems and diseases such as periodontitis, dental caries, tooth abscess, and gingivitis. The most common of these problems is caries, or cavities. West Sumatra has a fairly high prevalence of dental caries. Prevention requires making the public aware of dental and oral hygiene in order to reduce the problem of dental caries in West Sumatra. A method that can classify dental caries based on its symptoms is therefore needed, and classification is also useful for identifying the main factors that cause dental caries. One classification method that can be used is random forest. Random forest is an ensemble method, that is, one that builds several models using bootstrap sampling. The results of this research are based on the lowest OOB error rate and the Variable Importance Measure (VIM). Random forest classification using dental and oral pain medical record data at Baiturrahmah Padang Hospital produces an OOB error rate of 32.08%, or an accuracy of 67.92%. The optimal model is obtained with mtry=2 and ntree=200. The research concludes that dental plaque, age, and tooth brushing habits are the important variables, or main factors, that influence dental caries.
Classification of Poor Households in West Sumatra Province using Decision Tree Algorithm C4.5
Dinda Fitriza; Atus Amadi Putra; Dodi Vionanda; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/157

Abstract

The significant and increasingly complex issue of poverty poses a considerable challenge to Indonesia's development, including in West Sumatra Province, where the poverty rate was 5.92% in 2022. The government has initiated programs to address poverty by focusing on the criteria of impoverished households. Data on impoverished households can be obtained through the National Socio-Economic Survey (Susenas). One method that can classify impoverished households is the decision tree, a flowchart that resembles a tree. The C4.5 algorithm used in this research can handle discrete and continuous data, manage variables with missing values, and prune decision tree branches. The analysis shows that the variables affecting the classification of poor households are the number of household members, followed by the age of the household head, type of house floor, type of house wall, source of drinking water, and cooking fuel. On the test data, the confusion matrix gives an accuracy of 69.89%, a sensitivity of 71.15% for classifying regular households, and a specificity of 68.72% for classifying impoverished households.
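The three measures reported in this abstract all derive from the four cells of a binary confusion matrix. A minimal sketch follows; the function name and the counts are hypothetical illustration values, not the paper's confusion matrix, and treating "regular household" as the positive class is an assumption based on how the abstract pairs sensitivity with regular households.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return (accuracy, sensitivity, specificity) from binary confusion-matrix counts.

    tp/fn: positive-class cases predicted correctly/incorrectly.
    tn/fp: negative-class cases predicted correctly/incorrectly.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # share of actual positives that were found
    specificity = tn / (tn + fp)   # share of actual negatives that were found
    return accuracy, sensitivity, specificity


# Hypothetical counts (assumption: positive class = regular households).
accuracy, sensitivity, specificity = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
print(f"accuracy={accuracy:.2%}, sensitivity={sensitivity:.2%}, specificity={specificity:.2%}")
```

Because accuracy pools both classes, it always lies between sensitivity and specificity, which matches the ordering of the three figures in the abstract.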
K-Modes Analysis with Validation of the DBI in Grouping Provinces in Indonesia based on Indicators of Poor Households
Syifa Azahra; Zilrahmi; Dodi Vionanda; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/165

Abstract

Poverty is the most pressing social problem in Indonesia. As part of the effort to alleviate poverty, this study groups provinces in Indonesia based on indicators of poor households using the K-Modes algorithm. The data come from the Household List of the 2017 Indonesian Demographic and Health Survey (IDHS). The analysis includes data noise detection, data clustering using the K-Modes algorithm, and cluster validation with the Davies-Bouldin Index (DBI). The clustering yields two clusters, where cluster 1 consists of 26 provinces and cluster 2 consists of 8 provinces. Cluster 1 fulfills 9 indicators of poor households, while cluster 2 fulfills only a few. The government can therefore prioritize these 8 provinces in overcoming poverty in Indonesia. The DBI value obtained is 1.89, which means that two clusters are suitable for the algorithm.
Classification of Program Keluarga Harapan Recipient Households in Padang Using K-Nearest Neighbors
Yurivo Rianda Saputra; Syafriandi Syafriandi; Dony Permana; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/167

Abstract

Program Keluarga Harapan (PKH) is a government social assistance program that provides social protection as part of the central government's efforts to promote social welfare. PKH provides benefits to poor families, especially pregnant women and children, by making use of the various health and education services available. PKH benefits also extend to people with disabilities and the elderly, maintaining their level of social welfare in accordance with the Constitution and the Nawacita of the Republic of Indonesia. Because the implementation of PKH suffers from distribution errors, recipients need to be classified to ensure proper distribution. Classification is performed by comparing the number of neighbors (k) in K-Nearest Neighbors (KNN). The Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN) is applied to balance the classes of the classification target, and Recursive Feature Elimination Cross Validation (RFECV) is applied to select attributes in the dataset. The data come from SUSENAS 2023 for Padang City. The results show that KNN with k = 3 is a good algorithm for classifying households receiving PKH using 10 attributes, achieving an accuracy of 91.12%, a precision of 89.29%, and a recall of 96.77%.
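The voting rule behind KNN with k = 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function `knn_predict`, the two-dimensional points, and the labels are all hypothetical, and the sketch omits the SMOTEENN resampling and RFECV attribute selection that the abstract describes.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of households.
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["non-recipient"] * 3 + ["recipient"] * 3

label_a = knn_predict(train_X, train_y, (0.5, 0.5), k=3)
label_b = knn_predict(train_X, train_y, (5.5, 5.5), k=3)
print(label_a, label_b)
```

An odd k such as 3 avoids tied votes when there are only two classes, which is one practical reason small odd values are commonly compared first.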
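Pengelompokan Potensi Kebakaran Hutan/Lahan di Indonesia Berdasarkan Sebaran Titik Panas Menggunakan Metode CLARANS [Clustering of Forest/Land Fire Potential in Indonesia Based on Hotspot Distribution Using the CLARANS Method]
Fitri, Silfia Wisa; Martha, Zamahsary; Kurniawati, Yenni; Zilrahmi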
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/182

Abstract

Forest and land fires are disasters that occur frequently in several countries around the world. These events receive heightened government attention because they cause substantial economic, ecological, and social losses. Indonesia has a high rate of forest and land fires, which makes it the third-largest contributor to pollution in the world. Early mitigation efforts are therefore needed; one such effort is to use hotspot data to classify regions with the potential for forest and land fires. Forest and land fires are marked by satellite detection of data points identified as hotspots. The parameters used in this study are latitude, longitude, brightness, confidence, and FRP (fire radiative power), and the method applied is CLARANS. CLARANS is a variant of the k-medoids algorithm and a development of earlier algorithms such as PAM and CLARA, designed to handle larger data sets and to be robust to outliers. The results show that CLARANS can be used to cluster hotspot data, with a silhouette coefficient of 0.896 for 2 clusters on 12,287 data points. Cluster 1 falls in the high-potential category, with an average brightness of 340 K and an average confidence of 95%, and cluster 2 falls in the medium-potential category, with an average brightness of 327 K.
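Metode Density Based Spatial Clustering of Applications with Noise (DBSCAN) dalam Mengelompokkan Provinsi di Indonesia Berdasarkan Kasus Kriminalitas Tahun 2022 [The Density Based Spatial Clustering of Applications with Noise (DBSCAN) Method for Clustering Provinces in Indonesia Based on Crime Cases in 2022]
Miftahurrahmi, Syifa; Zilrahmi; Amalita, Nonong; Mukhti, Tessy Octavia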
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/203

Abstract

According to 2023 data from the Central Statistics Agency, the number of crime cases in Indonesia rose significantly from 239,481 in 2021 to 372,965 in 2022. The increase occurred as community activities began to resume after the Covid-19 pandemic. The types of crime that occur in Indonesia vary, including murder, theft, drug-related crimes, and others. This research clusters provinces in Indonesia based on cases of certain types of crime in 2022 using the Density Based Spatial Clustering of Applications with Noise (DBSCAN) method. The results are expected to help the government and police in their efforts to deal with crime in Indonesia. Clustering with the DBSCAN method produces 2 clusters with a silhouette coefficient of 0.68. Cluster 0, the noise category, consists of 5 provinces with a high number of crime cases, while cluster 1 consists of 29 provinces with a low number of crime cases.
Evaluasi Faktor-Faktor Yang Memengaruhi Indeks Pembangunan Manusia Tahun 2023 Menggunakan Metode SEM-PLS [Evaluation of Factors Affecting the Human Development Index in 2023 Using the SEM-PLS Method]
Putri, Sindy Amelia; Zilrahmi; Permana, Dony; Fitria, Dina
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/214

Abstract

The Human Development Index (HDI) is a measure of the success of development in a country. Indonesia, as a developing country, had an HDI value in 2022 that ranked 112th out of 193 countries in the world. This indicates an urgent need to evaluate how to raise Indonesia's HDI value and thereby improve the quality of human development. The evaluation can be carried out using Structural Equation Modeling-Partial Least Square (SEM-PLS) analysis. With 34 Indonesian provinces as observations, three dimensions are analyzed as variables in this paper, namely economy, education, and health, each based on its indicator variables. The results show that for the economic variable, the influential indicators are the Open Unemployment Rate, GRDP per Capita at Constant Prices, and Average Wage per Hour Worker. For the education variable, the influential indicators are the School Participation Rate Age 7-12, the School Participation Rate Age 13-15, the Pure Enrollment Rate for Elementary/Middle School/Package A, the Pure Enrollment Rate for Junior High School/MTs/Package B, and the Pure Enrollment Rate for Senior High School/SMK/MA/Package C. For the health variable, the indicators that affect the value of HDI in Indonesia in 2023 are the Percentage of Households by Province and Source of Adequate Drinking Water, and the Percentage of Ever-Married Women Aged 15-49 Years whose Last Childbirth Took Place in a Health Facility.