Articles

Fuzzy K-Nearest Neighbor to Predict Rainfall in Padang Pariaman District Annisa Rizki Amalia; Nonong Amalita; Yenni Kurniawati; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/126

Abstract

Information about rainfall at a given time and place is important because rainfall influences human activities. Rainfall is the amount of water that falls to the earth in a given period, measured in millimeters. One useful piece of rainfall information is the daily rainfall prediction. This study classifies daily rainfall at the Padang Pariaman climatology station into five categories: very light, light, moderate, heavy, and very heavy rain. Four weather parameters are used: air temperature, humidity, wind speed, and duration of sunlight. Rainfall can be predicted with data mining, in which a computer learns to analyze data automatically and produce a new model. One of the best prediction algorithms in data mining is Fuzzy K-Nearest Neighbor (FK-NN), which assigns a test observation to the class in which it has the largest membership degree. The rainfall classes in Padang Pariaman Regency are imbalanced. To address this, the Synthetic Minority Over-sampling Technique (SMOTE) is used to generate minority-class data until it matches the majority class. With 343 test observations, K = 12, and Euclidean distance, the FK-NN classifier performed fairly well, with an accuracy of 76.38%.
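The FK-NN decision rule described in this abstract (pick the class with the largest membership degree among the K nearest neighbors, using Euclidean distance) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes crisp training labels, the common inverse-distance weighting with fuzzifier m = 2, and made-up toy data; the function name `fknn_predict` and all values are hypothetical.

```python
import math
from collections import defaultdict

def fknn_predict(train_X, train_y, x, k=3, m=2.0, eps=1e-9):
    """Return (predicted_class, memberships) for one test point x.

    Assumes crisp training labels; m is the fuzzifier (m > 1).
    """
    # Euclidean distance from x to every training point
    dists = [(math.dist(row, x), label) for row, label in zip(train_X, train_y)]
    neighbors = sorted(dists, key=lambda t: t[0])[:k]
    # Inverse-distance weights with exponent 2 / (m - 1); eps avoids division by zero
    weights = [(1.0 / (d ** (2.0 / (m - 1.0)) + eps), label) for d, label in neighbors]
    total = sum(w for w, _ in weights)
    # Membership degree of x in each class represented among the neighbors
    memberships = defaultdict(float)
    for w, label in weights:
        memberships[label] += w / total
    predicted = max(memberships, key=memberships.get)
    return predicted, dict(memberships)

# Hypothetical toy data: two features per observation, two rainfall classes
train_X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = ["light", "light", "heavy", "heavy"]
pred, mem = fknn_predict(train_X, train_y, (0.2, 0.3), k=3)
print(pred, mem)
```

Unlike plain K-NN voting, the memberships sum to one and indicate how strongly the test point belongs to each class, so a near tie between classes is visible in the output.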
Classification the Characteristics of Traffic Accident Victims in Pariaman Using the Chi-square Automatic Interaction Detection Algorithm Manja Danova Putri; Dina Fitria; Yenni Kurniawati; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/127

Abstract

Traffic accidents are incidents that occur when motor vehicles collide on the road, damaging vehicles and road infrastructure and potentially causing material losses, injuries, and even death for those involved. Data from the Indonesian National Police show that the number of traffic accident victims between 2010 and 2020 ranged from 147,798 to 197,560 people, with fatalities predominantly occurring among individuals aged 15-34. The high number of traffic accident victims has negative impacts on various aspects of life, ranging from material losses to physical harm to the victims. Classification is a technique used to group objects or data into pre-defined classes or categories based on their attributes or features. One classification method is Chi-Square Automatic Interaction Detection (CHAID). The classification results from this method indicate that the age of the victims and the type of accident are the most significant variables influencing the condition of traffic accident victims. Evaluating the model with a confusion matrix yielded an accuracy of 92%, which indicates that the model performs well in overall data classification.
Implementation of an Artificial Neural Network Based on the Backpropagation Algorithm in Forecasting the Closing Price of the Jakarta Composite Index (IHSG) Muhammad Fadhil Aditya Aditya; Zilrahmi; Yenni Kurniawati; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/137

Abstract

Investing is very common in Indonesia. Continuous investment by the community increases economic activity and employment opportunities, raises national income, and improves the community's prosperity. The capital market is the means by which companies obtain funds from official financiers or investors through the buying and selling of shares. One of the indices issued by the IDX is the Jakarta Composite Index (IHSG). Statistics can help investors, the government, or related institutions predict the value of the IHSG. One method that can be used for prediction is the Artificial Neural Network (ANN). Backpropagation is a multi-layer ANN method that works by supervised learning. The idea of the Backpropagation algorithm is that the network's output for a given input is evaluated against the desired output. The purpose of this research is to produce high-accuracy forecasts that describe the movement of IHSG closing prices using an ANN based on the Backpropagation algorithm. The research showed that the BP (4,6,1) model produced an RMSE value of 28.24024 and a MAPE value of 0.00342%. Based on these results, an Artificial Neural Network model based on the Backpropagation algorithm can be applied to predict the IHSG closing price.
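The two error measures reported in this abstract, RMSE and MAPE, follow standard definitions that can be sketched as below. The abstract does not show how the authors computed them (for example, whether on scaled or original index values), so this is only an illustration; the price series are invented for the example.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical closing prices and forecasts, for illustration only
actual = [7000.0, 7050.0, 7100.0]
forecast = [7010.0, 7040.0, 7120.0]
print(rmse(actual, forecast), mape(actual, forecast))
```

RMSE is expressed in index points while MAPE is relative, which is why the two figures in the abstract differ so much in magnitude.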
Sentiment Analysis of DANA Application Reviews on Google Play Store Using Naïve Bayes Classifier Algorithm Based on Information Gain Cindy Caterine Yolanda; Syafriandi Syafriandi; Yenni Kurniawati; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/147

Abstract

DANA is a digital payment platform that provides various features to make it easier for users to make payments, transfers, and balance top-ups online. DANA application users provide a variety of reviews that include both constructive and critical opinions, which can be valuable input for DANA application developers. The purpose of this research is to evaluate the results of sentiment classification of DANA application user reviews on the Google Play Store using the Naïve Bayes Classifier (NBC) method with Information Gain (IG) feature selection. In addition, this study assesses the effect of IG feature selection on the performance of the resulting model. Reviews are divided into two categories, positive and negative, based on lexicon-based labeling. Data weighting, feature selection, and data splitting are then carried out, with 80% training data and 20% test data, before model building. There are two models: a model without feature selection (NBC model) and a model with feature selection (NBC-IG model). The evaluation results showed that the NBC model with 1106 features performed well, with 82.91% accuracy, 83.96% precision, and 90.23% recall. The NBC-IG model with 536 features performed better, with 85.09% accuracy, 85.79% precision, and 92.09% recall. Applying IG feature selection with an IG threshold of > 0.01 to the NBC model reduced the number of features by 570 and improved accuracy by 2.18 percentage points, precision by 1.83 percentage points, and recall by 1.86 percentage points.
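The feature-selection step in this abstract (keep only features whose Information Gain exceeds 0.01) can be sketched as follows. This is a minimal illustration under stated assumptions: features are treated as binary term-presence indicators, the threshold default mirrors the abstract's 0.01, and the terms, labels, and function names are hypothetical rather than taken from the study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(feature_values):
        subset = [y for f, y in zip(feature_values, labels) if f == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

def select_features(columns, labels, threshold=0.01):
    """Keep feature names whose information gain is strictly above the threshold."""
    return [name for name, values in columns.items()
            if information_gain(values, labels) > threshold]

# Hypothetical term-presence matrix: 1 if the term appears in the review
labels = ["pos", "pos", "neg", "neg"]
columns = {"good": [1, 1, 0, 0], "app": [1, 0, 1, 0]}
kept = select_features(columns, labels)
print(kept)
```

In this toy example the term "good" separates the two classes perfectly, while "app" appears equally in both and carries no information, so only the former survives the threshold.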
Artificial Neural Network Model for Estimating the Poor Population in Indonesia as an Effort to Alleviate Poverty Febi Febiola Putri; Atus Amadi Putra; Yenni Kurniawati; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/154

Abstract

Forecasting the poverty rate in Indonesia can help various parties; for example, it can help the government plan more effective and efficient poverty alleviation programs. In this study, the poverty rate in Indonesia was forecast using the backpropagation artificial neural network method. The purpose of this research is to model and predict the poverty rate using the backpropagation artificial neural network model and to determine the accuracy of the resulting forecasts. This is applied research. The data used are annual data on poverty in Indonesia from 2017-2021. The data are divided into two parts, training data and test data. The results show that the best artificial neural network model is BP (7,7,2), with 7 neurons in the input layer, 7 neurons in the hidden layer, and 2 neurons in the output layer. The accuracy of this model is good, with a MAPE value of 0.07633%. The forecasts for the next period show that East Java province has the highest number of poor people, with 3604.1698 thousand people in the first semester (March) of 2022, increasing to 3698.822 thousand people in the second semester (September) of 2022.
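Comparison of the C4.5 and C5.0 Algorithms in Classifying the Nutritional Status of Stunted Toddlers (Perbandingan Algoritma C4.5 dan C5.0 Dalam Klasifikasi Status Gizi Balita Stunting) dhea afrila harelvi; Admi Salma; Yenni Kurniawati; Fadhilah Fitri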
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/172

Abstract

Stunting is a health condition that reflects aspects of nutrition and child growth, allowing us to observe the nutritional status of toddlers. The aim of this study is to determine the classification results of the C4.5 and C5.0 algorithms for stunted toddler nutritional status and to compare the two algorithms using k-fold cross-validation. The data in this study are secondary data collected from Puskesmas IV Pesisir Selatan Regency. The research variables are divided into two groups: the response variable Y, which is Toddler Nutritional Status, and the predictor variables X, which include Age, Toddler Gender, Toddler Weight, and Toddler Height. The study found that the C5.0 algorithm produces higher accuracy than the C4.5 algorithm: C5.0 gives an average accuracy of 83%, while C4.5 gives an accuracy of 79%. Thus, it can be concluded that the C5.0 algorithm is better at classifying stunted toddler nutritional status.
ISLAMIC INTEGRATED SCIENCE LEARNING IN JUNIOR HIGH SCHOOL Abdullah Herman; Fachri Dermawan; Fatimah Depi Susanty Harahap; Yenni Kurniawati
Jurnal Pembelajaran Sains Vol 7, No 2 (2023)
Publisher : Prodi Pendidikan IPA FMIPA Universitas Negeri Malang

DOI: 10.17977/um033v7i2p72-78

Abstract

This article's goal is to present the goals of science education at the high school level that are incorporated with Islamic principles. The study identifies some of the learning objectives that are prioritized in the context of integrating science and Islam. Understanding scientific ideas within the context of Islam, developing moral and ethical principles, comprehending the connection between science and religion, and fortifying one's Islamic identity are the main objectives of education. The findings demonstrate that the integrated approach allows students to deepen their grasp of science and increase their awareness of Islam.
Classification of Harvest - Non Harvest in Rice Plant Image Using Convolutional Neural Network Algorithm Revina Rahmadani; Yenni Kurniawati; Dony Permana; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/181

Abstract

The Area Sample Framework (ASF) survey is an area-based survey carried out by direct observation of sample segments whose locations have been determined in advance. Every month, ASF officers photograph their observations using an Android-based cellphone; the photos are classified manually by supervising officers and sent to a central server for processing. The large volume of rice plant image data can hinder officers in classifying rice growth phases. Therefore, to speed up the classification process, the Convolutional Neural Network (CNN) method is used. In this research, the CNN model consists of 3 convolution layers, 3 pooling layers, and ReLU and Sigmoid activation functions, with several other parameters such as batch size and number of epochs. The training results show an accuracy of 92.86% on the training data at 120 epochs, while the accuracy on the validation data is 69.01%. Model evaluation shows a precision of 21.34% and a recall of 32.20%. This shows that the CNN model performs poorly in predicting harvest and non-harvest in rice plant images.
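Clustering Forest/Land Fire Potential in Indonesia Based on Hotspot Distribution Using the CLARANS Method (Pengelompokan Potensi Kebakaran Hutan/Lahan di Indonesia Berdasarkan Sebaran Titik Panas Menggunakan Metode CLARANS) fitri, silfia wisa; Martha, Zamahsary; Kurniawati, Yenni; Zilrahmi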
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/182

Abstract

Forest and land fires are disasters that occur frequently in several countries around the world. These events receive considerable government attention because they cause economic, ecological, and social losses. Indonesia has a high incidence of forest and land fires, which makes it the third-largest contributor to pollution in the world. Early mitigation efforts are therefore needed; one such effort is to use hotspot data to classify regions with the potential for forest and land fires. Forest and land fires are indicated by satellite detections flagged as hotspots. In this study, the parameters used are latitude, longitude, brightness, confidence, and FRP (fire radiative power), and the CLARANS method is applied. CLARANS is a variant of the k-medoids algorithm and a development of earlier algorithms such as PAM and CLARA, designed to handle larger datasets and to be robust to outliers. The results show that CLARANS can be used to cluster hotspot data, with a silhouette coefficient of 0.896 for 2 clusters on 12,287 data points. Cluster 1 falls into the high-potential category, with an average brightness of 340 K and an average confidence of 95%, while cluster 2 falls into the medium-potential category, with an average brightness of 327 K.
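The silhouette coefficient used in this abstract to judge the two-cluster solution can be computed as sketched below. This is a generic illustration of the standard definition with Euclidean distance, not the authors' code; it does not implement CLARANS itself, and the points and cluster labels are invented.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points; requires at least two distinct clusters."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    scores = []
    for p, c in zip(points, labels):
        own = clusters[c]
        if len(own) == 1:
            # Common convention: a singleton cluster contributes a silhouette of 0
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != c)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical, well-separated toy clusters
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
score = silhouette_score(points, labels)
print(score)
```

Values close to 1 mean that points sit much nearer to their own cluster than to the next closest one, which is how a figure such as 0.896 is usually read.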
Classification of Dropout Rates in West Sumatra Using the Random Forest Algorithm with Synthetic Minority Oversampling Technique Anita Fadila; Syafriandi Syafriandi; Yenni Kurniawati; Admi Salma
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/183

Abstract

This study aims to classify school dropout rates in West Sumatra Province using the Random Forest algorithm with the Synthetic Minority Oversampling Technique (SMOTE). Based on 2021 data from the Ministry of Education, Culture, Research, and Technology (Kemdikbudristek), the dropout rate in West Sumatra is above the national average. Despite efforts to reduce dropout rates, results remain suboptimal. Therefore, this study seeks to identify the causes of student dropouts and compare the performance of the Random Forest algorithm with and without SMOTE. The study uses the 2021 dropout data from West Sumatra, which has a significant class imbalance. SMOTE is applied to balance the data. The dataset is split into training and testing sets in an 80%:20% ratio, and parameter tuning is performed to optimize mtry and the number of trees (ntree). The model is evaluated using a confusion matrix to compare performance. The results show that Random Forest with SMOTE outperforms the version without SMOTE, with improvements in precision, recall, and F1-score. The presence of the biological mother ( ) is identified as the most significant factor influencing student dropouts, based on the Mean Decrease Gini value. The study concludes that using SMOTE in the Random Forest algorithm helps reduce classification bias and enhances the model's ability to detect students at risk of dropping out.