Journal : Building of Informatics, Technology and Science

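Implementasi Algoritma Gaussian Naïve Bayes Dalam Klasifikasi Status Gizi Pada Balita [Implementation of the Gaussian Naïve Bayes Algorithm for Classifying the Nutritional Status of Toddlers]
Kurniawan, Hery; Rahim, Abdul; Siswa, Taghfirul Azhima Yoga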
Building of Informatics, Technology and Science (BITS) Vol 6 No 2 (2024): September 2024
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i2.5493

Abstract

Nutritional status is a measurable condition that results from the balance between the body's nutritional needs and its nutrient intake from food. In Indonesia, malnutrition and other nutritional problems remain prevalent. Machine learning (ML) and data mining (DM) techniques can help address these problems, so this study applies the Naïve Bayes Classifier algorithm with a Gaussian model. The data are toddler nutritional status records from January to July 2023 in Samarinda City. The attributes are Gender, Birth Weight, Birth Height, Age at Measurement, Body Weight, Body Height, ZS BW/A, BW/A, ZS BH/A, and BH/A. Nutritional status is determined from the BW/BH index, which has 6 classes: severe malnutrition, undernutrition, good nutrition, risk of overnutrition, overnutrition, and obesity. The Gaussian Naïve Bayes model classified toddlers' nutritional status with an accuracy of 81.85%.
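
As an illustration of the technique named in this abstract, here is a minimal Gaussian Naïve Bayes sketch in plain Python. It is not the authors' implementation: the function names, the two features, and the toy values are hypothetical, and the variance smoothing term is an assumption added to avoid division by zero.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_smoothing=1e-9):
    """Estimate the prior, per-feature mean, and per-feature variance of each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_smoothing
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Log of the Gaussian density, summed over features
            # (the "naive" conditional-independence assumption).
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))

# Hypothetical toy data: [body weight (kg), body height (cm)]; not from the study.
X = [[7.0, 70.0], [7.5, 72.0], [8.0, 71.0],
     [12.0, 80.0], [12.5, 82.0], [13.0, 81.0]]
y = ["undernutrition"] * 3 + ["good nutrition"] * 3
model = fit_gaussian_nb(X, y)
```

Calling `predict_gaussian_nb(model, [7.2, 71.0])` scores the sample against each class and returns the label with the larger log-posterior. The study itself uses ten attributes and six classes; the same functions would apply unchanged to more features and labels.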
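
Penerapan Metode GA-TL Pada Algoritma Naive Bayes Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Application of the GA-TL Method to the Naive Bayes Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Widyastuti, Dessy; Siswa, Taghfirul Azhima Yoga; Rudiman, Rudiman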
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6737

Abstract

The Indonesia Smart Card (KIP) Scholarship Program aims to support students from underprivileged families in pursuing higher education, yet the distribution of recipient data often experiences class imbalance, leading to inaccuracies in scholarship allocation. This imbalance, characterized by disproportionate data between recipient and non-recipient groups, affects classification model performance, causing models to favor the majority class and overlook the minority class, potentially excluding eligible recipients. To address this issue, this study combines the Genetic Algorithm for feature selection and optimization with Tomek Links-Random Undersampling for data balancing. The research process includes data preprocessing, 10-fold cross-validation, and performance evaluation using a confusion matrix. Results indicate that without Tomek Links-Random Undersampling, Naïve Bayes accuracy increased from 65.2% to 66.0% after feature selection and optimization using the Genetic Algorithm, while applying Tomek Links-Random Undersampling improved accuracy from 56% to 63%. This method also enhanced fairness in recipient classification, promoting a more equitable distribution of benefits. The improved model accuracy significantly aids future scholarship selection processes, demonstrating that integrating efficient machine learning approaches optimizes the KIP Scholarship Program by ensuring beneficiaries are appropriately targeted based on predetermined criteria.
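
The Tomek Links step mentioned in this abstract can be sketched as follows. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour; a common cleaning strategy, assumed here, drops the majority-class member of each pair. The function names and the one-dimensional toy data are hypothetical, and the sketch assumes a binary problem.

```python
import math

def nearest_neighbor(i, X):
    """Index of the point closest to X[i] (Euclidean distance), excluding i itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))

def tomek_link_majority_indices(X, y, majority_label):
    """Indices of majority-class samples that take part in a Tomek link."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    to_drop = set()
    for i, j in enumerate(nn):
        # Mutual nearest neighbours with opposite labels form a Tomek link.
        if nn[j] == i and y[i] != y[j]:
            to_drop.add(i if y[i] == majority_label else j)
    return sorted(to_drop)

# Hypothetical toy data: one feature, four majority and two minority samples.
X = [[0.0], [1.1], [2.5], [4.0], [4.3], [7.0]]
y = ["majority", "majority", "majority", "majority", "minority", "minority"]
drop = tomek_link_majority_indices(X, y, "majority")
```

Here the samples at 4.0 (majority) and 4.3 (minority) are mutual nearest neighbours, so index 3 is flagged for removal. The pair at 0.0 and 1.1 is also mutually nearest but shares a label, so it is left alone.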
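
Penerapan Metode GA-CBU Pada Algoritma Logistic Regression Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Application of the GA-CBU Method to the Logistic Regression Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Poernamawan, Ahmad Nugraha; Siswa, Taghfirul Yoga Azhima; Rudiman, Rudiman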
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6747

Abstract

Class imbalance often poses a challenge in data analysis: the number of instances in the majority class is significantly higher than in the minority class. This can bias classification models towards predicting the majority class, resulting in low accuracy in identifying the minority class. This research implements the Logistic Regression (LR) algorithm combined with Clustering Based Undersampling (CBU) as the undersampling technique and Genetic Algorithm (GA) for feature selection and optimization, in order to classify KIP-College scholarship data at Muhammadiyah University of East Kalimantan and overcome the class imbalance in its scholarship recipient data. Model performance is evaluated with 10-Fold Cross Validation and a Confusion Matrix. The data consist of 1075 records with 37 features related to the socio-economic factors of scholarship recipients. Applying the CBU method increased the accuracy of the Logistic Regression model from 62.51% to 67.68%. Furthermore, the combination of GA and CBU provided more stable results in classifying the minority class. It is hoped that this research can make a significant contribution to the development of a more accurate and efficient scholarship recipient selection system, as well as serve as a reference for future studies in the fields of data mining and machine learning.
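
The evaluation procedure named in this abstract, 10-fold cross-validation scored from a confusion matrix, can be sketched as below. This is an assumption-laden illustration: the helper names are hypothetical, the folds are contiguous (practical pipelines usually shuffle or stratify first), and the label vectors are made up rather than taken from the study.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# 1075 records, as reported in the abstract, split into 10 folds.
folds = k_fold_indices(1075, k=10)
fold_sizes = [len(test) for _, test in folds]

# Hypothetical labels for one fold (1 = minority class).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)
minority_recall = tp / (tp + fn)
```

In this made-up fold, accuracy is 0.75 while recall on the minority class is only 0.5, which shows why imbalance studies look at per-class results from the confusion matrix as well as overall accuracy.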
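
Penerapan Metode GA-NM Pada Algoritma SVM Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Application of the GA-NM Method to the SVM Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Abror, Irfan Fiqry; Siswa, Taghfirul Yoga Azhima; Rudiman, Rudiman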
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6756

Abstract

Class imbalance is a common challenge in data analysis, especially when the number of instances in the majority class significantly exceeds that in the minority class. This imbalance can cause classification models to favor the majority class, resulting in low accuracy in identifying the minority class. In this study, the Support Vector Machine (SVM) method combined with Near Miss and Genetic Algorithm (GA) is used to address the class imbalance problem in the scholarship recipient data of the Kartu Indonesia Pintar (KIP) program at Universitas Muhammadiyah Kalimantan Timur. The dataset consists of 1,075 records with 27 features representing the socio-economic factors of the scholarship recipients. Near Miss was applied to undersample the majority class, producing a more balanced data distribution. Subsequently, the SVM algorithm was utilized as the primary classification model, with feature selection and parameter optimization conducted using GA. The results indicate that the combination of SVM, Near Miss, and GA improved classification performance in identifying the minority class. The initial accuracy obtained without the method was 60.55% and after implementation it increased to 76.88%. This approach not only enhances the overall accuracy of the model but also ensures more stable performance, particularly for the minority class. Therefore, this study is expected to provide a significant contribution to the development of a more accurate and efficient scholarship selection system, as well as serve as a reference for future research in data mining and machine learning.
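
The abstract does not say which Near Miss variant was used. As one possible reading, the sketch below implements the NearMiss-1 rule: keep the majority samples whose mean distance to their k nearest minority samples is smallest, until both classes are the same size. The function name, the value of k, and the toy data are hypothetical.

```python
import math

def near_miss_1(X, y, majority_label, k=3):
    """Return sorted indices kept after NearMiss-1 undersampling (binary case)."""
    minority_idx = [i for i, label in enumerate(y) if label != majority_label]
    majority_idx = [i for i, label in enumerate(y) if label == majority_label]
    k = min(k, len(minority_idx))

    def mean_dist_to_nearest_minority(i):
        dists = sorted(math.dist(X[i], X[m]) for m in minority_idx)
        return sum(dists[:k]) / k

    # Keep as many majority samples as there are minority samples,
    # preferring those closest (on average) to the minority class.
    kept_majority = sorted(majority_idx, key=mean_dist_to_nearest_minority)
    kept_majority = kept_majority[:len(minority_idx)]
    return sorted(kept_majority + minority_idx)

# Hypothetical toy data: one feature, four majority and two minority samples.
X = [[0.0], [1.0], [2.0], [3.0], [8.0], [9.0]]
y = ["majority"] * 4 + ["minority"] * 2
kept = near_miss_1(X, y, "majority", k=2)
```

With these values the majority samples at 2.0 and 3.0 lie closest to the minority samples at 8.0 and 9.0, so they are retained and the result is a balanced set of four samples.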
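
Penerapan Metode GA-RU Pada Algoritma Random Forest Untuk Mengatasi Class Imbalance Data Beasiswa KIP-Kuliah [Application of the GA-RU Method to the Random Forest Algorithm to Address Class Imbalance in KIP-Kuliah Scholarship Data]
Rahman, Febrian Nor; Siswa, Taghfirul Azhima Yoga; Rudiman, Rudiman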
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6757

Abstract

Class imbalance is a common challenge in data analysis, where the majority class significantly outnumbers the minority class. This condition causes classification models to lean toward predicting the majority class, resulting in low accuracy in identifying the minority class. This study proposes the application of Genetic Algorithm (GA) combined with Random Undersampling (RU) on the Random Forest algorithm to address class imbalance issues in the dataset of Indonesia Smart Card (KIP) scholarship recipients at Universitas Muhammadiyah Kalimantan Timur. The dataset comprises 1,080 records with 37 features related to the socio-economic factors of the scholarship recipients. After data cleaning, 1,075 records were retained. The results indicate that the Random Undersampling method improved the accuracy of the Random Forest model from 84.27% to 85.06%. Although this improvement appears modest, it is significant as it demonstrates increased model stability in classifying the minority class, which previously had low accuracy. The combination of GA and RU proved effective in enhancing model performance, resulting in more stable classification for the minority class. This study is expected to contribute to the development of more accurate and efficient scholarship selection systems and serve as a reference for research in data mining and machine learning.
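
Random Undersampling, as named in this abstract, discards randomly chosen samples from the larger classes until every class matches the smallest one. A minimal sketch, with a hypothetical function name, a fixed seed for repeatability, and made-up data:

```python
import random
from collections import Counter

def random_undersample(X, y, seed=0):
    """Randomly drop samples so every class has as many as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, current in enumerate(y) if current == label]
        # Classes already at the target size are kept whole.
        keep.extend(idx if len(idx) == target else rng.sample(idx, target))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical toy data: eight majority and two minority samples.
X = [[i] for i in range(10)]
y = ["majority"] * 8 + ["minority"] * 2
X_balanced, y_balanced = random_undersample(X, y)
class_counts = Counter(y_balanced)
```

Because the discarded majority samples are chosen at random, useful information can be lost. In a cross-validated setup the resampling is normally applied to the training folds only, so that test folds keep the original class distribution.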
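
Penerapan PSO–RU Dalam Algoritma Naive Bayes Untuk Mengatasi Class Imbalance Data Bencana Tanah Longsor [Application of PSO-RU in the Naive Bayes Algorithm to Address Class Imbalance in Landslide Disaster Data]
Akbar, Zakaria Ihza; Siswa, Taghfirul Azhima Yoga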
Building of Informatics, Technology and Science (BITS) Vol 6 No 4 (2025): March 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i4.6888

Abstract

Landslides are one of the most significant natural disasters in Indonesia, often causing substantial economic losses and threats to human safety. A key challenge in processing landslide data is the issue of class imbalance, where the number of disaster occurrence data is significantly smaller compared to non-disaster data. This study aims to improve landslide prediction accuracy by integrating the Naive Bayes algorithm and Particle Swarm Optimization (PSO) while employing the Random Undersampling (RU) technique to address data imbalance. The dataset used in this study includes landslide data from Samarinda City for the period 2022-2023, obtained from the Regional Disaster Management Agency (BPBD) and the Meteorology, Climatology, and Geophysics Agency (BMKG). The research process involved data preprocessing, balancing data using RU, implementing the Naive Bayes algorithm, and optimizing it with PSO. Model performance was evaluated using the 10-Fold Cross Validation technique and a confusion matrix. The results show that applying the Naive Bayes algorithm with PSO optimization without RU achieved the highest average accuracy of 89.49%, compared to Naive Bayes without optimization, which only reached 87.59%. Meanwhile, the application of RU showed varied effects, with the combination of Naive Bayes + PSO with RU achieving an average accuracy of 50%, slightly better than Naive Bayes with RU, which only reached 45%. This study demonstrates that PSO optimization can improve the performance of the Naive Bayes model in handling complex landslide datasets, although balancing techniques such as RU must be applied cautiously to avoid the loss of important information. The results of this study are expected to support disaster mitigation efforts through more accurate predictions, aiding stakeholders in decision-making, such as early evacuation planning and infrastructure development in landslide-prone areas.
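
The abstract does not describe how PSO was coupled to Naive Bayes, so the sketch below only shows the generic Particle Swarm Optimization update on a stand-in objective. Everything here is an assumption: the function names, the swarm parameters, and the sphere objective, which takes the place of whatever quantity the study optimized (for example, a cross-validated error).

```python
import random

def pso_minimize(objective, dim, bounds=(-5.0, 5.0), n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # each particle's best position so far
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def sphere(p):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in p)

best_position, best_value = pso_minimize(sphere, dim=2)
```

To use the same loop for model tuning, `objective` would map a particle's position (for instance feature weights or a feature mask) to a validation error, which makes each evaluation far more expensive than this toy function.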