Optimasi Algoritma Decision Tree Menggunakan GridSearchCV untuk Klasifikasi Tipe Obesitas [Optimizing the Decision Tree Algorithm Using GridSearchCV for Obesity Type Classification]
Laurent, Feby; Winarno, Sri; Dewi, Ika Novita
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher: Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8638

Abstract

Rising obesity rates in many countries, including Indonesia, are a serious public health problem: obesity increases the risk of chronic disease and affects individuals' psychological well-being. A main challenge in managing obesity is that its type differs between individuals and is influenced by many factors, so accurate classification is needed to target treatment. Machine learning is a potential solution for classifying obesity types, but variation in individual characteristics makes the task complex, and models often struggle to distinguish the types accurately. The Decision Tree algorithm was chosen because its results are easy to interpret. With default parameters, however, a Decision Tree applied to datasets with many attributes and high variation tends to overfit, which reduces accuracy. Because Decision Tree performance depends heavily on hyperparameter settings, this study optimizes the algorithm with GridSearchCV to find the best-performing parameters for obesity type classification. The dataset, from the UCI Machine Learning Repository, consists of 2,111 rows and 17 attributes. In the initial test, the default model achieved 92.58% accuracy, 92.58% recall, 92.66% precision, and a 92.56% F1-score. After optimization, the model achieved 95.69% accuracy, 95.69% recall, 95.72% precision, and a 95.67% F1-score. The accuracy gain of about 3.1 percentage points demonstrates the effectiveness of GridSearchCV in improving Decision Tree performance, resulting in a more accurate and stable prediction model. This research is expected to serve as a basis for decision-making in the early detection, prevention, and treatment of obesity.
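
The tuning procedure this abstract describes, an exhaustive search over a hyperparameter grid scored by k-fold cross-validation, can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the authors' code: the tiny tree learner, the synthetic three-class data, the grid values, and every function name below are hypothetical stand-ins for scikit-learn's `DecisionTreeClassifier` and `GridSearchCV`, and the obesity dataset is not used.

```python
import itertools
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, max_depth, min_samples_split, depth=0):
    """Greedy Gini-based tree. Leaves are class labels, nodes are tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(rows) < min_samples_split or len(set(labels)) == 1:
        return majority
    best = None  # (weighted impurity, feature index, threshold)
    for f in range(len(rows[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # all rows identical: nothing to split on
        return majority
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in li], [labels[i] for i in li],
                       max_depth, min_samples_split, depth + 1),
            build_tree([rows[i] for i in ri], [labels[i] for i in ri],
                       max_depth, min_samples_split, depth + 1))

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def cv_accuracy(rows, labels, params, k=3, seed=0):
    """Mean held-out accuracy over k shuffled folds."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        tree = build_tree([rows[i] for i in train], [labels[i] for i in train], **params)
        scores.append(sum(predict(tree, rows[i]) == labels[i] for i in fold) / len(fold))
    return sum(scores) / k

def grid_search(rows, labels, grid, k=3):
    """Score every combination in the grid and return the best one plus all results."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        results.append((cv_accuracy(rows, labels, params, k), params))
    return max(results, key=lambda r: r[0]), results

# Synthetic data (an assumption): the class depends on feature 0, feature 1 is noise.
rng = random.Random(42)
rows, labels = [], []
for _ in range(90):
    x0, x1 = rng.random(), rng.random()
    y = 0 if x0 < 1 / 3 else (1 if x0 < 2 / 3 else 2)
    if rng.random() < 0.1:
        y = rng.randrange(3)  # label noise
    rows.append((x0, x1))
    labels.append(y)

grid = {"max_depth": [1, 2, 3, 5], "min_samples_split": [2, 10, 30]}  # hypothetical grid
(best_score, best_params), results = grid_search(rows, labels, grid)
print(best_params, round(best_score, 3))
```

Each candidate is scored only on data it was not trained on, which is what lets the search prefer settings that overfit less than the defaults.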
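
Perbandingan Metode Seleksi Fitur Chi-Square dan Information Gain untuk Peningkatan Interpretabilitas dan Optimasi Kinerja Model TabNet [Comparison of Chi-Square and Information Gain Feature Selection Methods for Improving the Interpretability and Optimizing the Performance of the TabNet Model]
Salsabilla, Annisa Ratna; Sani, Ramadhan Rakhmat; Dewi, Ika Novita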
Jurnal Nasional Teknologi dan Sistem Informasi Vol 11 No 3 (2025): Desember 2025
Publisher: Departemen Sistem Informasi, Fakultas Teknologi Informasi, Universitas Andalas

DOI: 10.25077/TEKNOSI.v11i3.2025.253-262

Abstract

Breast cancer is one of the most significant global health issues. Machine learning approaches offer the potential to analyze clinical data accurately and aid early diagnosis. However, conventional machine learning models are often limited in their ability to model complex nonlinear relationships in medical data, which can reduce predictive accuracy. This study employs a deep learning architecture because of its ability to model such relationships. Specifically, the TabNet model was chosen because it is designed for tabular data and offers better interpretability. The study used the public Wisconsin Diagnostic Breast Cancer (WDBC) dataset, which has 30 features and an imbalanced class distribution. Feature selection was applied to handle the high-dimensional data, and SMOTE-ENN was used for class balancing. Two feature selection methods, Chi-Square and Information Gain, were compared to determine the more effective approach. Hyperparameters were optimized with Optuna and validated with stratified k-fold cross-validation. The experimental results show that feature selection and optimization significantly improve performance. The base model with Chi-Square feature selection achieved an accuracy of 64.91%, while the Chi-Square model with Optuna optimization reached 98.25%. This is 3.51 percentage points higher than the 94.74% accuracy of the optimized model without feature selection. In the final comparison, each method showed a distinct advantage: Chi-Square (75% of features) achieved 100% precision with more efficient computation time, whereas Information Gain (75% of features) was the only method to achieve 100% recall, which is crucial for minimizing false negatives. These results show that the best method depends on the context: Information Gain is best for maximum diagnostic sensitivity, and Chi-Square is best for balanced performance and efficiency.
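
The two feature-scoring criteria compared in this abstract can be illustrated on categorical data with a short standard-library sketch. It assumes features that are already discretized, uses a made-up eight-sample toy table, and keeps the top 75% of features to mirror the setting the abstract mentions. All names are hypothetical, and TabNet, SMOTE-ENN, and Optuna are deliberately left out, so this is not the authors' pipeline.

```python
import math
from collections import Counter

def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature/label contingency table."""
    n = len(labels)
    f_counts, y_counts = Counter(feature), Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, fc in f_counts.items():
        for y, yc in y_counts.items():
            expected = fc * yc / n  # count expected if feature and label were independent
            observed = joint.get((f, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

def entropy(values):
    """Shannon entropy in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(feature, labels):
    """H(labels) minus H(labels | feature)."""
    n = len(labels)
    conditional = 0.0
    for f, fc in Counter(feature).items():
        subset = [y for x, y in zip(feature, labels) if x == f]
        conditional += (fc / n) * entropy(subset)
    return entropy(labels) - conditional

def select_top(columns, labels, score_fn, keep_fraction=0.75):
    """Rank features by score and keep the top fraction (at least one)."""
    scores = {name: score_fn(col, labels) for name, col in columns.items()}
    k = max(1, math.ceil(keep_fraction * len(columns)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Toy data (an assumption): four binary features, eight samples, two classes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "perfect": [0, 0, 0, 0, 1, 1, 1, 1],  # identical to the label
    "strong":  [0, 0, 0, 1, 1, 1, 1, 1],  # one disagreement
    "weak":    [0, 0, 1, 1, 0, 1, 1, 1],  # mostly uninformative
    "noise":   [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}

chi_selected, chi_scores = select_top(columns, labels, chi_square)
ig_selected, ig_scores = select_top(columns, labels, information_gain)
print("chi-square keeps:", chi_selected)
print("information gain keeps:", ig_selected)
```

On this toy table both criteria produce the same ranking and drop only the independent feature. On real data the two rankings can differ, which is consistent with the different precision and recall profiles reported in the abstract.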