Optimasi Algoritma Decision Tree Menggunakan GridSearchCV untuk Klasifikasi Tipe Obesitas (Optimizing the Decision Tree Algorithm Using GridSearchCV for Obesity Type Classification) Laurent, Feby; Winarno, Sri; Dewi, Ika Novita
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8638

Abstract

The rise in obesity cases in many countries, including Indonesia, has become a serious public health problem because it increases the risk of chronic disease and affects individuals' psychological well-being. One of the main challenges in obesity management is that obesity types differ between individuals and are influenced by many factors, so accurate classification is needed to target treatment. Machine learning is a potential solution for classifying obesity types, but variation in individual characteristics makes the task complex, and models often struggle to distinguish the types accurately. The Decision Tree algorithm was chosen because its results are easy to interpret. However, a Decision Tree with default parameters tends to overfit on datasets with many attributes and high variation, which lowers accuracy, and its performance depends heavily on hyperparameter settings. This study therefore optimizes the Decision Tree algorithm using GridSearchCV to find the best-performing parameters for obesity type classification. The dataset, from the UCI Machine Learning Repository, consists of 2,111 rows and 17 attributes. In initial testing, the default model achieved 92.58% accuracy, 92.58% recall, 92.66% precision, and a 92.56% F1-score. After optimization, the model achieved 95.69% accuracy, 95.69% recall, 95.72% precision, and a 95.67% F1-score. The 3.11 percentage-point increase in accuracy demonstrates the effectiveness of GridSearchCV in improving Decision Tree performance, resulting in a more accurate and stable prediction model. This research is expected to support decision-making in the early detection, prevention, and treatment of obesity.
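The abstract above tunes a Decision Tree with GridSearchCV. As a rough, library-free illustration of what an exhaustive grid search with k-fold cross-validation does, the sketch below tries every parameter combination and keeps the one with the best mean validation score. It is not the authors' code: the helper names (`k_fold_indices`, `grid_search_cv`, `fit_knn`, `accuracy_knn`), the toy data, and the stand-in 1-D nearest-neighbour model (used here instead of a Decision Tree to keep the example short) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def grid_search_cv(fit, score, X, y, param_grid, k=5):
    """Evaluate every parameter combination with k-fold CV; return the best."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train, test in k_fold_indices(len(X), k):
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            fold_scores.append(
                score(model, [X[i] for i in test], [y[i] for i in test])
            )
        avg = mean(fold_scores)
        if avg > best_score:  # ties keep the earlier combination
            best_params, best_score = params, avg
    return best_params, best_score


# Toy stand-in model (NOT a Decision Tree): 1-D k-nearest-neighbour voting.
def fit_knn(X_train, y_train, n_neighbors):
    return X_train, y_train, n_neighbors


def accuracy_knn(model, X_test, y_test):
    X_train, y_train, n_neighbors = model
    correct = 0
    for x, target in zip(X_test, y_test):
        order = sorted(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
        votes = Counter(y_train[i] for i in order[:n_neighbors])
        correct += votes.most_common(1)[0][0] == target
    return correct / len(X_test)


# Hypothetical, well-separated toy data (not the obesity dataset).
X = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1, 1.3, 4.8]
y = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
best_params, best_score = grid_search_cv(
    fit_knn, accuracy_knn, X, y, {"n_neighbors": [1, 3, 5]}, k=5
)
print(best_params, best_score)
```

With a real library the same loop is hidden behind one call, and the grid would hold Decision Tree settings such as maximum depth or minimum samples per leaf; which settings the study searched is not stated in the abstract.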
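Perbandingan Metode Seleksi Fitur Chi-Square dan Information Gain untuk Peningkatan Interpretabilitas dan Optimasi Kinerja Model TabNet (Comparison of Chi-Square and Information Gain Feature Selection Methods for Improving the Interpretability and Optimizing the Performance of the TabNet Model) Salsabilla, Annisa Ratna; Sani, Ramadhan Rakhmat; Dewi, Ika Novita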
Jurnal Nasional Teknologi dan Sistem Informasi Vol 11 No 3 (2025): Desember 2025
Publisher : Departemen Sistem Informasi, Fakultas Teknologi Informasi, Universitas Andalas

DOI: 10.25077/TEKNOSI.v11i3.2025.253-262

Abstract

Breast cancer is one of the most significant global health issues. Machine learning approaches offer the potential to analyze clinical data accurately and aid early diagnosis. However, conventional machine learning models are often limited in their ability to model complex nonlinear relationships in medical data, which can reduce predictive accuracy. This study employs a deep learning architecture because of its ability to model such relationships. Specifically, the TabNet model was chosen because it is designed for tabular data and offers better interpretability. The study used the public Wisconsin Diagnostic Breast Cancer (WDBC) dataset, which has 30 features and an imbalanced class distribution. Feature selection was applied to handle the high-dimensional data, and SMOTE-ENN was used for class balancing. Two feature selection methods, Chi-Square and Information Gain, were compared to determine the most effective approach. Hyperparameters were optimized using Optuna and validated with stratified k-fold cross-validation. The experimental results show that feature selection and optimization significantly improve performance. The base model with Chi-Square feature selection achieved an accuracy of 64.91%, while the Chi-Square model with Optuna optimization reached 98.25%. This is 3.51 percentage points higher than the 94.74% accuracy of the optimized model without feature selection. In the final comparison, both methods showed distinct advantages: Chi-Square (75% of features) achieved 100% precision and shorter computation time, whereas Information Gain (75% of features) was the only method to achieve 100% recall, which is crucial for minimizing false negatives. These results show that the best method depends on the context: Information Gain is best for maximum diagnostic sensitivity, and Chi-Square is best for balanced performance and efficiency.
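To make the two feature-scoring criteria compared above concrete, here is a minimal sketch that scores a discrete feature against class labels using the Pearson chi-square statistic on the contingency table and using information gain (entropy reduction). The function names and the toy vectors are assumptions for illustration; the paper's actual implementation, any discretization of the WDBC features, and the library it used are not given in the abstract.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus the entropy left after splitting on feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature-by-label contingency table."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, fc in feature_counts.items():
        for y, yc in label_counts.items():
            expected = fc * yc / n  # count expected under independence
            stat += (joint[(f, y)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: one feature that determines the label, one that does not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [0, 0, 0, 0, 1, 1, 1, 1]
uninformative = [0, 1, 0, 1, 0, 1, 0, 1]

for name, feat in [("informative", informative), ("uninformative", uninformative)]:
    print(name, chi_square(feat, labels), information_gain(feat, labels))
```

Both scores rank the informative feature above the uninformative one here; on real data the two rankings can differ, which is why a comparison like the one in the abstract is meaningful.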
Comparative Evaluation of Machine Learning Algorithms with Data Balancing Approach and Hyperparameter Tuning in Predicting Thyroid Disorder Recurrence Darnell Ignasius; Rhyan David Levandra; Ramadhan Rakhmat Sani; Ika Novita Dewi
Jurnal Masyarakat Informatika Vol 16, No 2 (2025): November 2025
Publisher : Department of Informatics, Universitas Diponegoro

DOI: 10.14710/jmasif.16.2.75073

Abstract

This research evaluates and compares the performance of five machine learning algorithms (Logistic Regression, K-Nearest Neighbors, Decision Tree, Random Forest, and Gradient Boosting) in predicting thyroid disease recurrence using patient data. The analysis was conducted on the Thyroid Disease Dataset from the UCI Machine Learning Repository. The methodology includes data preprocessing, normalization, and class balancing with the Synthetic Minority Over-sampling Technique (SMOTE). Additionally, hyperparameter tuning was conducted using GridSearchCV to optimize model performance. The results demonstrate that ensemble-based models, specifically Random Forest and Gradient Boosting, consistently outperform the other algorithms in terms of accuracy and robustness. These models achieve 95–96% accuracy across various scenarios. A key finding is that SMOTE significantly improves recall for minority classes, highlighting its value in imbalanced medical datasets.
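The core idea of SMOTE, as used in the abstract above, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows only that interpolation step under simplifying assumptions: the function name `smote_sketch`, its parameters, and the toy points are hypothetical, and a production library handles many details (feature types, sampling ratios, efficient neighbour search) that are omitted here.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority points.

    minority: list of equal-length numeric tuples (at least two points).
    k: how many nearest minority neighbours to choose from.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base point (squared Euclidean).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class points with two features each.
minority = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3)]
synthetic = smote_sketch(minority, n_new=6)
print(synthetic)
```

Because every synthetic point lies on a segment between two real minority points, oversampling is normally applied to the training split only, so that no interpolated copy of test information leaks into training.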
Optimalisasi Perilaku Hidup Bersih dan Sehat Melalui Aplikasi Kesehatan di SMP Ibu Kartini (Optimizing Clean and Healthy Living Behavior Through a Health Application at SMP Ibu Kartini) Subhiyakto, Egia Rosi; Rakasiwi, Sindhu; Dewi, Ika Novita; Zeniarja, Junta; Octaviani, Dhita Aulia; Salam, Abu; Fitriyani, Shelomita; Safira, Almira Zuhrotus
ABDIMASKU : JURNAL PENGABDIAN MASYARAKAT Vol 9, No 1 (2026): JANUARI 2026
Publisher : LPPM UNIVERSITAS DIAN NUSWANTORO

DOI: 10.62411/ja.v9i1.3229

Abstract

The Clean and Healthy Living Behavior program (Perilaku Hidup Bersih dan Sehat, PHBS) is an important effort to encourage healthy lifestyles in order to maintain, care for, and improve health. Adopting a healthy lifestyle can prevent various diseases that may arise in the community. PHBS is best introduced at school age, because children are vulnerable to health problems caused by various factors. Technological developments in education have been shown to make classroom interaction and learning more effective, efficient, and accessible, and to support the development of skills needed in the digital era, both now and in the future. Digital applications, as a product of these developments, are widely used in health and education, two fields that are closely related and mutually supportive. Delivering health information requires education, while education cannot run optimally without a healthy environment. Technology is therefore crucial in both fields. On this basis, students need to be given knowledge about PHBS. Beyond theoretical understanding, the students (santri) also need guidance in practicing PHBS directly, supported by a digital application that makes learning more engaging and effective. Before the application is deployed, the Islamic boarding school (pondok pesantren) caretakers need socialization and training in its use. With these considerations, the team initiated a community service activity on the theme of PHBS Mentoring for Students through the Socialization of a Digital Application, held at SMP Ibu Kartini. The activity is expected to build PHBS habits in the students' daily lives and to encourage them to pass these positive behaviors on to those around them.
Exploring Public Opinion on the 'Makan Bergizi Gratis' Program on X: A Comparative Analysis of IndoBERT-Large and NusaBERT-Large Models Arunia, Aurelya Prameswari; Sani, Ramadhan Rakhmat; Dewi, Ika Novita; Sulistyono, MY Teguh
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.11757

Abstract

Program Makan Bergizi Gratis (MBG) has triggered extensive discourse on social media platform X, which serves as a primary space for public expression of opinions toward government policies. This study aims to analyze public sentiment toward the MBG program while simultaneously comparing the performance of two prominent Transformer-based models, namely IndoBERT-Large and NusaBERT-Large. This research adopts a quantitative approach employing supervised learning on 10,201 Indonesian-language posts (tweets) collected through web scraping from February 2024 to September 2025. A total of 2,000 samples were manually annotated as ground truth, achieving a high level of inter-annotator reliability (Cohen’s Kappa, κ = 0.81). The experimental results indicate that IndoBERT-Large outperforms NusaBERT-Large, achieving an accuracy of 83.00%, while NusaBERT-Large demonstrates competitive performance with an accuracy of 80.50%. Substantively, public discourse is dominated by negative sentiment, accounting for nearly 50% of the total data, reflecting public concerns regarding budgetary constraints and technical implementation issues. Positive sentiment ranges between 33% and 36%, indicating sustained and substantial public support for the program. These findings confirm the effectiveness of Transformer-based models in accurately capturing the dynamics of public opinion toward government policies using social media data.
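The inter-annotator reliability figure quoted above is Cohen's kappa, which corrects the raw agreement rate between two annotators for the agreement expected by chance. A minimal sketch of the computation follows; the function name and the four example labels are assumptions for illustration and are unrelated to the study's 2,000 annotated posts.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    # Undefined (division by zero) if both raters always use one identical label.
    return (observed - expected) / (1 - expected)


# Hypothetical sentiment annotations from two annotators.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(kappa)  # observed 0.75, chance 0.5 -> kappa 0.5
```

Here the annotators agree on 75% of items, but half of that agreement is expected by chance given their label frequencies, so kappa is 0.5 rather than 0.75.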
Optimizing Bankruptcy Prediction on Imbalanced Data using XGBoost with Random Oversampling and Chi-Square Suyatno, Revalina; Udayanti, Erika Devi; Dewi, Ika Novita
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.11841

Abstract

In the midst of modern financial dynamics, the ability to predict corporate bankruptcy holds strategic significance, as it directly affects economic stability and investor confidence. However, the development of a reliable predictive model is often hindered by the complex nature of financial data, particularly the class imbalance between bankrupt and non-bankrupt companies. This imbalance causes models to become biased toward the majority class, thereby reducing their sensitivity in detecting bankruptcy cases which are, in fact, the most critical for financial decision-making. This research aims to construct a more balanced and sensitive bankruptcy prediction model by specifically addressing the issue of data imbalance. The proposed approach integrates the Random Oversampling (ROS) technique to equalize class distribution, Chi-Square feature selection to identify the most informative financial variables, and the Extreme Gradient Boosting (XGBoost) algorithm as the core predictive model. The dataset used is the UCI Taiwanese Bankruptcy Prediction dataset, consisting of 6,819 observations and 96 financial ratio variables. Experimental results show that the Chi-Square method successfully identified 20 influential variables, including Per Share Net Profit Before, Debt Ratio, and ROA(B) Before Interest and Depreciation After Tax. The proposed XGBoost model achieved an overall accuracy of 0.9648 and an F1-score of 0.4286, demonstrating superior performance. These findings confirm that the combination of ROS, Chi-Square, and XGBoost effectively enhances data balance and prediction sensitivity for the bankruptcy class. This research is expected to serve as a foundation for developing financial decision-support systems capable of providing early warnings of potential corporate bankruptcy.
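The abstract above reports a high overall accuracy alongside a much lower F1-score. On heavily imbalanced data the two can diverge sharply, because accuracy is dominated by the majority class while F1 looks only at the positive (here, bankrupt) class. The sketch below computes both from confusion-matrix counts. The counts are hypothetical and chosen only to illustrate the effect; they are not the paper's confusion matrix, which the abstract does not give.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 9 positive cases among 200, most negatives classified correctly.
metrics = classification_metrics(tp=5, fp=6, fn=4, tn=185)
print(metrics)  # accuracy 0.95, F1 0.5
```

In this made-up example the model is right on 95% of all cases yet finds only 5 of the 9 positive cases and raises 6 false alarms, which is why the minority-class F1-score is the more informative number for imbalanced problems.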
Hybrid Rainfall Analysis in Semarang by Integrating SARIMA Predictions with Meteorological Association Rules Agustin, Kristina; Novita Dewi, Ika
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.12013

Abstract

Climate variability necessitates advanced analytical approaches to understand irregular rainfall patterns, particularly in coastal cities like Semarang, Central Java. This research employs a dual-analysis framework combining the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the Apriori algorithm to forecast rainfall and uncover hidden meteorological associations. Analyzing BMKG monthly climatological data from January 2020 to December 2024, the research addresses both temporal trends and variable dependencies. The SARIMA(1,0,0)(2,1,0)_12 model projected rainfall dynamics for 2025, identifying critical wet periods (January-March, November-December) and dry intervals (July-September), achieving a MAPE of 44.97%. To complement temporal forecasting, the Apriori algorithm was applied with 50% minimum support and 50% confidence, generating association rules from daily discretized meteorological data. Results reveal that the combination of low temperature (Tx_Low, Tn_Low) and moderate wind speed (FFx_Medium) exhibits the strongest association with heavy rainfall events (lift ratio 12.34), indicating a 12-fold increased risk compared to random conditions. By synergizing temporal forecasting with the identification of meteorological triggers, this research offers a robust basis for early warning systems, supporting flood mitigation and water resource management strategies in Semarang.
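The association rules above are ranked by support, confidence, and lift. As a reminder of how these three measures relate, the sketch below computes them for a single rule over a list of transactions. The function names and the four toy transactions are assumptions for illustration; only the item label `Tx_Low` appears in the abstract, while `Tx_High` and `Rain_Heavy` are hypothetical labels, and the data bear no relation to the BMKG records.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    supp_both = support(transactions, set(antecedent) | set(consequent))
    confidence = supp_both / support(transactions, antecedent)
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / support(transactions, consequent)
    return supp_both, confidence, lift


# Hypothetical discretized daily records, one set of labels per day.
days = [
    {"Tx_Low", "Rain_Heavy"},
    {"Tx_Low", "Rain_Heavy"},
    {"Tx_High"},
    {"Tx_High"},
]
supp, conf, lift = rule_metrics(days, {"Tx_Low"}, {"Rain_Heavy"})
print(supp, conf, lift)  # 0.5 1.0 2.0
```

A lift of 1 means the antecedent and consequent occur together no more often than expected if they were independent; values above 1 mean the consequent is that many times more frequent when the antecedent holds.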
Co-Authors Abas Setiawan Abdul Syukur Abdul Syukur Abu Salam Adhitya Nugraha Adriani, Mira Riezky Agung Priyo Utomo, Rino Agustin, Kristina Alzami, Farrikh Ardytha Luthfiarta Arifin, Muhammad Farhan Arry Maulana Syarif, Arry Maulana Arunia, Aurelya Prameswari Asih Rohmani, Asih Atha Rohmatullah, Fawwaz Ayuningsih, Dewi Putri Azhari Azhari Bramantyo, Satrio Bisma Candra Irawan Catur Supriyanto Darnell Ignasius Diana Aqmala Dwi Puji Prabowo, Dwi Puji Dzaki, Azmi Abiyyu Egia Rosi Subhiyakto, Egia Rosi Erika Devi Udayanti Erwin Yudi Hidayat Erwin Yudi Hidayat Fahri Firdausillah Fajar Agung Nugroho Fitriyani, Shelomita Hafiizhudin, Lutfi Azis Handayani, Sri Haresta, Alif Agsakli Hasan Asari Heribertus Himawan Ifan Rizqa Indrayani, Heni Irawan, Enrico Irvan Muzakkir Irvan Muzakkir Isworo, Slamet Junta Zeniarja Khafiizh Hastuti Khariroh, Shofiyatul Kurniawan, Defri Laurent, Feby Lisa Mardiana Marjuni, Aris Megantara, Rama Aria Muljono Muljono Mumtaz, Najma Amira MY. Teguh Sulistyono Norman, Maria Bernadette Chayeenee Octaviani, Dhita Aulia Priyo Utomo, Rino Agung Puri Sulistiyawati Pusung, Elvanro Marthen Ramadhan Rakhmat Sani Reza, Ivan Muhammad Rhyan David Levandra Ricardus Anggi Pramunendar Rifamuthia, Titis Ritzkal, Ritzkal Safira, Almira Zuhrotus Salsabilla, Annisa Ratna Saputra, Filmada Ocky Sholikun, Sholikun Sindhu Rakasiwi Sri Winarno Subowo, Moh Hadi Sulistyono, Teguh Suyatno, Revalina Syarifah, Ulima Muna Utomo, Danang Wahyu Wellia Shinta Sari Wibowo, Isro' Rizky Yanuaresta, Dianna Zainal Arifin Hasibuan