Articles

Hybrid Feature Selection and Balancing Data Approach for Improved Software Defect Prediction
Febrian, Muhamad Michael; Saputro, Setyo Wahyu; Saragih, Triando Hamonangan; Abadi, Friska; Herteno, Rudy
Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 7 No. 2 (2025): May
Publisher : Jurusan Teknik Elektromedik, Politeknik Kesehatan Kemenkes Surabaya, Indonesia

DOI: 10.35882/ijeeemi.v7i2.67

Abstract

Software Defect Prediction (SDP) plays a vital role in identifying defects within software modules. Accurate early detection of software defects can reduce development costs and enhance software reliability. However, SDP remains a significant challenge in the software development lifecycle. This study employs Particle Swarm Optimization (PSO) and addresses several challenges associated with its application, including noisy attributes, high-dimensional data, and imbalanced class distribution. To address these challenges, the study proposes a hybrid filter-based feature selection and class balancing method. The feature selection process incorporates Chi-Square (CS), Correlation-Based Feature Selection (CFS), and Correlation Matrix-Based Feature Selection (CMFS), which have been shown to be effective in reducing noisy and redundant attributes. Additionally, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to mitigate class imbalance in the dataset. The K-Nearest Neighbors (KNN) algorithm is employed as the classification model due to its simplicity, non-parametric nature, and suitability for handling the feature subsets produced. Performance is evaluated using the Area Under Curve (AUC) metric, with a significance threshold of 0.05 to assess classification capability. The proposed method achieved an AUC of 0.872, demonstrating its effectiveness in enhancing predictive performance. The proposed method was also superior to other combinations such as PSO SMOTE (0.0043), PSO SMOTE CS (0.0091), PSO SMOTE CFS (0.0111), and PSO SMOTE CFS CMFS (0.0007). The findings show that the proposed method significantly enhances the efficiency and accuracy of PSO in software defect prediction tasks. This hybrid strategy demonstrates strong potential as a robust solution for future research and application in predictive software quality assurance.
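
The class-balancing step named in the abstract above (SMOTE) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_sketch`, the toy points, and the choice of `k=2` are assumptions made for illustration. It shows only the core idea of creating a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (3.0, 3.0)]
new_points = smote_sketch(minority, n_new=6)
```

Every synthetic point lies on a line segment between two existing minority samples, so the minority class grows without copying rows verbatim.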

Enhancing Software Defect Prediction: HHO-Based Wrapper Feature Selection with Ensemble Methods
Fauzan Luthfi, Achmad; Herteno, Rudy; Abadi, Friska; Adi Nugroho, Radityo; Itqan Mazdadi, Muhammad; Athavale, Vijay Anant
Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 7 No. 2 (2025): May
Publisher : Jurusan Teknik Elektromedik, Politeknik Kesehatan Kemenkes Surabaya, Indonesia

DOI: 10.35882/f2140043

Abstract

The growing complexity of data across domains highlights the need for effective classification models capable of addressing issues such as class imbalance and feature redundancy. The NASA MDP dataset poses such challenges due to its diverse characteristics and highly imbalanced classes, which can significantly affect model accuracy. This study proposes a robust classification framework integrating advanced preprocessing, optimization-based feature selection, and ensemble learning techniques to enhance predictive performance. The preprocessing phase involved z-score standardization and robust scaling to normalize data while reducing the impact of outliers. To address class imbalance, the ADASYN technique was employed. Feature selection was performed using Binary Harris Hawk Optimization (BHHO), with K-Nearest Neighbor (KNN) used as an evaluator to determine the most relevant features. Classification models including Random Forest (RF), Support Vector Machine (SVM), and Stacking were evaluated using performance metrics such as accuracy, AUC, precision, recall, and F1-measure. Experimental results indicated that the Stacking model achieved superior performance in several datasets, with the MC1 dataset yielding an accuracy of 0.998 and an AUC of 1.000. However, statistical significance testing revealed that not all observed improvements were meaningful; for example, Stacking significantly outperformed SVM but did not show a significant difference when compared to RF in terms of AUC. This underlines the importance of aligning model choice with dataset characteristics. In conclusion, the integration of advanced preprocessing and metaheuristic optimization contributes positively to software defect prediction. Future research should consider more diverse datasets, alternative optimization techniques, and explainable AI to further enhance model reliability and interpretability.

Functional Evaluation of the Logia Dashboard Using Boundary Value Testing and Cause-Effect Graph Techniques
Ramadhan, Muhammad Rizky Aulia; Abadi, Friska; Nugrahadi, Dodon Turianto; Saputro, Setyo Wahyu; Herteno, Rudy
Telematika Vol 18, No 2: August (2025)
Publisher : Universitas Amikom Purwokerto

DOI: 10.35671/telematika.v18i2.3121

Abstract

The Logia Dashboard is a web-based information system used to manage rehabilitation plant data on post-mining land. As an alpha-stage system, Logia requires thorough functional and performance evaluation to ensure that all input validations, logical processes, and system responses operate correctly before wider implementation. This study aims to evaluate the functional reliability and performance of the Logia Dashboard by applying a combined approach of Boundary Value Testing (BVT) and Cause-Effect Graph (CEG) techniques, supported by performance testing using Google Lighthouse. The research design adopts a black-box testing approach. BVT is applied to validate input boundaries on critical features, including login, data editing, QR code generation, and account creation. Meanwhile, CEG is used to model logical relationships between input conditions and system outputs to generate systematic test cases. A total of 39 optimized functional test cases were executed in a controlled local environment. Performance testing was conducted using Lighthouse by measuring key metrics such as First Contentful Paint (FCP), Largest Contentful Paint (LCP), Total Blocking Time (TBT), and Cumulative Layout Shift (CLS). The functional testing results show that 37 out of 39 test cases passed, yielding a success rate of 94.87%. Two failed cases were identified in the login feature, indicating weaknesses in input validation feedback. Performance testing produced an average Lighthouse score of 97, demonstrating that the system has excellent load speed and interface stability, although minor layout instability was detected on certain pages. These results indicate that the combined application of BVT and CEG is effective for detecting boundary-related and logical input errors in alpha-stage web systems. 
The findings also provide concrete recommendations for improving login validation and interface stability, supporting further development of the Logia Dashboard toward a more reliable and robust system for post-mining land management.
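
As a rough illustration of the Boundary Value Testing idea in the abstract above, the sketch below generates the classic candidates around the edges of a valid range and checks the reported pass rate. The password-length rule (8 to 64 characters) and the function names are hypothetical; the abstract does not state the actual validation rules of the Logia Dashboard.

```python
def boundary_values(minimum, maximum):
    """Classic boundary-value candidates for an integer range [minimum, maximum]."""
    return [minimum - 1, minimum, minimum + 1, maximum - 1, maximum, maximum + 1]


def is_valid_length(value, minimum, maximum):
    """Hypothetical validator: accept lengths inside the closed range."""
    return minimum <= value <= maximum


# Hypothetical rule: a password must be 8 to 64 characters long.
cases = [(n, is_valid_length(n, 8, 64)) for n in boundary_values(8, 64)]

# Pass rate reported in the abstract: 37 of 39 test cases passed.
pass_rate = round(37 / 39 * 100, 2)
```

Here `cases` pairs each boundary input with the expected verdict, and `pass_rate` reproduces the 94.87% figure from the reported counts.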

Comparative Performance Evaluation of Linear, Bagging, and Boosting Models Using BorutaSHAP for Software Defect Prediction on NASA MDP Datasets
Kartika, Najla Putri; Herteno, Rudy; Budiman, Irwan; Nugrahadi, Dodon Turianto; Abadi, Friska; Ahmad, Umar Ali; Faisal, Mohammad Reza
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 6 (2025): JUTIF Volume 6, Number 6, December 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.6.5393

Abstract

Software defect prediction aims to identify potentially defective modules early on in order to improve software reliability and reduce maintenance costs. However, challenges such as high feature dimensions, irrelevant metrics, and class imbalance often reduce the performance of prediction models. This research aims to compare the performance of three classification model groups—linear, bagging, and boosting—combined with the BorutaSHAP feature selection method to improve prediction stability and interpretability. A total of twelve datasets from the NASA Metrics Data Program (MDP) were used as test references. The research stages included data preprocessing, class balancing using the Synthetic Minority Oversampling Technique (SMOTE), feature selection with BorutaSHAP, and model training using five algorithms, namely Logistic Regression, Linear SVC, Random Forest, Extra Trees, and XGBoost. The evaluation was conducted with Stratified 5-Fold Cross-Validation using the F1-score and Area Under the Curve (AUC) metrics. The experimental results showed that tree-based ensemble models provided the most consistent performance, with Extra Trees recording the highest average AUC of 0.794 ± 0.05, followed by Random Forest (0.783 ± 0.06). The XGBoost model provided the best results on the PC4 dataset (AUC = 0.937 ± 0.008), demonstrating its ability to handle complex data patterns. These findings prove that BorutaSHAP is effective in filtering relevant features, improving classification reliability, and strengthening transparency and interpretability in the Explainable Artificial Intelligence (XAI) framework for software quality improvement.
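
The abstract above reports AUC as a mean with a spread across cross-validation folds. The sketch below shows one way such a figure can be computed, using the pairwise-ranking definition of AUC. The three folds and their scores are hypothetical toy data (the study used stratified 5-fold cross-validation on NASA MDP datasets), and the helper name `auc` is an assumption.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count a win as 1, a tie as 0.5, a loss as 0 over all positive/negative pairs.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical per-fold (true labels, predicted scores); 1 = defective module.
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
    ([0, 1, 0, 1], [0.5, 0.5, 0.4, 0.3]),
]
fold_aucs = [auc(y, s) for y, s in folds]
summary = f"{mean(fold_aucs):.3f} ± {stdev(fold_aucs):.3f}"
```

Reporting the spread alongside the mean, as the abstract does, makes it easier to judge whether a difference between two models exceeds fold-to-fold variation.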
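
Multi-Criteria Decision Making in Ensemble Feature Selection for Software Defect Prediction (original title: Multi-Criteria Decision Making dalam Seleksi Fitur Ensemble untuk Prediksi Cacat Perangkat Lunak)
Fikri, Muhammad; Herteno, Rudy; Adi Nugroho, Radityo; Wahyu Saputro, Setyo; Abadi, Friska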
Jurnal Teknologi Informasi dan Ilmu Komputer Vol 12 No 6: December 2025
Publisher : Fakultas Ilmu Komputer, Universitas Brawijaya

DOI: 10.25126/jtiik.2025125

Abstract

Software defect prediction is a strategic effort to improve product quality through early identification of potentially defective modules. Prediction performance is influenced by feature selection, because redundant and irrelevant information can degrade the quality of model learning. Ensemble feature selection is considered effective in selecting relevant features by combining several filter-based feature selection methods. An integration mechanism is needed to unify the results of four filter techniques: Mutual Information, Fisher Score, Uncertainty, and Relief. This study compares four Multi-Criteria Decision Making methods (TOPSIS, VIKOR, EDAS, and WASPAS), which work by ranking the relevance values that the filters assign to each feature. The top ten features from each method are then evaluated using the Random Forest model with the AUC metric through K-Fold cross-validation. Of the 12 NASA MDP datasets tested, TOPSIS showed the most consistent and best performance, with an average AUC of 0.8038. These findings emphasize the importance of choosing the right integration method for improving the accuracy of software defect prediction and provide guidance for the development of more effective models.
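
To make the integration step in the abstract above concrete, here is a minimal TOPSIS sketch that merges several filter scores into one feature ranking. It is an illustration under stated assumptions, not the study's code: the feature names, the score values, and the equal weights are hypothetical, and all four criteria are treated as benefit criteria (higher is better).

```python
import math


def topsis(matrix, weights):
    """Return one closeness score per row; higher means closer to the ideal."""
    n_cols = len(weights)
    # Vector-normalise each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix
    ]
    best = [max(row[j] for row in weighted) for j in range(n_cols)]
    worst = [min(row[j] for row in weighted) for j in range(n_cols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        # Assumes rows are not all identical, so the denominator is non-zero.
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical relevance scores per feature from four filters, in the order
# Mutual Information, Fisher Score, Uncertainty, Relief.
features = ["f_a", "f_b", "f_c"]
filter_scores = [
    [0.9, 0.8, 0.7, 0.9],
    [0.2, 0.1, 0.3, 0.2],
    [0.5, 0.6, 0.4, 0.5],
]
closeness = topsis(filter_scores, weights=[0.25, 0.25, 0.25, 0.25])
ranking = [f for _, f in sorted(zip(closeness, features), reverse=True)]
```

In a pipeline like the one described, the top-ranked features (ten in the study) would then be passed to the classifier for evaluation.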

Comparative Study of Filter, Wrapper, and Hybrid Feature Selection Using Tree-Based Classifiers for Software Defect Prediction
Rahmayanti, Rahmayanti; Herteno, Rudy; Saputro, Setyo Wahyu; Saragih, Triando Hamonangan; Abadi, Friska
Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 8 No. 1 (2026): February
Publisher : Jurusan Teknik Elektromedik, Politeknik Kesehatan Kemenkes Surabaya, Indonesia

DOI: 10.35882/ijeeemi.v8i1.294

Abstract

Software defect prediction (SDP) is essential for improving software reliability by enabling the early identification of modules that may contain defects before the release stage. SDP commonly exhibits redundant or non-contributory metrics, underscoring the need for feature selection to derive a more informative subset. To address this problem, the present study investigates and compares the effectiveness of three feature-selection strategies: SelectKBest (SKB), Recursive Feature Elimination (RFE), and the hybrid SKB+RFE, in enhancing the performance of tree-based classifiers on the NASA Metrics Data Program (MDP) data collections. The study utilizes three classification algorithms, namely Random Forest (RF), Extra Trees (ET), and Bagging (Decision Tree), with Area Under the Curve (AUC) serving as the primary metric for assessing model performance. Experimental results reveal that the RFE and Extra Trees combination yields the top performance, producing an average AUC of 0.7855. This is subsequently followed by the SKB+RFE+ET configuration, which achieves an AUC of 0.7809, and SKB+ET at 0.7776. These findings demonstrate that iterative wrapper-based approaches such as RFE can identify more relevant and effective feature subsets than filter or hybrid strategies, with the RFE+Extra Trees configuration yielding the strongest overall predictive performance and wrapper-based methods exhibiting higher stability across heterogeneous datasets. Even without hyperparameter tuning and relying solely on class-weighting rather than explicit resampling techniques, the findings offer empirical insight into the isolated influence of feature selection on predictive performance. Overall, the study confirms that RFE combined with Extra Trees offers the strongest predictive performance on NASA MDP data collections and forms a foundation for developing more adaptive and robust models.
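
Sentiment Analysis of Social Media Reviews of Culinary MSMEs Using a Lexicon-Based Approach and Specialized Vocabulary (original title: Analisis Sentimen Ulasan Media Sosial UMKM Kuliner dengan Pendekatan Lexicon-Based dan Kosakata Khusus)
Setyo Wahyu Saputro; Friska Abadi; Radityo Adi Nugroho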
Jurnal Informatika Polinema Vol. 12 No. 2 (2026)
Publisher : UPT P2M State Polytechnic of Malang

DOI: 10.33795/jip.v12i2.9302

Abstract

Culinary micro, small, and medium enterprises (MSMEs; UMKM in Indonesian) in South Kalimantan use social media as their main channel for learning customer opinions, but the very large number of comments makes manual review difficult for business owners. This underscores the need for a sentiment analysis approach that can process review data efficiently and that fits the characteristics of the local language. This study aims to develop a lexicon-based sentiment analysis method enriched with culinary domain-specific vocabulary and Banjar-language vocabulary so that classification results are more accurate and contextual. The data were obtained from 3,500 public comments on Instagram and TikTok. Preprocessing comprised case folding, special-character cleaning, tokenization, stopword removal, normalization, and stemming. The InSet Lexicon was then refined by injecting new vocabulary and adjusting word weights to the local culinary context. The analysis shows a sentiment distribution of 2,050 positive comments (58.57%), 934 neutral comments (26.69%), and 516 negative comments (14.74%). Evaluation shows a significant accuracy improvement after lexicon expansion: 93.49% for negative sentiment, 94.64% for neutral, and 96.94% for positive, compared with an initial accuracy of 51–73%. These findings demonstrate that enriching the lexicon with local and domain-specific vocabulary substantially improves sentiment analysis performance. The approach offers MSMEs a practical and affordable way to understand customer opinions more representatively, and it can support strategic decision-making as well as improvements to service quality and culinary product promotion.
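
The lexicon-enrichment idea in the abstract above can be sketched as follows. All lexicon entries, weights, and the sample comment are hypothetical illustrations, not entries taken from the InSet Lexicon or from the study's added vocabulary; the scoring rule (sum of word weights, sign decides the label) is likewise an assumed simplification.

```python
def score_comment(tokens, lexicon):
    """Sum the weights of known tokens; unknown tokens contribute 0."""
    return sum(lexicon.get(tok, 0) for tok in tokens)


def label(score):
    """Map a numeric score to a sentiment label by its sign."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical general-purpose entries (word -> weight).
base_lexicon = {"enak": 4, "mahal": -3}
# Hypothetical local and culinary additions injected into the lexicon.
domain_additions = {"nyaman": 4, "hambar": -3}
enriched_lexicon = {**base_lexicon, **domain_additions}

# Hypothetical preprocessed comment (already tokenized and normalized).
comment = ["nasi", "kuning", "nyaman", "banar"]
before = label(score_comment(comment, base_lexicon))
after = label(score_comment(comment, enriched_lexicon))

# Sentiment distribution reported in the abstract, recomputed from the counts.
counts = {"positive": 2050, "neutral": 934, "negative": 516}
total = sum(counts.values())
shares = {k: round(v / total * 100, 2) for k, v in counts.items()}
```

With the base lexicon the comment contains no known words and falls back to neutral; after enrichment the local word carries weight and the label changes, which is the effect the study measures at scale.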

Comparison Between K-Fold Cross Validation And Percentage Split In Decision Tree Algorithms For Anemia Classification
Rahmawati, Nanda Putri; Irwan Budiman; Muhammad Itqan Mazdadi; Andi Farmadi; Friska Abadi
Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 8 No. 1 (2026): February
Publisher : Jurusan Teknik Elektromedik, Politeknik Kesehatan Kemenkes Surabaya, Indonesia

DOI: 10.35882/ijeeemi.v8i1.315

Abstract

Anemia is a significant global health challenge characterized by a pathological deficit in hemoglobin concentration, often leading to physiological instability. Accurate clinical diagnosis typically relies on complete blood count (CBC) tests, which provide critical hematological parameters for classification. While machine learning models have demonstrated high efficacy in diagnosing anemia, existing research often relies on static data partitioning strategies that may overlook evaluation reliability and performance stability. This study addresses this gap by shifting the focus from architectural benchmarking to validation robustness, specifically evaluating the C4.5 algorithm's performance across different data-splitting techniques. The research uses a dataset comprising 1,281 clinical records with 14 numerical features and 9 anemia-type labels. To assess stability, two distinct partitioning strategies were implemented: a static Percentage Split (ranging from 60:40 to 90:10) and iterative K-Fold Cross Validation (with K values of 3, 5, 7, 10, and 15). Experimental results demonstrate that the C4.5 algorithm achieved its peak performance with the 90:10 Percentage Split, achieving an average accuracy of 99.46%, precision of 98.32%, and recall of 99.28%. In comparison, the K-Fold (K=10) approach yielded a slightly lower but more stable accuracy of 99.19% with a significantly reduced standard deviation (±0.09), highlighting its reliability for clinical applications. While the high-ratio percentage split maximizes training exposure and predictive potential, the K-Fold method provides a more objective, generalizable benchmark by accounting for the entire data distribution. The study further identifies challenges in classifying minority classes, such as Leukemia with thrombocytopenia, due to inherent data scarcity. 
Ultimately, this research confirms that the C4.5 algorithm, when paired with an optimal partitioning protocol, remains a robust and highly interpretable solution for clinical anemia screening, outperforming several complex modern architectures.
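
The two partitioning strategies compared in the abstract above differ in how much of the data is ever used for testing. The sketch below builds both kinds of split over record indices only; the function names, the shuffling seed, and the unstratified fold assignment are assumptions for illustration, and no classifier is trained here.

```python
import random


def percentage_split(n, train_ratio, seed=0):
    """Shuffle indices once and cut them into a single train/test pair."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_ratio)
    return idx[:cut], idx[cut:]


def k_fold(n, k, seed=0):
    """Yield k train/test pairs; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# 1,281 records, as in the dataset described in the abstract.
train_idx, test_idx = percentage_split(1281, 0.9)
splits = list(k_fold(1281, 10))
tested = sorted(j for _, test in splits for j in test)
```

A 90:10 split evaluates on a single held-out subset of 129 records, whereas 10-fold cross-validation evaluates every record once, which is why its averaged result tends to be the more stable estimate.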

The Effect of Smote-Tomek on the Classification of Chronic Diseases Based on Health and Lifestyle Data
Muhammad Adika Riswanda; Friska Abadi; Muhammad Itqan Mazdadi; Mohammad Reza Faisal; Rudy Herteno
Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 8 No. 1 (2026): February
Publisher : Jurusan Teknik Elektromedik, Politeknik Kesehatan Kemenkes Surabaya, Indonesia

DOI: 10.35882/ijeeemi.v8i1.324

Abstract

Machine learning models for chronic disease prediction are often trained on imbalanced healthcare datasets, where non-disease cases dominate. This condition can lead to misleadingly high accuracy while failing to identify patients with chronic diseases, limiting clinical usefulness. This study aims to analyze the impact of class imbalance on model performance and to evaluate the effectiveness of the SMOTE–Tomek resampling technique in improving chronic disease prediction. This research provides empirical evidence that accuracy alone is insufficient for evaluating healthcare models and demonstrates that imbalance-aware preprocessing is essential for valid and reliable chronic disease detection. Five classification models (Support Vector Machine, Random Forest, K-Nearest Neighbors, Gradient Boosting, and XGBoost) were evaluated on a lifestyle-based chronic disease dataset under two conditions: without resampling and with SMOTE–Tomek. Model performance was assessed using accuracy, precision, recall, F1-score, and AUC. Without SMOTE–Tomek, all models failed to detect chronic disease cases, producing near-zero recall and F1-scores despite accuracy exceeding 80%. After applying SMOTE–Tomek, substantial improvements were observed across all models, particularly in recall and AUC. Support Vector Machine achieved the best overall performance, with an accuracy of 92.9%, a precision of 92%, a recall of 93.9%, an F1-score of 0.93, and an AUC of 0.98. The findings confirm that handling class imbalance is a prerequisite for meaningful chronic disease prediction. The consistent increase in recall and AUC across all evaluated models confirms that the improvement stems from enhanced class separability rather than metric inflation. The proposed approach supports more reliable early screening and decision-support systems in preventive healthcare.
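
The abstract's point that accuracy can look high while recall collapses is easy to check with confusion-matrix arithmetic. In the sketch below, the class counts (850 healthy, 150 diseased) and the always-negative classifier are hypothetical; only the precision and recall values used for the F1 check come from the abstract.

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return accuracy, precision, recall, f1


# Hypothetical imbalanced test set: 850 healthy, 150 diseased.
# A model that predicts "healthy" for everyone finds no disease cases.
acc, prec, rec, f1 = metrics(tp=0, fp=0, fn=150, tn=850)

# F1 implied by the precision (92%) and recall (93.9%) reported for SVM.
reported_f1 = 2 * 0.92 * 0.939 / (0.92 + 0.939)
```

The degenerate model reaches 85% accuracy with zero recall and zero F1, matching the failure pattern described for the models without resampling, and the harmonic mean of the reported precision and recall rounds to the stated F1-score of 0.93.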

Empirical Performance of E2E Frameworks in React-Vue SPAs Using DIA
Rezeki, Abdillah; Saputro, Setyo Wahyu; Saragih, Triando Hamonangan; Nugroho, Radityo Adi; Abadi, Friska
International Journal of Advances in Data and Information Systems Vol. 7 No. 1 (2026): April 2026
Publisher : Indonesian Scientific Journal

DOI: 10.59395/ijadis.v7i1.1528

Abstract

Modern web applications increasingly adopt Single-Page Application (SPA) architectures to enhance the user experience through client-side rendering and dynamic content loading. However, these characteristics introduce significant challenges for automated end-to-end (E2E) testing, including asynchronous DOM manipulation, complex state management, and timing synchronization issues. This study presents a comprehensive empirical comparison of three prominent E2E testing frameworks—Selenium WebDriver, Cypress, and Playwright—across React and Vue-based SPAs. Using a quantitative experimental approach, 25 standardized test cases were executed 15 times each across Chrome, Firefox, and Edge, for a total of 270 testing sessions. Performance evaluation focused on four key metrics: execution time, success rate, CPU usage, and memory consumption. Results demonstrate that Playwright achieved the fastest execution time (56.25 seconds on React-Chrome), while Selenium exhibited superior resource efficiency with the lowest memory consumption (196.59 MB on Vue-Chrome). The Distance to Ideal Alternative (DIA) multi-criteria decision analysis method identified Playwright-Chrome as optimal for React applications (DIA score: 0.886715) and Selenium-Chrome for Vue applications (DIA score: 0.908237), indicating that framework selection should be context-dependent based on application characteristics and deployment requirements. This research supports the conclusion that no universal "best" testing framework exists, underscoring the importance of evidence-based, application-specific tool selection in software quality assurance.
Co-Authors A.A. Ketut Agung Cahyawan W AA Sudharmawan, AA Abdullayev, Vugar Achmad Zainudin Nur Adi Mu'Ammar, Rifqi Aflaha, Rahmina Ulfah Ahmad Juhdi Amalia, Raisa Andi Farmadi Andi Farmadi Andi Farmandi Arif, Nuuruddin Hamid Athavale, Vijay Anant Bagaskara Ridho Vandio budiman, irwan Deni Kurnia Dodon Turianto Nugrahadi Dwi Kartini Dwi Kartini, Dwi Emma Andini Fatma Indriani Fauzan Luthfi, Achmad Febrian, Muhamad Michael Halimah Herteno, Rudy Herteno, Rudy Indriani, Fatma Irwan Budiman Irwan Budiman Irwan Budiman Itqan Mazdadi, Muhammad Kartika, Najla Putri M Kevin Warendra Mafazy, Muhammad Meftah Martalisa, Asri Maulana Abdul Rahman Mera Kartika Delimayanti Muhamad Fawwaz Akbar Muhammad Adika Riswanda Muhammad Alkaff Muhammad Alkaff Muhammad Alvin Alfando Muhammad Azmi Adhani Muhammad Denny Ersyadi Rahman Muhammad Fikri Muhammad Haekal Muhammad Itqan Mazdadi Muhammad Khairin Nahwan Muhammad Mirza Hafiz Yudianto Muhammad Nazar Gunawan Muhammad Noor Muhammad Reza Faisal, Muhammad Reza Muhammad Sholih Afif Muliadi Muliadi Muliadi Aziz Muliadi Muliadi Muliadi Muliadi Nabella, Putri Nor Indrani Nugrahadi, Dodon Nurlatifah Amini Nursyifa Azizah Prastya, Septyan Eka Pratama, Muhammad Yoga Adha Putri Nabella Radityo Adi Nugroho Rahman Hadi Rahman Rahmat Budianoor Rahmat Ramadhani Rahmawati, Nanda Putri Rahmayanti Rahmayanti Ramadhan, Muhammad Rizky Aulia Reina Alya Rahma Rezeki, Abdillah Rinaldi Riza Susanto Banner Rizal, Muhammad Nur Rizky Ananda, Muhammad Rizky, Muhammad Hevny Rudy Herteno Rudy Herteno Rudy Herteno SALLY LUTFIANI Saragih, Triando Hamonangan Sarah Monika Nooralifa Sa’diah, Halimatus Septyan Eka Prastya Setyo Wahyu Saputro Siti Fathmah Siti Napi'ah Tri Mulyani Ulya, Azizatul Umar Ali Ahmad Vina Maulida, Vina Wahyu Dwi Styadi Yunida, Rahmi