Journal : Journal of Applied Data Sciences

The Development of Stacking Techniques in Machine Learning for Breast Cancer Detection
Van FC, Lucky Lhaura; Anam, M. Khairul; Bukhori, Saiful; Mahamad, Abd Kadir; Saon, Sharifah; Nyoto, Rebecca La Volla
Journal of Applied Data Sciences Vol 6, No 1: JANUARY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i1.416

Abstract

This study addresses the challenge of accurately detecting breast cancer with machine learning (ML) models, particularly on imbalanced datasets, which often bias a model toward the majority class. To tackle this, the Synthetic Minority Over-sampling Technique (SMOTE) was applied both to balance the class distribution and to improve the model's sensitivity to malignant tumors, which are underrepresented in the dataset. SMOTE generated synthetic samples for the minority class without introducing overfitting, improving the model's generalization to unseen data. Additionally, AdaBoost was employed as a meta-model in the stacking framework, chosen for its ability to focus on misclassified instances during training, thereby boosting the overall performance of the combined base models. The study evaluates several models and combinations. K-Nearest Neighbors (KNN) + SMOTE achieved 97% accuracy, precision, recall, and F1-score. C4.5 + Hyperparameter Tuning + SMOTE reached 95% on all four metrics. The stacking model with Logistic Regression (LR) as the meta-model and SMOTE also achieved 97% accuracy, precision, recall, and F1-score. The best result was obtained with Stacking AdaBoost + Hyperparameter Tuning + SMOTE, which reached an accuracy of 98%. These findings highlight the effectiveness of combining SMOTE with stacking techniques to develop robust predictive models for medical applications. The novelty of this study lies in integrating SMOTE with advanced stacking methods, particularly with AdaBoost and Logistic Regression as meta-models, to address class imbalance in medical datasets. Future work will explore deploying the model in clinical settings for accurate and timely breast cancer detection.
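The oversampling step described in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the study's implementation, which is not given here. The function name `smote`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not the code used in the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 6 majority samples and 3 minority samples.
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.2]]
minority = [[2.0, 2.0], [2.2, 2.1], [2.1, 2.4]]

# Generate enough synthetic minority samples to match the majority count.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 6 6
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the data used for evaluation.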