Articles

Implementation of Random Forest and Extreme Gradient Boosting in the Classification of Heart Disease using Particle Swarm Optimization Feature Selection
Ansyari, Muhammad Ridho; Mazdadi, Muhammad Itqan; Indriani, Fatma; Kartini, Dwi; Saragih, Triando Hamonangan
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 5 No 4 (2023): October
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v5i4.322

Abstract

Heart disease is the leading cause of death worldwide. Based on available data, over 36 million people have died from non-communicable diseases, a category that includes heart disease. This research uses a heart disease dataset from the UCI Repository consisting of 303 instances and 14 categorical features. The data were analyzed using the classification methods XGBoost (Extreme Gradient Boosting) and Random Forest, combined with PSO (Particle Swarm Optimization) as a feature selection technique to address irrelevant features, which can degrade prediction performance on the heart disease dataset. Without feature selection, the AUC was 0.877 for the XGBoost model and 0.874 for the Random Forest model. With PSO feature selection, the AUC values were 0.913 for XGBoost and 0.918 for Random Forest. These results demonstrate that PSO can improve the AUC of heart disease prediction. This research therefore contributes to more precise and efficient processing of heart disease patient data, which benefits diagnosis in terms of both speed and accuracy.

Application Of SMOTE To Address Class Imbalance In Diabetes Disease Classification Utilizing C5.0, Random Forest, And SVM
M. Khairul Rezki; Mazdadi, Muhammad Itqan; Indriani, Fatma; Muliadi, Muliadi; Saragih, Triando Hamonangan; Athavale, Vijay Annant
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 4 (2024): October
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i4.434

Abstract

Applying SMOTE to address class imbalance in classification frequently produces suboptimal results, owing to the complexity of the dataset and the large number of attributes involved. Alternative classification models were therefore compared experimentally to gauge their precision. This research compares the precision of the C5.0, Random Forest, and SVM classification models with and without SMOTE. The methodology covers dataset selection, an overview of the classification algorithms (C5.0, Random Forest, SVM), the SMOTE technique, split validation, preprocessing with min-max normalization, and performance evaluation using confusion matrices and AUC analysis. The diabetes dataset was sourced from Kaggle and consists of 768 instances: 268 diabetic and 500 non-diabetic. Before SMOTE, the classification precision of C5.0, Random Forest, and SVM was 0.714, 0.733, and 0.746 respectively, with corresponding AUC values of 0.745, 0.824, and 0.799. After SMOTE, the precision values for the same models were 0.603, 0.727, and 0.727, with AUC values of 0.734, 0.831, and 0.794 respectively. SMOTE therefore had minimal impact across the three classification models. A possible cause is overfitting: excessive reliance on synthesized minority-class data may explain the lower precision of all three models and the lower AUC of two of them.
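
The interpolation step at the heart of SMOTE can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `smote_like_samples`, the toy points, and the choice of `k` are hypothetical, and the count of 232 assumes the full dataset is balanced (with split validation, oversampling would normally be applied to the training portion only).

```python
import random

def smote_like_samples(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Class counts reported in the abstract.
n_diabetic, n_non_diabetic = 268, 500
n_needed = n_non_diabetic - n_diabetic  # synthetic samples needed for a 1:1 balance

# Toy, made-up minority points (assumed already min-max normalized).
toy_minority = [[0.1, 0.2], [0.15, 0.25], [0.2, 0.1], [0.3, 0.3]]
new_points = smote_like_samples(toy_minority, n_new=3, k=2)
print(n_needed, new_points)
```

Because every synthetic point lies on a segment between two existing minority points, heavy oversampling adds no information beyond the original minority samples, which is one way the overfitting described in the abstract could arise.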

Implementation of Extreme Learning Machine Method with Particle Swarm Optimization to Classify of Chronic Kidney Disease
Muhammad Mursyidan Amini; Mazdadi, Muhammad Itqan; Muliadi, Muliadi; Faisal, Mohammad Reza; Saragih, Triando Hamonangan
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 4 (2024): October
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i4.561

Abstract

Chronic Kidney Disease (CKD) is a pathological condition that can arise from kidney infection and from blockages caused by kidney stones. In Indonesia, kidney disease is the second most common disease after heart disease according to BPJS Health data. Medical practitioners and specialists still face challenges in classifying CKD cases effectively, which leaves them vulnerable to diagnostic errors. This research aims to increase the accuracy of CKD classification by incorporating Particle Swarm Optimization (PSO) into the Extreme Learning Machine (ELM), using PSO to optimize the configuration of input weights and biases. The best configuration found used 11 hidden nodes, a population size of 80, 20 iterations, an inertia weight of 0.5, and acceleration constants c1 and c2 both set to 1, reaching an accuracy of 98.50%. These results support the claim that PSO optimization within the ELM framework can yield substantial improvements in CKD classification.
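
The PSO settings reported above (population 80, 20 iterations, inertia weight 0.5, c1 = c2 = 1) map directly onto the standard velocity and position update. The sketch below is an illustration only: it minimizes a toy sphere function, whereas in the study the objective would presumably be the ELM's classification error as a function of its input weights and biases. The names `pso_minimize` and `sphere` are hypothetical.

```python
import random

def pso_minimize(objective, dim, n_particles=80, n_iter=20,
                 w=0.5, c1=1.0, c2=1.0, seed=0):
    """Plain global-best PSO; defaults mirror the settings in the abstract."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective standing in for a model's validation error."""
    return sum(v * v for v in x)

best, best_val = pso_minimize(sphere, dim=3)
print(best, best_val)
```

To tune an ELM this way, each particle would encode one candidate set of input weights and biases, and `objective` would train the output layer and return the resulting error.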

The Impactness of SMOTE as Imbalance Class Handling for Myocardial Infarction Complication Classification using Machine Learning Approach with Data Imputation and Hyperparameter
Ahmad Tajali; Saragih, Triando Hamonangan; Mazdadi, Muhammad Itqan; Budiman, Irwan; Farmadi, Andi
Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 6 No. 4 (2024): November
Publisher : Jurusan Teknik Elektromedik, Politeknik Kesehatan Kemenkes Surabaya, Indonesia

DOI: 10.35882/ijeeemi.v6i4.13

Abstract

Myocardial Infarction (MI) is a critical medical emergency characterized by the sudden blockage of blood flow to the heart muscle, often resulting from a blood clot in a coronary artery that has been narrowed by atherosclerotic plaque buildup. This condition demands immediate attention, as prolonged disruption of blood supply can cause irreversible damage to the heart muscle. Diagnosing MI typically involves a combination of methods, including a physical examination, electrocardiogram (ECG) analysis, blood tests to measure heart-specific enzymes, and imaging techniques such as coronary angiography. Early prediction of potential MI complications is crucial to prevent severe outcomes and improve patient prognosis. This study focuses on the early prediction of MI complications through the application of machine learning classification methods. We employed algorithms such as Support Vector Machine (SVM), Random Forest, and XGBoost to analyze patient medical records and accurately predict these complications. The selection of Support Vector Machine (SVM), Random Forest, and XGBoost in this study is driven by their proven effectiveness in handling complex classification problems. To manage incomplete datasets and preserve valuable information, data imputation techniques like K-Nearest Neighbors (KNN) Imputation, Iterative Imputation, and MissForest were applied. KNN, Iterative, and MissForest imputations were chosen to handle missing data due to their effectiveness in preserving data integrity, which is crucial for accurate predictions in myocardial infarction complication studies. Additionally, Bayesian Optimization was utilized to fine-tune the hyperparameters of the models, thereby enhancing their predictive accuracy. The Iterative Imputation method yielded the best performance, particularly in SVM and XGBoost algorithms. SVM achieved 100% accuracy, precision, sensitivity, F1 score, and Area Under the Curve (AUC), while XGBoost attained 99.4% accuracy, 100% precision, 79.6% sensitivity, an F1 score of 88.7%, and an AUC of 0.898. While XGBoost and MissForest proved to be the most successful pairing, the overall effectiveness of the models suggests that Iterative Imputation and Random Forest also have potential under certain conditions.
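
The F1 score is the harmonic mean of precision and sensitivity (recall), so the XGBoost figures above can be cross-checked with a short calculation. This is only a sanity check on the rounded values quoted in the abstract; the helper name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Rounded XGBoost figures quoted in the abstract.
f1 = f1_from(precision=1.0, recall=0.796)
print(f"F1 from rounded inputs: {f1:.4f}")
```

The result is roughly 0.886, which is close to the reported 88.7%; the small gap is consistent with the sensitivity having been rounded to one decimal place before publication.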

Applying XGBoost-ADASYN in the Classification Process of Bank Customers Who Will Take Time Deposits
Abdilah, Muhammad Fariz Fata; Mazdadi, Muhammad Itqan; Farmadi, Andi; Muliadi, Muliadi; Indriani, Fatma; Rozaq, Hasri Akbar Awal; Yıldız, Oktay
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.551

Abstract

Investment in the form of time deposits at banks offers stable returns. Identifying and attracting potential customers, however, poses challenges. This research enhances the predictive capabilities of deposit classification models by addressing data imbalance with a combination of XGBoost, ADASYN, and Random Search optimization techniques. The integration of ADASYN improves minority class representation, while Random Search efficiently optimizes model parameters. Our findings show a significant accuracy of 94.93%, benchmarked against baseline models, highlighting our method's effectiveness compared to traditional approaches. This hybrid model advances customer data analysis and achieves our research objectives. We discuss the integration challenges, including computational demands and technique selection. The research underscores the application of machine learning to address financial industry issues, emphasizing the impact of data preprocessing and feature engineering on performance. Future studies might explore AutoML to reduce complexity further and enhance model scalability, promising more innovation in customer data analysis.
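
Pengembangan Sistem Manajemen Sarana Dan Prasarana, IT, Serta Laboratorium Di SMK Telekomunikasi [Development of a Facilities and Infrastructure, IT, and Laboratory Management System at SMK Telekomunikasi]
Putri Nabella; Rudy Herteno; Setyo Wahyu Saputro; Friska Abadi; Muhammad Itqan Mazdadi; Nabella, Putri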
Jurnal Teknologi Informasi dan Ilmu Komputer Vol 12 No 1: Februari 2025
Publisher : Fakultas Ilmu Komputer, Universitas Brawijaya

DOI: 10.25126/jtiik.2025128649

Abstract

The Facilities and Infrastructure, IT, and Laboratory Department at SMK Telekomunikasi faces challenges in managing data scattered across various Microsoft Excel files, resulting in difficulties in compiling reports for audits and certifications. This research aims to develop an integrated management system using the CodeIgniter 4 framework, PHP, and MySQL, employing the Rational Unified Process (RUP) methodology and Unified Modelling Language (UML) design. The system is designed to streamline data management and facilitate efficient information presentation. Black box testing showed a success rate of 100%, while user acceptance testing scored 92%, an excellent rating. The implementation of this system is expected to enhance the efficiency and effectiveness of managing facilities, infrastructure, IT, and laboratories at SMK Telekomunikasi, significantly contributing to improved management quality and user satisfaction.

The Enhancing Diabetes Prediction Accuracy Using Random Forest and XGBoost with PSO and GA-Based Feature Selection
Dzira Naufia Jawza; Mazdadi, Muhammad Itqan; Farmadi, Andi; Saragih, Triando Hamonangan; Kartini, Dwi; Abdullayev, Vugar
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 7 No 2 (2025): April
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v7i2.626

Abstract

Diabetes is a global health concern classified as a non-communicable disease, affecting more than 422 million people worldwide, with the number expected to increase each year. This study evaluates the performance of the Random Forest and Extreme Gradient Boosting (XGBoost) classification algorithms on a diabetes dataset taken from Kaggle. To improve prediction accuracy, feature selection was carried out using Particle Swarm Optimization (PSO) and Genetic Algorithm (GA), which are expected to retain the most relevant features. The Random Forest model without feature selection yielded an Area Under Curve (AUC) value of 0.8120, while XGBoost achieved an AUC of 0.7666. After feature selection with PSO, the AUC increased to 0.8582 for Random Forest and 0.8250 for XGBoost. Feature selection with GA gave better results, with an AUC of 0.8612 for Random Forest and 0.8351 for XGBoost. Relative to the baseline AUC, the improvement after feature selection with PSO ranges from 5.7% to 7.6%, while the improvement with GA ranges from 6.1% to 8.9%, so GA provides the larger gain. This study contributes to improving the accuracy of diabetes classification, which is expected to support faster and more accurate diagnosis.
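
The percentage ranges quoted above can be reproduced from the reported AUC values if they are read as relative improvements over the baseline AUC. That reading is an assumption, but the arithmetic matches; the helper name `relative_gain_pct` is hypothetical.

```python
# AUC values as reported in the abstract.
baseline = {"Random Forest": 0.8120, "XGBoost": 0.7666}
selected = {
    "PSO": {"Random Forest": 0.8582, "XGBoost": 0.8250},
    "GA": {"Random Forest": 0.8612, "XGBoost": 0.8351},
}

def relative_gain_pct(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100

gains = {
    method: {model: round(relative_gain_pct(baseline[model], auc), 1)
             for model, auc in scores.items()}
    for method, scores in selected.items()
}

for method, per_model in gains.items():
    print(method, per_model)
```

In absolute terms the gains are smaller, between about 0.046 and 0.069 AUC, so it is worth stating which convention is meant when quoting such percentages.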
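
Prediksi Churn Pelanggan Telekomunikasi dengan Optimalisasi Seleksi Fitur dan Tuning Hyperparameter pada Algoritma Klasifikasi C4.5 [Telecommunications Customer Churn Prediction with Optimized Feature Selection and Hyperparameter Tuning on the C4.5 Classification Algorithm]
Antoh, Soterio; Herteno, Rudy; Budiman, Irwan; Kartini, Dwi; Mazdadi, Muhammad Itqan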
Jurnal Sistem Informasi Bisnis Vol 15, No 1 (2025): Volume 15 Number 1 Year 2025
Publisher : Diponegoro University

DOI: 10.14710/vol15iss1pp60-67

Abstract

In the telecommunications industry, predicting customer churn is crucial for maintaining business sustainability. High churn rates can negatively impact profitability, necessitating effective retention strategies. This research aims to enhance the accuracy of telecommunications customer churn prediction by optimizing the C4.5 classification algorithm through feature selection and hyperparameter tuning. The methods used include Information Gain for feature selection and hyperparameter tuning with Random Search and Grid Search. This study utilizes the Telco Customer Churn dataset from Kaggle, split into an 80:20 ratio for training and testing data. Six approaches are applied: (1) the basic C4.5 algorithm, (2) C4.5 with Information Gain, (3) C4.5 with Random Search, (4) C4.5 with Grid Search, (5) C4.5 with a combination of Information Gain and Random Search, and (6) C4.5 with a combination of Information Gain and Grid Search. The results indicate that the C4.5 algorithm alone achieves an accuracy of 74.09%, while applying Information Gain increases accuracy to 78.42%. Hyperparameter tuning with Random Search achieves the highest accuracy of 80.05%, whereas Grid Search reaches 77.71%. Combining Information Gain with Random Search results in an accuracy of 78.99%, while combining Information Gain with Grid Search yields an accuracy of 78.85%. These findings suggest that hyperparameter tuning using Random Search significantly improves accuracy compared to other methods, while Information Gain feature selection does not have a significant impact on performance in this context.
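
Information Gain, the feature selection criterion named above, scores a feature by how much it reduces the entropy of the class label. The sketch below computes it for two made-up categorical features; the data and feature names are hypothetical and are not taken from the Telco Customer Churn dataset.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy, made-up data: eight customers.
churn = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
contract = ["monthly"] * 4 + ["yearly"] * 4
gender = ["f", "m", "f", "m", "f", "m", "f", "m"]

ig_contract = information_gain(contract, churn)
ig_gender = information_gain(gender, churn)
print(f"contract: {ig_contract:.4f}, gender: {ig_gender:.4f}")
```

In this toy data `contract` carries some information about churn while `gender` carries none, so a filter based on Information Gain would rank `contract` higher and could drop `gender`.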

Implementation of PPCA Imputation, SMOTE-N Class Balancing in Hepatitis Classification Using Naïve Bayes
Fathmah, Siti; Kartini, Dwi; Abadi, Friska; Budiman, Irwan; Mazdadi, Muhammad Itqan
JUITA: Jurnal Informatika JUITA Vol. 12 No. 2, November 2024
Publisher : Department of Informatics Engineering, Universitas Muhammadiyah Purwokerto

DOI: 10.30595/juita.v12i2.21528

Abstract

The availability of complete data in research is crucial, especially in the initial stages. The Hepatitis data used in this study encountered issues such as missing data and class imbalance, which hindered its optimal utilization. The method employed to address missing data was the PPCA imputation method. After filling in the missing data, the data was balanced using the SMOTE-N class balancing method and classified using Gaussian Naïve Bayes. The aim of this research was to compare the classification evaluation of hepatitis disease using Naive Bayes with the PPCA imputation approach and SMOTE-N class balancing. The best results from each scenario yielded an AUC value of 0.833 in the first scenario with an 80:20 data split for training and testing, and 0.875 in the second scenario with a 90:10 data split. The highest AUC value was obtained in the application of PPCA imputation with SMOTE-N class balancing using Naive Bayes classification. This demonstrates that the implementation of PPCA imputation with SMOTE-N class balancing has a better impact on the performance of Naïve Bayes classification.

A Cost-Effective Vital Sign Monitoring System Harnessing Smartwatch for Home Care Patients
Dodon Turianto Nugrahadi; Rudy Herteno; Mohammad Reza Faisal; Nursyifa Azizah; Friska Abadi; Irwan Budiman; Muhammad Itqan Mazdadi
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 7 No 6 (2023): December 2023
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v7i6.5126

Abstract

A Pap smear image is a digital image recorded from a cervical cell preparation. These images are prone to analysis errors because the cells are relatively small and cell nuclei overlap. Accurate analysis of the Pap smear image is therefore essential to obtain the right information. This research compares nucleus segmentation and detection using gray-level co-occurrence matrix (GLCM) features with two methods: Otsu and polynomial. The data tested consisted of 400 images sourced from RepoMedUNM, a publicly accessible repository containing 2,346 images. Both methods were compared and evaluated to obtain the most accurate characteristics. The average distance was 6.6457 for the Otsu method and 6.6215 for the polynomial method, where distance is computed for the nuclei detected by each method. Distance is an important measure of how closely the detection results align with the actual nucleus positions, so the lower value indicates that the polynomial method produces nucleus detections that are, on average, closer to the actual nucleus positions than those of the Otsu method. This research can serve as a reference for future studies developing new methods to enhance identification accuracy.
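
Otsu's method, one of the two approaches compared above, picks the gray-level threshold that maximizes the between-class variance of the resulting foreground and background groups. The sketch below implements that criterion on a flat list of pixel intensities; the pixel values are made up, and the study's actual segmentation pipeline (including the GLCM features and the polynomial method) is not reproduced here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels at or below t yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is already in class 0
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Toy, made-up intensities: a dark group (nuclei) and a bright group (background).
toy_pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
threshold = otsu_threshold(toy_pixels)
dark = [p for p in toy_pixels if p <= threshold]
print(threshold, len(dark))
```

On this bimodal toy data the threshold falls between the dark and bright groups, so thresholding separates the 100 dark pixels from the 100 bright ones.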