Journal of Applied Data Sciences
Vol 5, No 4: DECEMBER 2024

Diagnosing Cardiovascular Diseases using Optimized Machine Learning Algorithms with GridSearchCV

Alemerien, Khalid (Unknown)
Alsarayreh, Saleel (Unknown)
Altarawneh, Enshirah (Unknown)



Article Info

Publish Date
15 Oct 2024

Abstract

Accurate and timely disease diagnosis is the most important responsibility of the healthcare industry in protecting people's lives. Many lives can be saved if cases are diagnosed accurately and early. Cardiovascular disease (CVD) is one of the most dangerous diseases: it is the leading cause of death worldwide and one of the hardest conditions to diagnose. Globally, about 17.9 million people die from cardiovascular disease. To assist physicians in this mission, automated solutions based on machine learning and deep learning techniques have been introduced. Machine learning algorithms can diagnose diseases quickly and accurately, which adds considerable value to the medical industry and saves physicians and patients time. To address this issue, we utilized several supervised machine learning (ML) techniques with the GridSearchCV optimizer, since hyperparameter optimization can enhance the performance and accuracy of the proposed ML-based models. We conducted a comparative analysis to identify the most efficient classification model using two real benchmark datasets from the online Kaggle repository. Seven popular machine learning techniques were utilized: Decision Tree (DT), Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbor (KNN), Random Forest (RF), XGBoost, and Naïve Bayes (NB). The findings revealed that the Random Forest and XGBoost classifiers yielded the highest accuracy on the two datasets used in our study, 95.38% and 98.54%, respectively. The remaining ML algorithms predicted CVD less accurately; DT and RF achieved accuracies of 98.53% and 98.52%, respectively, on the first dataset. Furthermore, employing the proposed ML-based model in the CVD diagnosis process shows the expected implications for patients and physicians. In addition, it shows the impact of constructing a real, comprehensive dataset on the performance of the proposed solutions.
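
The tuning-and-comparison workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, substitutes a synthetic dataset from `make_classification` for the Kaggle data (which the abstract does not identify), covers only three of the seven classifiers, and uses hypothetical parameter grids chosen purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a tabular CVD dataset (assumption: 13 numeric
# features and a binary label; the real Kaggle datasets are not used here).
X, y = make_classification(
    n_samples=400, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical search spaces; the grids used in the study are not given.
candidates = {
    "DT": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5, None]}),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    "RF": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [5, None]},
    ),
}

results = {}
for name, (estimator, param_grid) in candidates.items():
    # Exhaustive search over the grid with 5-fold cross-validation
    # on the training split only.
    search = GridSearchCV(estimator, param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # Held-out accuracy of the best estimator, refit on all training data.
    test_accuracy = accuracy_score(y_test, search.predict(X_test))
    results[name] = (search.best_params_, test_accuracy)

for name, (best_params, test_accuracy) in results.items():
    print(f"{name}: best params {best_params}, test accuracy {test_accuracy:.4f}")
```

Keeping the test split out of the grid search means the reported accuracy is measured on data that played no part in selecting the hyperparameters, which is what makes a comparison across classifiers meaningful.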

Copyrights © 2024






Journal Info

Abbrev

JADS

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...