Heart disease is the leading cause of death globally and often goes undetected in its early stages because of limited awareness and the high cost of medical diagnosis. This study develops an accurate and efficient heart disease prediction model based on the Linear Discriminant Analysis (LDA) algorithm. The dataset, obtained from Kaggle, contains 1,024 patient records with 14 clinical attributes, including age, blood pressure, cholesterol, and ECG results. Preprocessing consisted of handling outliers and duplicates, correcting class imbalance with SMOTE, and standardizing the features. The model was evaluated using cross-validation and learning curve analysis. The optimized LDA model, with hyperparameters tuned using GridSearchCV, achieved an accuracy of 82.54%, a recall of 88.91%, a precision of 79.03%, and an F1-score of 83.54%. The model shows balanced and stable performance, although some misclassification in the positive class remains. These results indicate that LDA is a promising method for the early detection of heart disease from structured clinical data.
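The workflow summarized above (standardization, LDA, tuning with GridSearchCV, cross-validated evaluation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual code: the data are synthetic stand-ins generated with `make_classification` (13 predictors are assumed here), the hyperparameter grid and the 5-fold split are illustrative choices, and the outlier, duplicate, and SMOTE steps are omitted. SMOTE is provided by the separate imbalanced-learn package and would normally be added as a step in an `imblearn.pipeline.Pipeline` so that resampling is applied to training folds only.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data (assumption): NOT the Kaggle heart disease dataset.
X, y = make_classification(
    n_samples=1024, n_features=13, n_informative=6, n_redundant=0, random_state=0
)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("lda", LinearDiscriminantAnalysis()),
])

# Illustrative grid (assumption): shrinkage is only valid with the lsqr/eigen solvers.
param_grid = [
    {"lda__solver": ["svd"]},
    {"lda__solver": ["lsqr"], "lda__shrinkage": [None, "auto", 0.1, 0.5, 0.9]},
]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=cv)
search.fit(X, y)

# Report the four metrics named in the abstract for the selected configuration.
# Reusing the tuning folds here gives a slightly optimistic estimate.
metric_names = ["accuracy", "precision", "recall", "f1"]
scores = cross_validate(search.best_estimator_, X, y, cv=cv, scoring=metric_names)
summary = {name: scores[f"test_{name}"].mean() for name in metric_names}

print("best parameters:", search.best_params_)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The numbers printed by this sketch describe the synthetic data only and are not comparable to the results reported above.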
Copyright © 2025