Heart disease is one of the leading causes of death worldwide, including in Indonesia. Early detection of heart disease risk is crucial for preventing more severe complications and improving patients' quality of life. This study applies the Logistic Regression algorithm to build a data-driven heart disease prediction model. The dataset, obtained from Kaggle, contains 1,025 patient records and 14 attributes covering risk factors such as age, gender, blood pressure, cholesterol, and maximum heart rate. The research followed the CRISP-DM approach, comprising business understanding, data exploration, preprocessing, modeling, evaluation, and model testing. Preprocessing consisted of data cleaning, encoding categorical variables, standardizing numeric features, and splitting the data into training and test sets. The model was developed in Python with the scikit-learn library and evaluated using accuracy, precision, recall, F1-score, the confusion matrix, and ROC-AUC. The Logistic Regression model achieved an accuracy of 0.93, a precision of 0.93, a recall of 0.96, and an F1-score of 0.95. With this performance, the model can serve as a tool for medical personnel in the early detection of heart disease risk and in supporting more effective and efficient decision-making.
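The preprocessing, modeling, and evaluation steps described above could be sketched roughly as follows with scikit-learn. This is a minimal illustration and not the study's actual code. The column names, the 80/20 stratified split, the `max_iter` setting, and the helper functions `make_synthetic_data` and `train_and_evaluate` are all assumptions, and the data are randomly generated placeholders, so the printed metrics will not match the reported results.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real Kaggle file may use different ones.
NUMERIC = ["age", "resting_bp", "cholesterol", "max_heart_rate"]
CATEGORICAL = ["sex", "chest_pain_type"]
TARGET = "target"


def make_synthetic_data(n_rows=1025, seed=0):
    """Generate placeholder data so the sketch runs without the real dataset."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "age": rng.integers(29, 78, n_rows),
        "resting_bp": rng.normal(130, 17, n_rows),
        "cholesterol": rng.normal(245, 50, n_rows),
        "max_heart_rate": rng.normal(150, 23, n_rows),
        "sex": rng.integers(0, 2, n_rows),
        "chest_pain_type": rng.integers(0, 4, n_rows),
    })
    # Arbitrary rule linking features to the label; not a clinical relationship.
    score = (0.04 * (df["max_heart_rate"] - 150)
             + 0.6 * df["chest_pain_type"] - 0.8 * df["sex"])
    prob = 1.0 / (1.0 + np.exp(-score))
    df[TARGET] = (rng.random(n_rows) < prob).astype(int)
    return df


def train_and_evaluate(df, test_size=0.2, seed=42):
    """Split, preprocess, fit Logistic Regression, and return test metrics."""
    X, y = df[NUMERIC + CATEGORICAL], df[TARGET]
    # Assumed 80/20 stratified split; the source does not state the ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed)

    # Standardize numeric columns and one-hot encode categorical ones.
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    model = Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_prob),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }


if __name__ == "__main__":
    results = train_and_evaluate(make_synthetic_data())
    for name, value in results.items():
        print(name, value, sep=": ")
```

Placing the scaler and encoder inside a `Pipeline` means they are fitted on the training split only, which keeps test-set statistics from leaking into preprocessing. To try the sketch on the real dataset, `make_synthetic_data()` would be replaced by loading the Kaggle file and adjusting the column lists to match it.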
Copyrights © 2025