Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
Vol 12, No 2: June 2024

Classification of Cardiovascular Disease Based on Lifestyle Using Random Forest and Logistic Regression Methods

Bietrosula, Ajyan Brava (Program study of Information System, Faculty of Science and Technology, Universitas Airlangga, Indonesia)
Werdiningsih, Indah (Program study of Information System, Faculty of Science and Technology, Universitas Airlangga, Indonesia)
Wuriyanto, Eto (Program study of Information System, Faculty of Science and Technology, Universitas Airlangga, Indonesia)



Article Info

Publish Date
30 Jun 2024

Abstract

Cardiovascular disease is a non-communicable disease caused by impaired function of the heart or blood vessels. According to the WHO country profile on non-communicable diseases released in 2018, cardiovascular disease is the leading cause of death in Indonesia. This study aims to classify cardiovascular disease based on lifestyle using the Random Forest and Logistic Regression methods. For each method, combinations of parameters were tested to determine which combination performs best when classifying the cardiovascular disease dataset. The dataset used in this research, called Cardiovascular Disease, was obtained from Kaggle. It was processed through several preprocessing stages, namely missing value imputation, outlier detection, and extreme data checking. After preprocessing, 62,478 rows with 13 attributes (columns) entered the classification process; the attributes include age, height, weight, gender, systolic blood pressure, diastolic blood pressure, cholesterol, glucose, smoking, alcohol intake, physical activity, and cardiovascular disease. Different percentage splits of training and testing data were also tested to compare the classification performance of the two methods. The reported split was 90% training data and 10% testing data. With this split, the Logistic Regression method achieved the better accuracy, 73.07%, compared with 71.87% for Random Forest.
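The evaluation protocol described in the abstract (a 90%/10% hold-out split, a set of parameter combinations per method, and accuracy as the metric) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the helper names, the random seed, and the example grid are assumptions. The abstract does not say which parameters were tuned, so `n_estimators` and `max_depth` are only illustrative Random Forest settings.

```python
import itertools
import random


def split_indices(n_rows, test_fraction=0.10, seed=0):
    """Shuffle row indices and split them into training and testing parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seed is an assumption, for repeatability
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def parameter_grid(grid):
    """Expand a dict of candidate values into every parameter combination."""
    keys = sorted(grid)
    value_lists = [grid[key] for key in keys]
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# 62,478 is the row count reported in the abstract after preprocessing.
train_idx, test_idx = split_indices(62478, test_fraction=0.10)

# Hypothetical grid: the values below are placeholders, not the study's settings.
rf_grid = {"n_estimators": [100, 300], "max_depth": [5, 10, None]}
rf_combinations = parameter_grid(rf_grid)

print(len(train_idx), len(test_idx), len(rf_combinations))
```

In a full pipeline, each combination in `rf_combinations` would be used to fit a model on the training rows, and `accuracy` would be computed on the held-out testing rows to pick the best combination.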

Copyrights © 2024





