Heart disease is one of the leading causes of death worldwide, so early detection is an important part of prevention. This study builds a heart disease risk prediction model from patient clinical data using the Random Forest algorithm. The dataset consists of 303 records with 13 features, including blood pressure, cholesterol, and maximum heart rate, plus one target attribute. Preprocessing converts invalid entries such as question marks ('?') to missing values and then removes incomplete records to preserve the integrity of the dataset. After data exploration and correlation analysis between features, the model is trained with Random Forest, chosen for its support for multiclass classification and its resistance to overfitting. Initial evaluation shows good predictive performance, with an accuracy score of 0.89. These results indicate that a Random Forest-based machine learning approach can help identify heart disease risk systematically, and that it has potential as a decision support tool in preventive health.
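The cleaning step described above (question marks treated as missing values, then incomplete records dropped) can be illustrated with a minimal sketch. It uses only the Python standard library and assumes purely numeric columns; the function name `clean_rows`, the constant `MISSING_MARKER`, and the three-column sample are hypothetical and are not taken from the study's code or data.

```python
import csv
import io

MISSING_MARKER = "?"  # placeholder for invalid values, as described in the abstract


def clean_rows(rows):
    """Turn '?' into None, then drop any row that still has a missing value."""
    cleaned = []
    for row in rows:
        # Step 1: treat the '?' placeholder as a missing value (None).
        values = [
            None if cell.strip() == MISSING_MARKER else float(cell)
            for cell in row
        ]
        # Step 2: keep only complete records.
        if all(v is not None for v in values):
            cleaned.append(values)
    return cleaned


# Tiny made-up sample (not the study's data): age, cholesterol, target.
sample_csv = "63,233,0\n67,?,2\n41,204,0\n56,236,?\n"
rows = list(csv.reader(io.StringIO(sample_csv)))
complete = clean_rows(rows)
print(len(rows), "rows read,", len(complete), "rows kept")
```

The complete rows would then be split into features and target and passed to a Random Forest implementation; that training step is outside this sketch.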
Copyrights © 2025