Journal: International Journal of Engineering and Computer Science Applications (IJECSA)
Vol. 5 No. 1 (2026): March 2026 (In Press)

Analysis of Preprocessing Technique Combinations and Hyperparameter Tuning for Building a Reliable Random Forest–Based Stroke Prediction Model

Ristyawan, Aidina
Nugroho, Arie



Article Info

Publish Date
02 Mar 2026

Abstract

Stroke is a major health threat that can cause permanent disability or death, yet its risks can be mitigated through accurate early detection. Although the Random Forest algorithm is frequently used for stroke prediction, prior studies have often neglected model reliability, specifically the stability of performance between the training and testing phases. This research aims to develop a dependable stroke prediction model by applying the CRISP-DM methodology to a public dataset of 5,110 records. The proposed methodology evaluates 48 combinations of preprocessing techniques (handling missing values in the BMI attribute, categorical transformation, feature scaling, and class balancing), followed by a two-stage hyperparameter optimization strategy: Randomized Search for broad exploration, then Grid Search Refine for local refinement to ensure stability. Model performance was evaluated using accuracy, precision, recall, and F1-score. Hyperparameter tuning improved model performance by up to 38.80%. The hybrid balancing technique (SMOTETomek) did not consistently yield the most stable models in this case. The optimal model (Model No. 8) achieved a training accuracy of 0.925 and a testing accuracy of 0.877. With a train-test performance gap of 0.047, below the 0.05 threshold, this model is classified as "good fitting," indicating that it generalizes well. The model is therefore recommended as a robust and trustworthy early-warning decision support system for medical professionals.
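The two-stage tuning strategy and the train-test gap criterion described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything below is an assumption made for illustration: `cv_score` is a hypothetical stand-in for a cross-validated Random Forest score, and the parameter names and ranges are not taken from the paper. A real pipeline would more likely use library tooling, for example scikit-learn's `RandomizedSearchCV` followed by `GridSearchCV`. Note also that the rounded accuracies in the abstract give a gap of about 0.048; the reported 0.047 presumably reflects unrounded values.

```python
import itertools
import random


def cv_score(params):
    """Hypothetical stand-in for a cross-validated score (higher is better).

    A real run would train and validate a Random Forest here; this toy
    surface simply peaks at n_estimators=300, max_depth=12.
    """
    return (1.0
            - ((params["n_estimators"] - 300) / 1000) ** 2
            - ((params["max_depth"] - 12) / 40) ** 2)


def randomized_search(space, n_iter, rng):
    """Stage 1: sample n_iter random configurations from broad ranges."""
    candidates = [{k: rng.choice(v) for k, v in space.items()}
                  for _ in range(n_iter)]
    return max(candidates, key=cv_score)


def grid_refine(center, steps):
    """Stage 2: exhaustive grid over a small neighbourhood of the stage-1 winner."""
    keys = list(center)
    axes = [[max(1, center[k] + d) for d in (-steps[k], 0, steps[k])]
            for k in keys]
    grid = [dict(zip(keys, combo)) for combo in itertools.product(*axes)]
    return max(grid, key=cv_score)


def is_good_fit(train_acc, test_acc, threshold=0.05):
    """Return (gap, flag); flag is True when the train-test gap is below threshold."""
    gap = train_acc - test_acc
    return gap, gap < threshold


rng = random.Random(0)  # fixed seed so the sketch is reproducible
broad_space = {
    "n_estimators": list(range(50, 1001, 50)),  # assumed range, not from the paper
    "max_depth": list(range(2, 41, 2)),         # assumed range, not from the paper
}
coarse = randomized_search(broad_space, n_iter=30, rng=rng)
refined = grid_refine(coarse, steps={"n_estimators": 25, "max_depth": 1})

gap, ok = is_good_fit(0.925, 0.877)  # accuracies reported in the abstract
print(coarse, refined, round(gap, 3), ok)
```

Because the refinement grid always contains the stage-1 winner itself, the second stage can only match or improve on the first-stage score; the gap check is then applied to the final model's training and testing accuracy.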

Copyright © 2026






Journal Info

Abbrev

IJECSA

Subject

Computer Science & IT

Description

The International Journal of Engineering and Computer Science Applications (IJECSA) is a scientific journal established as a forum for scientists, especially in the field of computer science, to publish their research papers. The 12th of the 12th month of 2021 is ...