Journal of Dinda : Data Science, Information Technology, and Data Analytics
Vol 5 No 2 (2025): August

Implementation of Random Forest Algorithm with RFE and SMOTE on Cardiotocography Dataset

Nur Taqwimi, Muhammad Ahsani
Wahono, Buang Budi
Mulyo, Harminto



Article Info

Publish Date
05 Aug 2025

Abstract

Having a healthy baby is a dream for mothers. However, maternal and fetal mortality remains high, so more accurate fetal health monitoring is needed to prevent pregnancy complications. One of the tools used for this monitoring is cardiotocography (CTG), which records data on fetal condition. The CTG dataset used in this study poses two challenges, class imbalance and a large number of features, both of which can reduce classification performance. This study addresses these challenges by combining the Random Forest algorithm with the Synthetic Minority Oversampling Technique (SMOTE) for class balancing and Recursive Feature Elimination (RFE) for feature selection. The dataset is "Fetal Health Classification" from the Kaggle platform, which consists of 2,126 records in three classes: Normal, Suspect, and Pathological. The results show that RFE reduces the number of features from 22 to 18, while SMOTE increases the proportion of minority-class data. The resulting model achieves an accuracy of 95%, precision of 93%, recall of 89%, and an F1-score of 91%. The ROC-AUC is 0.9881 for the Normal class, 0.9789 for Suspect, and 0.9985 for Pathological. Although the model predicts the Normal and Pathological classes with high accuracy, its performance on the Suspect class still needs improvement. Overall, integrating Random Forest with SMOTE and RFE proved effective in improving the accuracy of fetal health classification.
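
The abstract says SMOTE increases the proportion of minority data but does not describe how. A minimal sketch of the general idea follows, using only the Python standard library: each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are hypothetical and chosen for illustration; this is not the authors' implementation, and the values are not CTG measurements.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from minority-class feature vectors.

    Illustrative sketch of SMOTE-style interpolation, not the study's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        # Interpolate at a random fraction of the way from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy minority class with two features.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.4]]
new_points = smote(minority, n_new=6, k=3, seed=42)
print(len(minority) + len(new_points))  # minority class size after oversampling
```

Because every synthetic point lies between two real minority samples, the new data stays inside the region the minority class already occupies. In practice a maintained implementation such as the `SMOTE` class in imbalanced-learn would normally be used, and oversampling is usually applied to the training split only so that synthetic points do not leak into the evaluation data; whether the study did so is not stated in the abstract.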

Copyrights © 2025






Journal Info

Abbrev

dinda

Publisher

Subject

Computer Science & IT

Description

Journal of Dinda : Data Science, Information Technology, and Data Analytics is a publication medium for research results in the fields of Data Science, Information Technology, and Data Analytics, though it is not strictly limited to these fields. It is published twice a year, in February and August. The journal is managed by ...