Maternal mortality remains a major global health issue, which motivates more reliable risk-assessment systems. This study evaluated sequential outlier detection, applying the Interquartile Range (IQR) method followed by the Local Outlier Factor (LOF), with six machine learning algorithms on the UCI Maternal Health Risk dataset. The preprocessing pipeline comprised duplicate removal, SMOTE for class balancing, Min-Max normalization, and sequential outlier detection. Model performance was evaluated with holdout validation and 10-fold cross-validation, with statistical validation through Wilcoxon signed-rank tests and Cohen's d effect sizes. The Extra Trees Classifier achieved 98.34% accuracy, higher than reported in previous studies. Distance-based methods were the most sensitive to the preprocessing, with KNN gaining 8.35%, while tree-based ensembles showed consistent accuracy gains. The statistical validation indicated substantial practical significance, with large effect sizes (Cohen's d greater than 1.0) for the top performers, providing evidence-based guidance for applying sequential preprocessing in maternal health risk prediction systems.
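The sequential outlier step described above (IQR first, then LOF on the surviving rows) can be illustrated with a minimal standard-library sketch. This is not the study's implementation. The function names, the 1.5 IQR multiplier, the neighbor count, the LOF cutoff of 1.5, and the toy points are all assumptions chosen for illustration, and the LOF here uses exactly k nearest neighbors rather than handling distance ties as the full definition does.

```python
import math
import statistics


def iqr_filter(rows, k=1.5):
    """Keep rows whose every feature lies within [Q1 - k*IQR, Q3 + k*IQR].

    The multiplier k=1.5 is the conventional default, assumed here.
    """
    bounds = []
    for col in zip(*rows):
        q1, _, q3 = statistics.quantiles(col, n=4, method="inclusive")
        iqr = q3 - q1
        bounds.append((q1 - k * iqr, q3 + k * iqr))
    return [r for r in rows
            if all(lo <= v <= hi for v, (lo, hi) in zip(r, bounds))]


def lof_scores(rows, n_neighbors=3):
    """Simplified Local Outlier Factor: scores well above 1 suggest outliers."""
    n = len(rows)
    dist = [[math.dist(a, b) for b in rows] for a in rows]
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist[i][j])
        knn = order[:n_neighbors]
        neighbors.append(knn)
        k_dist.append(dist[i][knn[-1]])  # distance to the k-th neighbor
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(1.0 / max(sum(reach) / len(reach), 1e-12))
    # LOF: mean density of the neighbors relative to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]


def sequential_filter(rows, iqr_k=1.5, n_neighbors=3, lof_threshold=1.5):
    """Apply the IQR filter, then drop rows whose LOF exceeds the threshold."""
    kept = iqr_filter(rows, k=iqr_k)
    if len(kept) <= n_neighbors:  # too few rows to estimate local density
        return kept
    scores = lof_scores(kept, n_neighbors=n_neighbors)
    return [r for r, s in zip(kept, scores) if s <= lof_threshold]


# Hypothetical toy data: a 3x3 grid, one extreme point, one isolated point.
toy = [(x, y) for x in range(3) for y in range(3)] + [(100, 100), (4, 4)]
cleaned = sequential_filter(toy)
print(len(toy), "->", len(cleaned))
```

In this toy case the extreme point falls outside the IQR bounds and is dropped in the first pass, while the isolated point sits inside those bounds and is only flagged by its local density in the second pass. In a real pipeline, a tested library implementation and thresholds tuned on the data would be preferable.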
Copyrights © 2026