JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING
Vol. 9 No. 2 (2026): Issues January 2026

Improving Imbalanced Polycystic Ovary Syndrome Classification Using a Leakage-Free Machine Learning Pipeline

Permana, Baiq Andriska Candra (Unknown)
Zulkipli (Unknown)
Muhammad Wasil (Unknown)
Harianto (Unknown)



Article Info

Publish Date
31 Jan 2026

Abstract

Polycystic Ovary Syndrome (PCOS) is a complex endocrine disorder that affects women of reproductive age. Early diagnosis is difficult because clinical presentations are heterogeneous and clinical datasets are imbalanced. This study aims to develop a data leakage-free machine learning pipeline to improve the accuracy and reliability of PCOS classification from clinical data. The dataset underwent preprocessing and normalization, followed by a stratified 80:20 train-test split to maintain class proportions. The proposed pipeline was implemented within a unified computational framework that integrates feature selection based on the ANOVA F-test, class imbalance handling using the Synthetic Minority Over-sampling Technique (SMOTE), and classification using a Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel. Hyperparameters were tuned using GridSearchCV combined with K-Fold Cross-Validation to ensure model robustness and consistency. The proposed model achieved an accuracy of 0.9074, with precision, recall, and F1-score values of 0.8378, 0.8857, and 0.8611, respectively. In addition, ten dominant clinical features were identified, primarily related to hormonal profiles and ovarian morphology. These results indicate that the data leakage-free pipeline improves the validity and stability of PCOS prediction. The findings suggest that this approach may serve as a supportive tool for clinical decision-making, particularly in facilitating early and objective identification of PCOS.

Copyrights © 2026






Journal Info

Abbrev

jite

Subject

Computer Science & IT Engineering

Description

JURNAL TEKNIK INFORMATIKA, JITE (Journal of Informatics and Telecommunication Engineering) is a journal that publishes articles and research results in the field of Informatics Engineering, such as Software Engineering, Database, Data Mining, ...