Polycystic Ovary Syndrome (PCOS) is a complex hormonal disorder that affects women's reproductive and metabolic health, and early detection is essential to prevent long-term complications. This study compares the performance of four machine learning classification algorithms, Naive Bayes, K-Nearest Neighbor (KNN), Decision Tree, and Support Vector Machine (SVM), in supporting the diagnosis of PCOS from clinical data. The dataset contains 1,000 patient records with five main features: age, body mass index (BMI), menstrual irregularity, testosterone level, and antral follicle count. The data were split 80:20 using stratified sampling, and the models were validated with k-fold cross-validation (k=5). Models were evaluated on accuracy, precision, recall, F1-score, and AUC. Decision Tree achieved the best performance (100% accuracy, AUC 0.997), followed by SVM (97% accuracy) and KNN (96% accuracy). Naive Bayes had the lowest accuracy (72%) and produced many false positives. Although Decision Tree scored highest, it carries a risk of overfitting, whereas SVM and KNN showed more stable performance. These results indicate that classification algorithms, especially SVM and KNN, are effective for PCOS diagnosis based on clinical data. In practice, the findings support the development of accurate and efficient clinical decision support systems to improve women's healthcare.
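The evaluation metrics named above can be illustrated with a minimal sketch that derives accuracy, precision, recall, and F1-score from confusion-matrix counts. The `ConfusionCounts` class, the `classification_metrics` function, and the example counts are hypothetical and chosen only for illustration; they are not the study's code or results. The example shows how a classifier that produces many false positives can keep perfect recall while precision and accuracy fall.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # PCOS cases correctly flagged as PCOS
    fp: int  # non-PCOS patients wrongly flagged as PCOS (false positives)
    fn: int  # PCOS cases the classifier missed
    tn: int  # non-PCOS patients correctly cleared


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a 200-record test split (assumed, not from the study):
# every PCOS case is caught, but 50 healthy patients are flagged by mistake.
example = ConfusionCounts(tp=40, fp=50, fn=0, tn=110)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC is omitted from the sketch because it requires the classifier's ranked scores rather than a single set of confusion counts.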
Copyrights © 2025