Introduction: Polycystic ovary syndrome (PCOS) is a complex endocrine disorder affecting women of reproductive age. Several diagnostic criteria are currently in use, including the Rotterdam, National Institutes of Health (NIH), and Androgen Excess Society (AES) criteria, but their diagnostic accuracy varies across populations. This systematic review aimed to evaluate and compare the diagnostic accuracy of these clinical criteria for identifying PCOS in reproductive-age women.
Methods: A systematic review of diagnostic accuracy studies was conducted. Studies were included if they evaluated at least one of the three diagnostic criteria (Rotterdam, NIH, or AES) against a reference standard in women of reproductive age (15-45 years). Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were extracted. The quality of included studies was assessed using appropriate diagnostic accuracy assessment tools.
Results: Eighty studies covering populations in North America, Europe, Asia, Africa, and the Middle East were included. Within the Rotterdam criteria, follicle number per ovary showed the highest accuracy (sensitivity 84%, specificity 91%, AUC 0.905). The NIH criteria identified fewer women than the Rotterdam criteria (prevalence 27.1% vs. 40%); under the NIH criteria, anti-Müllerian hormone (AMH) as a diagnostic marker had an AUC of 0.80. The AES criteria yielded an intermediate prevalence (29.3%), with AMH showing a sensitivity of 84.4% and a specificity of 72% (AUC 0.857). AMH emerged as a promising biomarker, with age-specific thresholds ranging from 5.7 ng/mL (20-27 years) to 3.72 ng/mL (35-40 years). Replacing polycystic ovarian morphology (PCOM) with AMH in the Rotterdam criteria improved diagnostic accuracy (AUC 0.934-0.97). Optimal thresholds varied significantly by geography and ethnicity.
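As an illustration of the accuracy measures reported above, the following minimal Python sketch computes sensitivity and specificity from a 2x2 table of index-test results against a reference standard. The class name `TwoByTwo` and the counts are hypothetical; the counts were chosen only so that the output matches the 84% and 91% figures quoted for follicle number per ovary and are not data from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of index-test results cross-classified against the reference standard."""
    tp: int  # test positive, reference positive
    fp: int  # test positive, reference negative
    fn: int  # test negative, reference positive
    tn: int  # test negative, reference negative

    def sensitivity(self) -> float:
        # Proportion of reference-positive women the test identifies.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-negative women the test correctly excludes.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts (assumption, not study data): 100 women with PCOS
# and 100 without, according to the reference standard.
table = TwoByTwo(tp=84, fp=9, fn=16, tn=91)

print(f"sensitivity = {table.sensitivity():.2f}")
print(f"specificity = {table.specificity():.2f}")
```

The AUC cannot be derived from a single 2x2 table, because it summarizes sensitivity and specificity across all possible thresholds of a continuous marker such as AMH.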
Discussion: The Rotterdam criteria offer superior sensitivity but may overdiagnose milder phenotypes, whereas the NIH criteria identify metabolically high-risk women with greater specificity. The AES criteria provide an intermediate approach that emphasizes androgen excess. Age-stratified and population-specific thresholds are essential for optimal diagnostic accuracy. AMH shows promise as an objective alternative to ultrasound assessment of PCOM.
Conclusion: No single diagnostic criterion is universally optimal; the choice of criteria should be guided by clinical context, population characteristics, and available resources. Age-stratified and population-specific thresholds, particularly for AMH and ultrasound parameters, are recommended to improve diagnostic accuracy.
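The recommendation for age-stratified thresholds can be sketched as a simple lookup. The example below encodes only the two AMH thresholds reported in this abstract (5.7 ng/mL for 20-27 years and 3.72 ng/mL for 35-40 years); thresholds for other ages are not stated here, so the function returns `None` rather than interpolating. The function names, the inclusive age bounds, and the use of `>=` as the comparison are assumptions made for illustration, not the method of any included study.

```python
from typing import Optional

# (minimum age, maximum age, AMH threshold in ng/mL), as reported in the abstract.
# Intermediate age bands are not reported and are deliberately left out.
AMH_THRESHOLDS_NG_ML = [
    (20, 27, 5.7),
    (35, 40, 3.72),
]


def amh_threshold(age_years: int) -> Optional[float]:
    """Return the reported AMH threshold for this age, or None if none is reported."""
    for low, high, cutoff in AMH_THRESHOLDS_NG_ML:
        if low <= age_years <= high:  # inclusive bounds are an assumption
            return cutoff
    return None


def amh_above_threshold(age_years: int, amh_ng_ml: float) -> Optional[bool]:
    """Compare an AMH value with the age-specific threshold, if one is available."""
    cutoff = amh_threshold(age_years)
    if cutoff is None:
        return None
    return amh_ng_ml >= cutoff  # ">=" rather than ">" is an assumption


# The same AMH value falls on different sides of the threshold depending on age.
print(amh_above_threshold(25, 4.0))
print(amh_above_threshold(37, 4.0))
print(amh_above_threshold(30, 4.0))
```

A population-specific version would add a further key (for example, region or ethnicity) to the lookup, which this sketch omits because the abstract reports no such values.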
Copyrights © 2026