Cervical cancer remains one of the leading causes of cancer death among women, especially in developing countries. Early detection through screening is essential to reduce morbidity and mortality, but the main challenge is identifying high-risk individuals efficiently. This study aims to build a machine learning model that classifies cervical cancer biopsy results based on available risk factors. The public dataset "Cervical Cancer Risk Classification" includes demographic data, sexual behavior, contraceptive use, and medical test results. Three machine learning algorithms are applied: Logistic Regression, Decision Tree, and Support Vector Machine (SVM). The models are evaluated using accuracy, precision, recall, F1 score, and the Matthews Correlation Coefficient (MCC). The Decision Tree model performed best, with an F1 score of 0.956 and an MCC of 0.639. Significant contributing risk factors are age, age at first sexual intercourse, Schiller test results, cytology results, and number of pregnancies. Machine learning has great potential to improve the effectiveness of cervical cancer screening. Data balancing techniques and ensemble methods are recommended to increase accuracy in detecting positive cases.
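
For readers less familiar with the evaluation metrics named above, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and MCC are derived from a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and chosen only for illustration; they are not taken from the study's data or results. On imbalanced data, MCC can differ noticeably from the other metrics because it uses all four cells of the confusion matrix.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives, true negatives.
    Any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical, illustrative counts (NOT from the study): few positive cases,
# many negative cases, as is typical of screening data.
example = binary_metrics(tp=8, fp=2, fn=4, tn=158)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

The same quantities are available in common libraries; the hand-written version is shown only to make the formulas explicit.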
Copyright © 2025