Polyscopia
Vol. 3 No. 2 (2026)

Analysis of the Effect of Data Partitioning on the Performance of the Naive Bayes Algorithm in Predicting Diabetes in Women

Amelia, Rika (Unknown)



Article Info

Publish Date
30 Apr 2026

Abstract

This study examines whether the training–testing split ratio meaningfully affects the performance of the Naive Bayes algorithm in predicting diabetes among women. Using a quantitative experimental design, the research applies the Gaussian Naive Bayes method to the Pima Indians Diabetes dataset (768 records), which is preprocessed before modeling, under three partition scenarios: 70:30, 60:40, and 50:50. Model performance is assessed with accuracy, precision, recall, and F1-score to capture both overall correctness and class sensitivity. The findings indicate that the partition ratio has no statistically significant effect on overall model performance: accuracy remains between 76% and 79% across all scenarios. Models trained on as little as 50% of the dataset achieve comparable predictive capability, indicating that the algorithm generalizes stably. The study argues that once a minimum threshold of training data is reached, increasing the training proportion does not substantially improve performance, and that class imbalance is a more decisive factor in the effectiveness of diabetes prediction.
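The experimental setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's code: it implements Gaussian Naive Bayes with the Python standard library only and evaluates it under the three split ratios. The data is a synthetic placeholder (hypothetical `make_placeholder_data`) that matches only the record count of 768. The feature count, class share, random seeds, and variance-smoothing constant are assumptions, and the preprocessing step is omitted because the abstract does not describe it. Any numbers it prints say nothing about the paper's results.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # 1e-9 keeps variances strictly positive (smoothing constant is an assumption).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one record."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))


def metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def make_placeholder_data(n=768, n_features=8, positive_share=0.35, seed=0):
    """Synthetic stand-in data; NOT the Pima Indians Diabetes dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < positive_share else 0
        X.append([rng.gauss(float(label), 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def evaluate_split(X, y, test_fraction, seed=0):
    """Shuffle, hold out test_fraction of the records, train, and score."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    model = fit_gaussian_nb([X[i] for i in train_idx],
                            [y[i] for i in train_idx])
    y_pred = [predict(model, X[i]) for i in test_idx]
    return metrics([y[i] for i in test_idx], y_pred)


X, y = make_placeholder_data()
scenarios = [("70:30", 0.30), ("60:40", 0.40), ("50:50", 0.50)]
results = {name: evaluate_split(X, y, frac) for name, frac in scenarios}
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

To approximate the study itself, the placeholder generator would be replaced by a loader for the actual dataset, and each scenario would ideally be repeated over several seeds so that differences between ratios can be separated from split-to-split noise.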

Copyright © 2026






Journal Info

Abbrev

polyscopia

Subject

Religion, Arts, Humanities, Social Sciences, Other

Description

Polyscopia is an open-access journal by Medan Resource Center. The journal publishes research articles from multidisciplinary and various types, methods, or approaches of research in education, applied sciences, natural or social sciences, philosophy, economics, law, politics, religions, as well as ...