Jurnal Sistem Informasi dan Informatika (SIMIKA)
Vol. 8 No. 1 (2025): Jurnal Sistem Informasi dan Informatika (Simika)

THE EFFECT OF DATA SPLIT COMPOSITION ON THE ACCURACY OF DIABETES PATIENT CLASSIFICATION USING MACHINE LEARNING ALGORITHMS

Febby Refindha Aftha Harianto
Zakki Alawi
Ita Aristia Sa’ida



Article Info

Publish Date
07 Jan 2025

Abstract

The growing number of people with diabetes is an international health problem. Early diagnosis and accurate classification are essential for preventing diabetic complications. This study examines how the composition of the train/test data split affects the performance of diabetes classification with three machine learning algorithms: Random Forest, Naive Bayes, and Support Vector Machine (SVM). The data were obtained from Bojonegoro Regency Hospital and consist of 128 samples with 10 main features. The data first pass through a preprocessing stage to ensure they are ready for use. They are then divided into training and testing sets at ratios of 90:10, 80:20, 70:30, 60:40, and 50:50. Each algorithm is assessed for accuracy, precision, recall, and F1 score, all derived from the confusion matrix. The study focuses on the accuracy values, and the results show that the split proportion affects algorithm performance. Random Forest achieved 100% accuracy in some scenarios and proved to be the most effective algorithm for classifying diabetes patients. In conclusion, both algorithm selection and data split composition are very important for optimizing model performance. These results are relevant to the development of more accurate and efficient machine learning-based diagnosis systems. Further research could consider larger datasets and additional algorithms for better results.
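The split ratios and confusion-matrix metrics named in the abstract can be illustrated with a short sketch. It is not the authors' code: the function names `split_sizes` and `confusion_metrics` are hypothetical, the test-set count is assumed to be rounded up, and the confusion-matrix counts in the usage lines are invented for illustration and are not results from the study.

```python
def split_sizes(n_samples, test_pct):
    """Return (n_train, n_test) for a test share given in percent.

    Assumption: the test-set count is rounded up to a whole sample.
    """
    n_test = -(-n_samples * test_pct // 100)  # integer ceiling division
    return n_samples - n_test, n_test


def confusion_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# The five train:test ratios from the abstract, applied to 128 samples.
for test_pct in (10, 20, 30, 40, 50):
    n_train, n_test = split_sizes(128, test_pct)
    # One misclassified test sample changes accuracy by 1 / n_test.
    step = 1 / n_test
    print(f"{100 - test_pct}:{test_pct} -> train={n_train}, test={n_test}, "
          f"accuracy step={step:.3f}")

# Hypothetical confusion counts for a 13-sample test set (not study data).
print(confusion_metrics(tp=6, fp=1, fn=0, tn=6))
```

Under these assumptions the 90:10 split leaves only 13 test samples, so a single misclassification moves accuracy by roughly 7.7 percentage points. Scores on such small test sets are therefore coarse, which is consistent with the abstract's suggestion that further research consider larger datasets.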

Copyrights © 2025






Journal Info

Abbrev

jsii

Publisher

Subject

Computer Science & IT; Control & Systems Engineering

Description

Jurnal Sistem Informasi dan Informatika aims to provide scientific literature specifically on studies of applied research in information systems (IS), information technology (IT) and public review of the development of theory, method, and applied sciences related to the ...