Garuda - Garba Rujukan Digital

Journal of Applied Data Sciences

Vol 5, No 4: DECEMBER 2024

Abu-Shareha, Ahmad Adel (Unknown)
Qutaishat, Haneen (Unknown)
Al-Khayat, Asma (Unknown)

Publish Date
15 Oct 2024

People with diabetes are at an increased risk of developing other complications, such as heart disease and nerve damage. Therefore, diabetes prediction is crucial to reduce the severe consequences of this disease. This study proposed a comprehensive framework for diabetes prediction to maximize the information from available diabetes datasets, which include historical records, laboratory tests, and demographic data. The proposed framework implements a data imputation technique for filling in missing values and adopts feature selection methods to remove less important features for better diabetes classification. An oversampling technique and a parameter tuning approach were used to increase the samples and fine-tune the parameters for training the machine learning algorithms. Various machine learning algorithms, including Neural Networks, Logistic Regression, Support Vector Machines, and Random Forest, were used for the prediction. These algorithms were evaluated using both train-test split and cross-validation techniques. The experiments were conducted on the Pima Indian Diabetes dataset using various evaluation metrics, including accuracy, precision, recall, and F-measure. The results showed that the Random Forest algorithm, particularly when fine-tuned with Grid Search Cross Validation, outperformed other algorithms, achieving an impressive accuracy of 0.99. This demonstrates the robustness and effectiveness of the proposed framework, which outperformed the accuracy of state-of-the-art approaches.

Citation Download

EndNote, Reference Manager, ProCite

Latex, Jabref

Check in Google Scholar

Journal Info

Journal of Applied Data Sciences

Website

Abbrev

JADS

Publisher

Bright Publisher

Subject

Computer Science & IT Control & Systems Engineering Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...

Article Info

Abstract

A Framework for Diabetes Detection Using Machine Learning and Data Preprocessing

Article Info

Abstract