Bulletin of Informatics and Data Science
Vol 4, No 1 (2025): May 2025

Diabetes Classification using Gain Ratio Feature Selection in Support Vector Machine Method

Al Rasyid, Nabila
Afrianty, Iis
Budianita, Elvia
Kurnia Gusti, Siska



Article Info

Publish Date
31 May 2025

Abstract

Diabetes is a major cause of chronic conditions such as visual impairment, stroke, and kidney failure. Early detection, especially in groups at high risk of developing diabetes, is needed to prevent complications with wide-reaching impact. Indonesia ranks seventh in the world for diabetes, with a reported prevalence of 10.7%. This research aims to determine which attributes in the diabetes dataset most affect classification and to apply the Support Vector Machine (SVM) method to diabetes classification. The attributes are determined with the Gain Ratio feature selection technique. The dataset consists of 768 records with 8 attributes. Classification uses three SVM kernels (Linear, Polynomial, and RBF) and three train-test split ratios (70:30, 80:20, and 90:10). Without feature selection, all 8 attributes were used and the highest accuracy was 94.81%, obtained at the 80:20 ratio with the RBF kernel under two parameter combinations: C = 100 with Gamma = 3, and C = 100 with Gamma = Scale. The feature selection thresholds tested were 0.02, 0.03, and 0.05. With feature selection, the highest accuracy was obtained using 6 attributes: 95.45% at a threshold of 0.02 and the 80:20 ratio, using the RBF kernel with C = 100 and Gamma = Scale. The results show that accuracy increased after applying feature selection.
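The abstract does not give the Gain Ratio computation itself, so the following is only a minimal sketch of the standard definition (information gain divided by split information) and of threshold-based selection. The attribute names, the toy data, the prior discretization into bins, and the choice of keeping attributes whose score is at or above the threshold are all assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def select_features(columns, labels, threshold):
    """Keep attributes whose gain ratio is at least `threshold` (assumed rule)."""
    scores = {name: gain_ratio(vals, labels) for name, vals in columns.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return kept, scores


# Hypothetical, already-discretized toy data (not the 768-record dataset).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "glucose_bin": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "bmi_bin": ["high", "high", "high", "low", "high", "low", "low", "low"],
    "noise_bin": ["a", "b", "a", "b", "a", "b", "a", "b"],
}

kept, scores = select_features(columns, labels, threshold=0.02)
print(kept)
print(scores)
```

Continuous attributes would have to be binned before this calculation; the binning method is not stated in the abstract. The retained columns would then be passed to the SVM classifier for each kernel and split ratio.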

Copyrights © 2025






Journal Info

Abbrev

bids

Publisher

Subject

Computer Science & IT; Electrical & Electronics Engineering; Engineering

Description

The Bulletin of Informatics and Data Science journal discusses studies in the fields of Informatics, DSS, AI, and ES, as a forum for expressing research results both conceptually and technically related to Data ...