Bulletin of Informatics and Data Science
Vol 4, No 1 (2025): May 2025

Implementation of Feature Selection Information Gain in Support Vector Machine Method for Stroke Disease Classification

Fitri, Anisa
Afrianty, Iis
Budianita, Elvia
Kurnia Gusti, Siska



Article Info

Publish Date
31 May 2025

Abstract

Stroke is a disease with high mortality and disability rates that requires early detection. The main challenges in classifying this disease are data imbalance and the large number of irrelevant features in the dataset. This study proposes combining the Support Vector Machine (SVM) method with Information Gain feature selection and data balancing using the Synthetic Minority Over-sampling Technique (SMOTE) to improve classification accuracy. The dataset consists of 5,110 records with 10 variables and 1 label. Feature selection was performed with three threshold values (0.04, 0.01, and 0.0005), and SVM classification was tested with three kernels: Linear, RBF, and Polynomial. The data were split into training and test sets using k-fold cross-validation with k=10, and the models were evaluated using a confusion matrix. The best results were obtained with the RBF kernel, using the parameters Cost=100 and Gamma=5, at an Information Gain threshold of 0.0005, reaching an accuracy of 90.51%. These results show that the combination of techniques can help determine the variables that most affect SVM classification in detecting stroke disease.
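
The threshold-based Information Gain selection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes categorical features, defines information gain as IG(Y; X) = H(Y) - H(Y | X) with entropy in bits, and keeps a feature when its gain is at least the threshold. The function names, the `>=` comparison, and the toy data are all assumptions made for illustration; continuous variables would first need discretization, which the abstract does not describe.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one categorical feature X."""
    total = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # H(Y | X): entropy of each group, weighted by the group's share of records.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, threshold):
    """Return (names of features with gain >= threshold, all gains by name)."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    selected = [name for name, gain in gains.items() if gain >= threshold]
    return selected, gains


# Toy, made-up data (NOT the study's dataset): 8 records with a binary label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "informative": ["a", "a", "a", "a", "b", "b", "b", "b"],  # separates the classes
    "noise": ["x", "y", "x", "y", "x", "y", "x", "y"],  # independent of the label
}

# The three thresholds are the ones reported in the abstract.
for threshold in (0.04, 0.01, 0.0005):
    selected, gains = select_features(columns, labels, threshold)
    print(threshold, selected)
```

A lower threshold keeps more features, so in this scheme the 0.0005 setting reported as best is the most permissive of the three. The selected columns would then be passed on to the balancing and SVM steps, which are not shown here.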

Copyrights © 2025






Journal Info

Abbrev

bids

Subject

Computer Science & IT; Electrical & Electronics Engineering; Engineering

Description

The Bulletin of Informatics and Data Science journal discusses studies in the fields of Informatics, DSS, AI, and ES, as a forum for expressing research results both conceptually and technically related to Data ...