Claim Missing Document
Check
Articles

Found 1 Documents
Search
Journal : Scientific Journal of Informatics

Comparison of Discriminant Analysis and Support Vector Machine on Mixed Categorical and Continuous Independent Variables for COVID-19 Patients Data Haikal, Husnul Aris; Wigena, Aji Hamim; Sadik, Kusman; Efriwati, Efriwati
Scientific Journal of Informatics Vol 11, No 1 (2024): February 2024
Publisher : Universitas Negeri Semarang

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.15294/sji.v11i1.48565

Abstract

Purpose: Numerous factors can affect the duration of COVID-19 recovery. One method involves utilizing natural herbal medication. This study seeks to determine the variables influencing the duration of COVID-19 recovery and to compare discriminant analysis and support vector machine models using COVID-19 patient data from West Sumatra.Methods: Two data mining methods, Discriminant Analysis and Support Vector Machine with different types of kernels (linear, polynomial, and radial basis function), were employed to categorize the time of COVID-19 recovery in this work. The study utilized 428 data points, with 75% allocated for training data and 25% for testing data. The independent factors were evaluated by determining the selection variables' information value (IV) to gauge their influence on the dependent variable. Data resampling techniques were employed to tackle the problem of data imbalance. This study employs data resampling techniques, including undersampling, oversampling, and SMOTE. The balancing accuracy of Discriminant Analysis and Support Vector Machine was examined.Result: The Discriminant Analysis with SMOTE achieved a balanced accuracy of 66.50%, outperforming the linear kernel Support Vector Machine with SMOTE, which had a balanced accuracy of 63.20% in this dataset.Novelty: This study assessed the novelty, originality, and value by comparing Discriminant Analysis and SVM algorithms with categorical and continuous independent variables. This research explores techniques for managing imbalanced data using undersampling, oversampling, and SMOTE, with variable selection based on information value assessment.