Vadivel Elanangai
St. Peter's Institute of Higher Education and Research

Published: 2 documents

Exploring the performance of feature selection method using breast cancer dataset
Tsehay Admassu Assegie; Ravulapalli Lakshmi Tulasi; Vadivel Elanangai; Napa Komal Kumar
Indonesian Journal of Electrical Engineering and Computer Science Vol 25, No 1: January 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v25.i1.pp232-237

Abstract

Breast cancer is the most common type of cancer and occurs mostly in women. In recent years, many researchers have worked to automate breast cancer diagnosis by developing machine learning models. However, the quality and quantity of the features in a breast cancer diagnostic dataset significantly affect the accuracy and efficiency of a predictive model. Feature selection is an effective method for reducing dimensionality and improving the accuracy of a predictive model. Its purpose is to determine the features required for training the model and to remove irrelevant and duplicate features. A duplicate feature is one that is highly correlated with another feature. The objective of this study is to experimentally compare three feature selection methods for breast cancer prediction. Sequential, embedded, and chi-square feature selection are implemented on a breast cancer diagnostic dataset, and their performance is compared on the test set. The experimental results show that sequential feature selection outperforms chi-square (χ²) statistics and embedded feature selection, achieving the best accuracy of 98.3%.
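The abstract's definition of a duplicate feature (one that is highly correlated with another) can be illustrated with a minimal sketch. This is not one of the three methods the study compares; it is a simple correlation filter, and the function names, the 0.95 threshold, and the toy values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def drop_duplicate_features(columns, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept.

    columns: dict mapping feature name -> list of values.
    threshold: hypothetical cut-off on |r|; the source does not specify one.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (invented): perimeter is roughly 2*pi*radius, so it duplicates radius.
toy = {
    "radius": [1.0, 2.0, 3.0, 4.0, 5.0],
    "perimeter": [6.3, 12.6, 18.8, 25.1, 31.4],
    "texture": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = drop_duplicate_features(toy)
print(selected)
```

On this toy input the filter keeps radius and texture and drops perimeter, since the first feature seen in each correlated group is the one retained.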
Evaluation of feature scaling for improving the performance of supervised learning methods
Tsehay Admassu Assegie; Vadivel Elanangai; Josephin Shermila Paulraj; Mani Velmurugan; Daya Florance Devesan
Bulletin of Electrical Engineering and Informatics Vol 12, No 3: June 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v12i3.5170

Abstract

This article evaluates the performance of the support vector machine (SVM), decision tree (DT), and random forest (RF) on a dataset containing the medical records of 299 patients with heart failure (HF), collected at the Faisalabad Institute of Cardiology and the Allied Hospital in Pakistan. The dataset contains 13 descriptive features covering physical, clinical, and lifestyle information. The study compares the three classification algorithms under pre-processing techniques such as min-max scaling and principal component analysis (PCA). The simulation results show that the performance of the DT and RF decreased with dimensionality reduction, while that of the SVM improved. The SVM achieved 84.44%, the RF 82.22%, and the DT 81.11%. With scaled features, the SVM improved by 1.64% compared to the original dataset; thus, feature scaling improves the performance of the SVM.
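Min-max scaling, one of the pre-processing techniques named in the abstract, maps each feature to the range [0, 1] via (x - min) / (max - min). A minimal sketch follows; the function name and the sample values are invented for illustration and are not taken from the heart failure dataset.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column (illustrative values only).
ages = [40, 55, 70, 95]
scaled = min_max_scale(ages)
print(scaled)
```

Margin-based models such as the SVM are generally sensitive to the relative scale of features, whereas tree-based models split on per-feature thresholds and are largely unaffected by it, which is consistent with the pattern the abstract reports.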