Toleva, Borislava
Unknown Affiliation



The bootstrap procedure for selecting the number of principal components in PCA
Toleva, Borislava
International Journal of Informatics and Communication Technology (IJ-ICT) Vol 14, No 3: December 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijict.v14i3.pp1136-1145

Abstract

The initial step in determining the number of principal components for both classification and regression involves evaluating how much each component contributes to the total variance in the data. Based on this analysis, a subset of components that explains the highest percentage of variance is typically selected. However, multiple valid combinations may exist, and the final choice is often made manually by the researcher. This study introduces a novel yet straightforward algorithm for the automatic selection of the number of principal components. By integrating ANOVA and bootstrapping with principal component analysis (PCA), the proposed method enables automatic component selection in classification tasks. The algorithm is evaluated using three publicly available datasets and applied with both decision tree and support vector machine (SVM) classifiers. Results indicate that this automated procedure not only eliminates researcher bias in selecting components but also improves classification accuracy. Unlike traditional methods, it selects a single optimal combination of principal components without manual intervention, offering a new and efficient approach to PCA-based model development.
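
The abstract does not spell out how ANOVA and bootstrapping are combined, so the following is only an illustrative sketch of one plausible reading, not the published algorithm. It assumes that each principal component is scored with a one-way ANOVA F-test against the class labels on every bootstrap resample, and that a component is kept when it is significant in a large share of resamples. The function name `bootstrap_select_components`, the thresholds `alpha` and `min_freq`, and the use of scikit-learn's wine dataset are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.feature_selection import f_classif
from sklearn.preprocessing import StandardScaler


def bootstrap_select_components(X, y, n_boot=200, alpha=0.05, min_freq=0.95, seed=0):
    """Hypothetical helper: keep components that pass an ANOVA F-test
    in at least `min_freq` of the bootstrap resamples."""
    rng = np.random.default_rng(seed)
    # Project the standardized data onto all principal components once.
    scores = PCA().fit_transform(StandardScaler().fit_transform(X))
    n_samples, n_components = scores.shape
    hits = np.zeros(n_components)
    for _ in range(n_boot):
        # Resample rows with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        # One-way ANOVA of each component score against the class label.
        _, p_values = f_classif(scores[idx], y[idx])
        hits += p_values < alpha
    freq = hits / n_boot
    return np.flatnonzero(freq >= min_freq), freq


# Stand-in dataset (assumption): the paper's three datasets are not named in the abstract.
X, y = load_wine(return_X_y=True)
selected, freq = bootstrap_select_components(X, y)
print("selected component indices:", selected)
print("selection frequency per component:", np.round(freq, 2))
```

Under these assumptions the procedure returns one set of component indices with no manual inspection of a scree plot, which is the property the abstract emphasizes. The selected scores could then be passed to a decision tree or SVM classifier.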

Feature selection for support vector machines in imbalanced data
Toleva, Borislava; Ivanov, Ivan; Hooper, Vincent
Bulletin of Electrical Engineering and Informatics Vol 14, No 4: August 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v14i4.9556

Abstract

Addressing the effects of class imbalance on feature selection models has become an increasingly important focus in academic research. This study introduces a novel support vector machine (SVM)-based algorithm specifically designed to handle class imbalance during the feature selection process. Using the Taiwan bankruptcy dataset as a case study, the algorithm incorporates the ExtraTreeClassifier() to manage class imbalance and identify a reduced set of relevant variables. To validate the selected features, SVM is applied within the imbalanced data context. Subsequently, analysis of variance (ANOVA) ranking is employed to further refine the variable set to three key features. An SVM model tailored for class imbalance is then constructed to assess the effectiveness of the final feature set. The proposed model significantly outperforms existing approaches in terms of classification performance. Specifically, it achieves a Type I error of 1.17% and a Type II error of 22.9%, compared to 4.4% and 39.4% reported in prior research. In terms of overall accuracy, our method reaches 83.1%, surpassing the 81.3% achieved by earlier studies. These results demonstrate that the proposed feature selection algorithm not only improves SVM accuracy but also outperforms other feature selection techniques when used in conjunction with SVMs, particularly under conditions of class imbalance.
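
The stages named in the abstract (tree-based reduction, ANOVA ranking down to three features, and a class-weighted SVM) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's `ExtraTreesClassifier` ensemble with `class_weight="balanced"`, an intermediate cut to 10 features, and synthetic imbalanced data from `make_classification` in place of the Taiwan bankruptcy dataset. Type I and Type II errors are computed here as the false positive and false negative rates with the minority class as positive; the paper may use a different convention.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_classif
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in (assumption): roughly 5% minority class, 30 features.
X, y = make_classification(n_samples=2000, n_features=30, n_informative=5,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y,
                                          test_size=0.3, random_state=0)

model = make_pipeline(
    StandardScaler(),
    # Stage 1: keep the 10 features ranked highest by class-weighted extra trees.
    SelectFromModel(
        ExtraTreesClassifier(n_estimators=200, class_weight="balanced",
                             random_state=0),
        max_features=10, threshold=-np.inf),
    # Stage 2: ANOVA F-test ranking refines the set to three features.
    SelectKBest(f_classif, k=3),
    # Stage 3: SVM with class weights inversely proportional to class frequency.
    SVC(kernel="rbf", class_weight="balanced"),
)
model.fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, model.predict(X_te)).ravel()
type_i = fp / (fp + tn)    # false positive rate (assumed Type I definition)
type_ii = fn / (fn + tp)   # false negative rate (assumed Type II definition)
print(f"Type I error: {type_i:.3f}, Type II error: {type_ii:.3f}")
```

The figures printed by this sketch come from synthetic data and are not comparable to the error rates reported in the abstract.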