Handika, Ferdi Setyo
Unknown Affiliation

Published: 1 document
Articles

Journal : Journal of Information Systems and Informatics

The Feature Selection vs Dimensionality Reduction for Steam Game Metadata Classification: An Ensemble Learning Study
Handika, Ferdi Setyo; Yulianto, Lili Dwi; Andryana, Septi
Journal of Information Systems and Informatics Vol 8 No 1 (2026): February
Publisher : Asosiasi Doktor Sistem Informasi Indonesia

DOI: 10.63158/journalisi.v8i1.1456

Abstract

Optimizing noisy Steam game metadata is essential for accurate binary classification. This study compares feature selection based on mutual information (MI) with dimensionality reduction via principal component analysis (PCA) and linear discriminant analysis (LDA), using a dataset of 55,144 Steam reviews and four ensemble algorithms evaluated with Stratified 5-Fold Cross-Validation. The 125-feature baseline achieved the highest accuracy, 0.7728, with CatBoost. Feature selection (FS_10) remained competitive at 0.7449, and LDA, after optimization, reached 0.7281. PCA, in contrast, lowered accuracy to 0.6963, which the study attributes to the inability of linear transformations to preserve the discriminative power of one-hot encoded categorical features, which ensemble models handle better in their original form. These findings highlight the importance of preserving original features, especially in low-to-medium dimensional metadata, where feature selection outperforms dimensionality reduction in maintaining predictive integrity. High accuracy matters for developers tracking product reception and for platforms improving the recommendation systems that influence user purchasing decisions. The study concludes that, for Steam game metadata, strategic feature selection is superior to dimensionality reduction for maintaining classification performance.
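
As a rough illustration of the MI-based feature selection the abstract refers to, the sketch below scores one-hot (binary) columns by their mutual information with a binary label and keeps the top k. It is not the authors' implementation: the helper names `mutual_information` and `select_top_k`, the column names, and the eight-row toy data are all invented for this example, and the abstract does not say how FS_10 was configured.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(labels)
    joint_counts = Counter(zip(feature, labels))
    x_counts = Counter(feature)
    y_counts = Counter(labels)
    mi = 0.0
    for (x, y), count in joint_counts.items():
        p_xy = count / n
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), with marginals taken from counts
        mi += p_xy * math.log(p_xy * n * n / (x_counts[x] * y_counts[y]))
    return mi


def select_top_k(columns, labels, k):
    """Rank columns by MI with the label and return the k best names plus all scores."""
    scores = {name: mutual_information(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: 8 games, binary label (1 = positive reception).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "genre_indie": [1, 1, 1, 1, 0, 0, 0, 0],     # perfectly aligned with the label
    "genre_action": [1, 1, 1, 0, 1, 0, 0, 0],    # partially informative
    "has_controller": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

selected, scores = select_top_k(columns, labels, k=2)
print(selected)
```

Because each selected column is kept unchanged, the downstream classifier still sees the original one-hot indicators. PCA, by comparison, replaces them with linear combinations, which is the contrast the abstract draws.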