Claim Missing Document
Check
Articles

Found 1 Documents
Search
Journal : Indonesian Journal of Electrical Engineering and Computer Science

A comparative study on dimensionality reduction between principal component analysis and k-means clustering Norsyela Muhammad Noor Mathivanan; Nor Azura Md.Ghani; Roziah Mohd Janor
Indonesian Journal of Electrical Engineering and Computer Science Vol 16, No 2: November 2019
Publisher : Institute of Advanced Engineering and Science

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.11591/ijeecs.v16.i2.pp752-758

Abstract

The curse of dimensionality and the empty space phenomenon emerged as a critical problem in text classification. One way of dealing with this problem is applying a feature selection technique before performing a classification model. This technique helps to reduce the time complexity and sometimes increase the classification accuracy. This study introduces a feature selection technique using K-Means clustering to overcome the weaknesses of traditional feature selection technique such as principal component analysis (PCA) that require a lot of time to transform all the inputs data. This proposed technique decides on features to retain based on the significance value of each feature in a cluster. This study found that k-means clustering helps to increase the efficiency of KNN model for a large data set while KNN model without feature selection technique is suitable for a small data set. A comparison between K-Means clustering and PCA as a feature selection technique shows that proposed technique is better than PCA especially in term of computation time. Hence, k-means clustering is found to be helpful in reducing the data dimensionality with less time complexity compared to PCA without affecting the accuracy of KNN model for a high frequency data.