Garuda - Garba Rujukan Digital

Indonesian Journal of Electrical Engineering and Computer Science

Vol 16, No 2: November 2019

Norsyela Muhammad Noor Mathivanan (Universiti Teknologi MARA)
Nor Azura Md.Ghani (Universiti Teknologi MARA)
Roziah Mohd Janor (Universiti Teknologi MARA)

Publish Date
01 Nov 2019

The curse of dimensionality and the empty space phenomenon have emerged as critical problems in text classification. One way of dealing with them is to apply a feature selection technique before fitting a classification model. This step helps to reduce time complexity and sometimes increases classification accuracy. This study introduces a feature selection technique based on k-means clustering to overcome the weaknesses of traditional techniques such as principal component analysis (PCA), which requires a lot of time to transform all of the input data. The proposed technique decides which features to retain based on the significance value of each feature in a cluster. The study found that k-means clustering helps to increase the efficiency of the k-nearest neighbors (KNN) model for a large data set, while a KNN model without feature selection is suitable for a small data set. A comparison between k-means clustering and PCA as feature selection techniques shows that the proposed technique is better than PCA, especially in terms of computation time. Hence, k-means clustering is found to be helpful in reducing data dimensionality with less time complexity than PCA, without affecting the accuracy of the KNN model for high-frequency data.
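
The abstract does not say how the "significance value" of a feature is computed, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It clusters the rows with plain k-means, scores each feature by the share of its variance that the clusters explain (an assumed stand-in for the significance value), and keeps the top-scoring features. The function names, the scoring rule, and the toy data are all illustrative assumptions.

```python
import random
from statistics import mean, pvariance


def kmeans(rows, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


def feature_scores(rows, labels, k):
    """ASSUMED score: fraction of each feature's variance explained by the clusters."""
    scores = []
    for col in zip(*rows):
        total = pvariance(col)
        if total == 0:
            scores.append(0.0)  # a constant feature separates nothing
            continue
        grand = mean(col)
        between = 0.0
        for c in range(k):
            vals = [v for v, lab in zip(col, labels) if lab == c]
            if vals:
                between += len(vals) * (mean(vals) - grand) ** 2
        scores.append(between / (len(col) * total))
    return scores


def select_features(rows, k, n_keep):
    """Keep the n_keep features with the highest cluster-based score."""
    labels = kmeans(rows, k)
    scores = feature_scores(rows, labels, k)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:n_keep])
    return keep, [[row[j] for j in keep] for row in rows]


# Toy data (hypothetical): columns 0 and 2 differ between two groups,
# columns 1 and 3 are pure noise.
rng = random.Random(1)
data = []
for i in range(40):
    shift = 0.0 if i < 20 else 5.0
    data.append([shift + rng.gauss(0, 0.3), rng.gauss(0, 0.3),
                 shift + rng.gauss(0, 0.3), rng.gauss(0, 0.3)])

kept, reduced = select_features(data, k=2, n_keep=2)
print(kept)
```

Unlike PCA, this approach keeps a subset of the original columns rather than projecting every input onto new axes, so the reduced data can be passed straight to a KNN classifier without a transformation step. On this toy data, columns 0 and 2 carry the group shift, so they are the ones the score is expected to favor.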

Journal Info

Abbrev: IJEECS
Publisher: Institute of Advanced Engineering and Science

A comparative study on dimensionality reduction between principal component analysis and k-means clustering
