International Journal of Electrical and Computer Engineering
Vol 7, No 1: February 2017

An Improved Similarity Matching based Clustering Framework for Short and Sentence Level Text

M. John Basha (P.T.R College of Engineering & Technology, Madurai, Tamil Nadu, India)
K.P. Kaliyamurthie (Bharath University, Chennai, Tamil Nadu, India)

Article Info

Publish Date
01 Feb 2017

Abstract

Text clustering plays a key role in navigation and browsing. Efficient text clustering groups a large amount of information into meaningful clusters. Many existing text clustering techniques do not address issues such as high time and space complexity, the inability to capture the relational and contextual attributes of words, limited robustness, and risks of privacy exposure. To address these issues, an efficient text-based clustering framework is proposed. The Reuters dataset is used as the input dataset. Once the input dataset is preprocessed, the similarity between words is computed using cosine similarity. The similarities between the components are compared and the vector data is created. From the vector data, the clustering particle is computed. To optimize the clustering results, mutation is applied to the vector data. The performance of the proposed text-based clustering framework is analyzed using Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and processing time. The experimental results show that the proposed framework produced better MSE, PSNR, and processing time than the existing Fuzzy C-Means (FCM) and Pairwise Random Swap (PRS) methods.
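The cosine similarity step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the raw term-frequency weighting, the three sample sentences, and the helper names `term_frequencies` and `cosine_similarity` are all assumptions made for illustration, and the sketch compares short sentences rather than individual words.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Count lowercase whitespace-separated tokens (assumed, very simple tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical short texts, not taken from the Reuters dataset.
sentences = [
    "oil prices rise sharply",
    "crude oil prices rise",
    "bank cuts interest rates",
]

vectors = [term_frequencies(s) for s in sentences]

# Pairwise similarity matrix: entry [i][j] compares sentence i with sentence j.
similarity = [[cosine_similarity(u, v) for v in vectors] for u in vectors]

for row in similarity:
    print(["%.2f" % value for value in row])
```

In this sketch the first two sentences share three of their four terms and score 0.75, while the third shares none with either and scores 0. A clustering step could then group texts whose pairwise similarity is high.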

Copyright © 2017

Journal Info

Abbrev

IJECE

Subject

Computer Science & IT; Electrical & Electronics Engineering

Description

International Journal of Electrical and Computer Engineering (IJECE, ISSN: 2088-8708, a SCOPUS-indexed journal; SNIP: 1.001; SJR: 0.296; CiteScore: 0.99; SJR and CiteScore Q2 in both Electrical & Electronics Engineering and Computer Science) is the official publication of the Institute of ...