Claim Missing Document
Check
Articles

Found 1 Documents
Search

Complete agglomerative hierarchy document’s clustering based on fuzzy luhn’s gibbs latent dirichlet allocation P. M. Prihatini; I. K. G. D. Putra; I. A. D. Giriantari; M. Sudarma
International Journal of Electrical and Computer Engineering (IJECE) Vol 9, No 3: June 2019
Publisher : Institute of Advanced Engineering and Science

Show Abstract | Download Original | Original Source | Check in Google Scholar | Full PDF (651.243 KB) | DOI: 10.11591/ijece.v9i3.pp2103-2111

Abstract

Agglomerative hierarchical is a bottom up clustering method, where the distances between documents can be retrieved by extracting feature values using a topic-based latent dirichlet allocation method. To reduce the number of features, term selection can be done using Luhn’s Idea. Those methods can be used to build the better clusters for document. But, there is less research discusses it. Therefore, in this research, the term weighting calculation uses Luhn’s Idea to select the terms by defining upper and lower cut-off, and then extracts the feature of terms using gibbs sampling latent dirichlet allocation combined with term frequency and fuzzy Sugeno method. The feature values used to be the distance between documents, and clustered with single, complete and average link algorithm. The evaluations show the feature extraction with and without lower cut-off have less difference. But, the topic determination of each term based on term frequency and fuzzy Sugeno method is better than Tsukamoto method in finding more relevant documents. The used of lower cut-off and fuzzy Sugeno gibbs latent dirichlet allocation for complete agglomerative hierarchical clustering have consistent metric values. This clustering method suggested as a better method in clustering documents that is more relevant to its gold standard.