ComTech: Computer, Mathematics and Engineering Applications
Vol. 14 No. 2 (2023): ComTech

Fuzzy C-Means in Content-Based Document Clustering for Grouping General Websites Based on Their Main Page Contents

Sri Probo Aditiyo (Brawijaya University)
Eni Sumarminingsih (Brawijaya University)
Rahma Fitriani (Brawijaya University)



Article Info

Publish Date
14 Nov 2023

Abstract

The research aimed to use Fuzzy C-Means clustering in content-based document clustering to group general websites based on their content. The data were a ranking table of the most visited websites in Indonesia, taken from https://dataforseo.com/top-1000-websites/ on September 24th, 2022. Two cases were run with Fuzzy C-Means clustering, using different maximum iteration values, namely 100 and 200. The two maximum iteration values produce different numbers of website names in the first and second clusters only; in the other clusters, the numbers are all the same. The clustering results are validated using the silhouette coefficient, with values of 0.977783879 and 0.977788457 for case no. 1 and no. 2, respectively. Fuzzy C-Means clustering in content-based document clustering therefore performs excellently when applied to group general websites based on their content. Hence, the approach can be considered for other content-based clustering cases in the future.
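The procedure summarized in the abstract (Fuzzy C-Means run under two maximum iteration caps, then validated with the silhouette coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy 2-D vectors stand in for document features, and the fuzzifier m = 2, the convergence tolerance, the seed, and all function names are assumptions made for illustration only.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-9, seed=0):
    """Return (centers, memberships) after at most max_iter update rounds."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial membership matrix; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            sw = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / sw
                            for d in range(dim)])
        # Memberships fall off with distance to each center.
        new_u = []
        for p in points:
            d = [max(dist(p, ctr), 1e-12) for ctr in centers]
            inv = [dj ** (-2.0 / (m - 1.0)) for dj in d]
            s = sum(inv)
            new_u.append([v / s for v in inv])
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:  # stop early once memberships no longer change
            break
    return centers, u


def silhouette(points, labels):
    """Mean silhouette coefficient for a hard labelling (needs >= 2 clusters)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = [k for k in clusters[lab] if k != i]
        if not own:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist(points[i], points[k]) for k in own) / len(own)
        b = min(sum(dist(points[i], points[k]) for k in idx) / len(idx)
                for other, idx in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy 2-D stand-ins for document feature vectors (not the paper's data).
docs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
results = {}
for max_iter in (100, 200):
    centers, u = fuzzy_c_means(docs, c=2, max_iter=max_iter)
    # Harden the fuzzy memberships: each point goes to its strongest cluster.
    labels = [row.index(max(row)) for row in u]
    results[max_iter] = (labels, silhouette(docs, labels))
    print(max_iter, labels, round(results[max_iter][1], 4))
```

Comparing the two entries of `results` shows whether the iteration cap changes the cluster assignments for a given data set. A silhouette value close to 1 indicates compact, well-separated clusters.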

Copyrights © 2023






Journal Info

Abbrev

comtech

Subject

Computer Science & IT; Engineering; Mathematics

Description

The journal invites professionals in the world of education, research, and entrepreneurship to participate in disseminating ideas, concepts, new theories, or science development in the field of Information Systems, Architecture, Civil Engineering, Computer Engineering, Industrial Engineering, Food ...