Journal of Security, Computer, Information, Embedded, Network and Intelligence System
Vol. 1, No. 2 (December 2023)

Clustering Tweets Data on Twitter Social Media using K-Means Method

Dewi Fatmarani Surianto (Universitas Negeri Makassar)



Article Info

Publish Date
09 Dec 2023

Abstract

Twitter is a popular social media platform whose large global user base generates a wide variety of information. This study applies text mining to tweet content, clustering tweets with the K-Means algorithm. The implementation preprocesses the text in several stages, including case folding, tokenizing, stopword removal, and stemming, and then extracts features that serve as input to K-Means. Clustering quality is evaluated with the silhouette coefficient. The test results show that the silhouette value varies with K. In one test scenario, K=2 yielded a silhouette value of 0.5000421, K=5 yielded 0.0501051, and K=9 yielded 0.501893. Because the silhouette values for K=2 and K=9 fall in the range 0.5 to 0.7, the dataset is categorized as having a medium structure. These results indicate that cluster quality depends on the choice of K, with the silhouette value serving as the main indicator of that quality.
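The pipeline summarized in the abstract (preprocessing, feature extraction, K-Means, silhouette evaluation) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the sample tweets, the stopword list, the suffix-stripping stemmer, the normalized term-frequency features, and the deterministic centroid seeding are all hypothetical choices made here, since the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list and suffix rules, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "on", "with"}
SUFFIXES = ("ing", "ed", "s")

def preprocess(tweet):
    """Case folding, tokenizing, stopword removal, then a crude suffix stemmer."""
    tokens = re.findall(r"[a-z]+", tweet.casefold())
    tokens = [t for t in tokens if t not in STOPWORDS]
    stemmed = []
    for t in tokens:
        for suf in SUFFIXES:
            if t.endswith(suf) and len(t) - len(suf) >= 3:
                t = t[: -len(suf)]
                break
        stemmed.append(t)
    return stemmed

def vectorize(docs):
    """Feature extraction (assumed): L2-normalized term-frequency vectors."""
    vocab = sorted({t for d in docs for t in d})
    vectors = []
    for d in docs:
        counts = Counter(d)
        v = [float(counts[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean of s(i) = (b - a) / max(a, b); requires at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = sum(same) / len(same)  # mean distance within the own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Made-up sample tweets on two topics, alternated so the two seeds differ in topic.
TWEETS = [
    "The match was amazing, great goals tonight",
    "New phone launched with better camera",
    "Great goals and an amazing match",
    "The phone camera is better on the new model",
    "Amazing goals in the match",
    "Better camera, new phone",
]

docs = [preprocess(t) for t in TWEETS]
points = vectorize(docs)
labels = kmeans(points, k=2)
score = silhouette(points, labels)
print(labels, round(score, 3))
```

Repeating the last four lines for several values of K and comparing the resulting scores mirrors the evaluation described above. On this toy data the two topics share no vocabulary, so the K=2 clustering separates them cleanly; real tweets would need a language-appropriate stopword list and stemmer.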

Copyrights © 2023






Journal Info

Abbrev

SCIENTIST

Publisher

Subject

Computer Science & IT Engineering

Description

Articles submitted to the SCIENTIST Scientific Journal are examined by the editorial board. If an article matches the journal's scope and writing style, the editorial board assigns it to a reviewer. Reviewers' names are not visible to the author. The author only ...