Comparison of the K-Means and DBSCAN Methods for Clustering Menfess Message Texts on Telegram
Pangestu, Oktama; Abdul Hamid Arribathi; Nur Azizah; Rahmat Hidayat
TAMIKA: Jurnal Tugas Akhir Manajemen Informatika & Komputerisasi Akuntansi, Vol 6 No 1 (2026)
Publisher : Universitas Methodist Indonesia

DOI: 10.46880/tamika.Vol6No1.pp20-29

Abstract

Menfess messages on Telegram are a form of anonymous communication that generates large volumes of informal text containing slang, abbreviations, and spelling variations, which makes computational text analysis difficult. This study compares K-Means and DBSCAN for clustering menfess messages, using Sentence-BERT embeddings from the paraphrase-multilingual-MiniLM-L12-v2 model, across three text-length scenarios derived by clustering on word count: short (0–20 words), medium (55–128 words), and long (129–283 words). Evaluation uses the Silhouette Score (higher is better) and the Davies-Bouldin Index (lower is better). For short texts, K-Means achieves a Silhouette Score of 0.0804 and a Davies-Bouldin Index of 3.8451, while DBSCAN produces 2 clusters with scores of 0.3186 and 1.3714 and 71.60% noise. For medium texts, K-Means obtains 0.1403 and 3.6490, while DBSCAN forms 1 cluster with 0.0450 and 3.3405 and 74.60% noise. For long texts, K-Means obtains 0.0593 and 3.4552, while DBSCAN produces 2 clusters with 0.5492 and 1.1069 and 79.67% noise. On both metrics, DBSCAN outperforms K-Means for short and long texts, but it labels more than 70% of the messages as noise in every scenario. K-Means is more stable: it assigns every message to a cluster, with no noise, in all three scenarios.
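
To make the reported numbers easier to read, the following is a minimal sketch of how a Silhouette Score, a Davies-Bouldin Index, and a noise percentage can be computed for one clustering result. It is not the authors' code. The function name `evaluate`, the `drop_noise` flag, and the use of Euclidean distance are assumptions made for illustration, and the label `-1` for noise follows the common DBSCAN convention. The abstract does not say whether noise points were excluded before scoring, so both options are shown.

```python
import math

NOISE = -1  # conventional label for DBSCAN noise points (assumption)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def evaluate(points, labels, drop_noise=True):
    """Return (silhouette, davies_bouldin, noise_percent) for one labeling.

    Hypothetical helper. With drop_noise=True, noise points are ignored when
    scoring; with drop_noise=False they are scored as one extra group.
    Assumes cluster centroids are distinct.
    """
    noise_pct = 100.0 * sum(1 for l in labels if l == NOISE) / len(labels)

    # Group the points by cluster label.
    clusters = {}
    for p, l in zip(points, labels):
        if drop_noise and l == NOISE:
            continue
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return None, None, noise_pct  # both metrics need at least 2 groups

    # Silhouette: s = (b - a) / max(a, b), averaged over all scored points.
    # a = mean distance to the point's own cluster, b = mean distance to the
    # nearest other cluster. Singleton clusters score 0 by convention.
    scores = []
    for label, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            a = sum(euclidean(p, q) for q in members) / (len(members) - 1)
            b = min(
                sum(euclidean(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != label
            )
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    silhouette = sum(scores) / len(scores)

    # Davies-Bouldin: for each cluster, take the worst ratio of combined
    # within-cluster scatter to centroid separation, then average.
    centroids = {k: [sum(c) / len(m) for c in zip(*m)]
                 for k, m in clusters.items()}
    scatter = {k: sum(euclidean(p, centroids[k]) for p in m) / len(m)
               for k, m in clusters.items()}
    dbi = sum(
        max((scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in clusters if j != i)
        for i in clusters
    ) / len(clusters)
    return silhouette, dbi, noise_pct


# Toy data: two tight groups plus one outlier labeled as noise.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 50)]
labels = [0, 0, 1, 1, NOISE]
silhouette, dbi, noise_pct = evaluate(points, labels)
print(silhouette, dbi, noise_pct)
```

With `drop_noise=True`, a result with a single cluster has no defined Silhouette Score or Davies-Bouldin Index, so the function returns `None` for both. When noise points are excluded, the scores describe only the minority of messages that were clustered, which is worth keeping in mind when comparing DBSCAN's scores against K-Means at noise rates above 70%.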