JIKSI (Jurnal Ilmu Komputer dan Sistem Informasi)
Vol 4, No 2 (2016): Jurnal Ilmu Komputer dan Sistem Informasi

DEVELOPMENT OF AN INDONESIAN-LANGUAGE NEWS AGGREGATOR SYSTEM USING CONTENT EXTRACTION AND HIERARCHICAL AGGLOMERATIVE CLUSTERING

Stenly Tirta Wijaya
Viny Christanti Mawardi
Janson Hendryli



Article Info

Publish Date
09 Jan 2017

Abstract

This study develops a system that aggregates Indonesian online news articles and automatically clusters them by topic. The system uses content extraction to obtain the main content of each article, and Hierarchical Agglomerative Clustering, with the Dice Similarity Coefficient as the similarity measure, to group articles by topic. To determine the cutting point, we cut the dendrogram where the gap between two successive combination similarities is largest. Additionally, we add thresholds that limit the cutting area in order to improve the clustering result. We use the Standard Boolean Model for the search feature and Silhouette to evaluate the clustering results. Tests on 998 articles show that limiting the cutting area with thresholds of 0.1 and 0.5 produces the highest average silhouette value, 0.264.
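
The clustering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: articles are represented as plain sets of terms, average linkage is used (the abstract does not name a linkage criterion), and the threshold is read as a window [low, high] on the similarity axis so that only the part of each gap inside the window counts. The function names (dice, hac, cut_largest_gap) and the toy documents are hypothetical.

```python
from itertools import combinations


def dice(a, b):
    """Dice similarity coefficient between two term sets."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def average_link(c1, c2, sim):
    """Mean pairwise similarity between two clusters (assumed linkage)."""
    return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))


def hac(docs):
    """Naive agglomerative clustering over term sets.

    Returns one (combination_similarity, clusters_after_merge) pair
    per merge, in merge order.
    """
    n = len(docs)
    sim = [[dice(docs[i], docs[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    history = []
    while len(clusters) > 1:
        # Merge the most similar pair of current clusters.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_link(clusters[p[0]], clusters[p[1]], sim),
        )
        s = average_link(clusters[a], clusters[b], sim)
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        history.append((s, sorted(sorted(c) for c in clusters)))
    return history


def cut_largest_gap(history, low=0.0, high=1.0):
    """Cut between the two successive merges with the largest gap.

    Assumed reading of the threshold: only the part of each gap that
    lies inside [low, high] is measured. Returns None if no gap has
    positive width inside the window.
    """
    best_gap, best_clusters = 0.0, None
    for (s_hi, clusters), (s_lo, _) in zip(history, history[1:]):
        gap = min(s_hi, high) - max(s_lo, low)
        if gap > best_gap:
            best_gap, best_clusters = gap, clusters
    return best_clusters


# Toy term sets standing in for extracted article content (hypothetical).
docs = [
    {"bank", "rupiah", "rate", "inflation"},
    {"bank", "rupiah", "rate", "economy"},
    {"bank", "economy", "market", "stock"},
    {"football", "goal", "league", "coach"},
    {"football", "goal", "league", "stock"},
]

history = hac(docs)
unlimited = cut_largest_gap(history)
limited = cut_largest_gap(history, low=0.1, high=0.5)
print(unlimited)
print(limited)
```

On this toy input the unrestricted cut leaves the third document in its own cluster, while the cut restricted to the window from 0.1 to 0.5 yields two topic groups. This only illustrates how limiting the cutting area can change the chosen cut. It says nothing about the results reported on the 998 articles.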

Copyrights © 2016






Journal Info

Abbrev

jiksi

Publisher

Subject

Computer Science & IT Mathematics Other

Description

Jurnal Ilmu Komputer dan Sistem Informasi (JIKSI) is published by the Faculty of Information Technology, Universitas Tarumanagara (FTI Untar), Jakarta, as a medium for publishing the scholarly work of students in the Informatics Engineering and Information Systems study programs at FTI Untar. The scholarly works produced take the form of results of ...