Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer
Vol 3 No 11 (2019): November 2019

Topic Extraction from News Documents Using Term-Cluster Weighting and Clustering Large Application (CLARA)

Rizal Maulana (Fakultas Ilmu Komputer, Universitas Brawijaya)
Sigit Adinugroho (Fakultas Ilmu Komputer, Universitas Brawijaya)
Sutrisno Sutrisno (Fakultas Ilmu Komputer, Universitas Brawijaya)



Article Info

Publish Date
29 Jan 2020

Abstract

The growth of technology has made information easy to obtain, and news media are among the most frequently used sources. As technology advances, news is distributed through web-based news portals such as Kompas, Detik, Tempo, and many others. Readers do not always have time to follow the news and sometimes cannot find the news they need. One solution is to cluster news documents and then apply topic extraction to obtain the important topics from each news cluster. This research uses Clustering Large Application (CLARA) as the clustering algorithm, because CLARA is an optimization of k-medoids, which is better than k-means in various respects, and uses term-cluster weighting to calculate term weights within each cluster for topic extraction. The process begins with text preprocessing, which turns the documents into structured data, after which Singular Value Decomposition (SVD) is used to decompose the features. CLARA is then used to cluster the documents, and topics are extracted using term frequency-inverse cluster frequency (TF-ICF). The data are secondary data obtained from the Kaggle website and consist of English-language news documents. Using 226 documents and 2 clusters, the silhouette score is 0.005. The accuracy of topic extraction is 1 when the number of topics taken ranges from 1 to 10.
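The term-cluster weighting step can be illustrated with a minimal sketch. The abstract does not give the exact TF-ICF formula, so the code below assumes a common variant: the term's frequency within a cluster multiplied by 1 + log(C / cf), where C is the number of clusters and cf is the number of clusters containing the term. The function names `tf_icf` and `top_topics` and the toy clusters are hypothetical and are not taken from the paper; the documents are assumed to be already preprocessed and clustered.

```python
import math
from collections import Counter


def tf_icf(clusters):
    """Return one {term: weight} dict per cluster.

    clusters: list of clusters, each a list of tokenized documents.
    Assumed weighting (not confirmed by the paper): tf * (1 + log(C / cf)).
    """
    # Term frequency per cluster: occurrences of each term across the cluster's documents.
    tf = [Counter(tok for doc in cluster for tok in doc) for cluster in clusters]
    n_clusters = len(clusters)
    # Cluster frequency: number of clusters in which each term appears at least once.
    cf = Counter(term for counts in tf for term in counts)
    return [
        {term: count * (1.0 + math.log(n_clusters / cf[term]))
         for term, count in counts.items()}
        for counts in tf
    ]


def top_topics(weights, k):
    """Pick the k highest-weighted terms per cluster (ties broken alphabetically)."""
    return [sorted(w, key=lambda t: (-w[t], t))[:k] for w in weights]


# Toy data: two clusters of already-tokenized documents (illustrative only).
clusters = [
    [["election", "vote", "party"], ["vote", "minister", "election"]],
    [["match", "goal", "team"], ["team", "vote", "goal"]],
]
weights = tf_icf(clusters)
topics = top_topics(weights, 2)
print(topics)  # [['election', 'vote'], ['goal', 'team']]
```

In this sketch, "vote" appears in both clusters, so its inverse cluster frequency term contributes nothing beyond the raw count, while terms unique to one cluster are boosted. With only two clusters, as in the reported experiment, a formula without the added 1 would assign such shared terms a weight of zero.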

Copyrights © 2019






Journal Info

Abbrev

j-ptiik

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Education; Electrical & Electronics Engineering; Engineering

Description

Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer (J-PTIIK) of Universitas Brawijaya is a scholarly journal in the field of computing that publishes scientific papers resulting from the research of students of the Fakultas Ilmu Komputer (Faculty of Computer Science), Universitas Brawijaya. The journal is expected to advance research ...