Amelia Sahira Rahma
Departemen Teknik Informatika, Institut Teknologi Sepuluh Nopember, Jl. Raya ITS, Kampus Sukolilo, Surabaya, 60111, Indonesia

Published: 1 document

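PENGGUNAAN DICTIONARY-BASED DAN CORPUS-BASED THESAURUS UNTUK PEMBOBOTAN TERM PADA PENGELOMPOKAN DOKUMEN BERITA BERBAHASA INDONESIA
[Use of Dictionary-Based and Corpus-Based Thesauri for Term Weighting in the Clustering of Indonesian-Language News Documents]
Amelia Sahira Rahma; Vit Zuraida; Dimas Fanny Hebrasianto Permadi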
NJCA (Nusantara Journal of Computers and Its Applications) Vol 2, No 1 (2017): Juni 2017
Publisher : Computer Society of Nahdlatul Ulama (CSNU) Indonesia

DOI: 10.36564/njca.v2i1.25

Abstract

The large number of digital news documents in Indonesian has created a need for automatic, topic-based document clustering, so that readers can more easily find news articles on the same topic. One of the major problems in document clustering is the low relevance of the clustering results: documents are not grouped under their appropriate topics. This paper proposes a new term weighting method that combines a corpus-based thesaurus and a dictionary-based thesaurus to account for conceptual similarity between terms. The method is evaluated by applying the K-Means algorithm to 253 news documents in Indonesian. Experimental results show that the proposed term weighting method is able to achieve good performance.
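
The abstract does not give the weighting formula, so the following Python sketch is only one plausible reading of the idea, not the authors' method. It assumes plain TF-IDF as the base weight, merges two toy thesauri by taking the larger similarity, passes a share of each term's weight to its related terms, and clusters with a small K-Means variant that assigns documents by cosine similarity. The function names, the similarity scores, and the four toy documents are all hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Smoothed TF-IDF vectors (dict: term -> weight) for tokenized documents."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return [{t: c * (math.log((1 + n) / (1 + df[t])) + 1)
             for t, c in Counter(d).items()} for d in docs]

def merge(dictionary_based, corpus_based):
    """Assumed combination rule: keep the larger similarity for each term pair."""
    merged = {}
    for source in (dictionary_based, corpus_based):
        for term, related in source.items():
            slot = merged.setdefault(term, {})
            for other, sim in related.items():
                slot[other] = max(sim, slot.get(other, 0.0))
    return merged

def expand(vec, thesaurus):
    """Give each related term a share (similarity * weight) of a term's weight."""
    out = dict(vec)
    for term, weight in vec.items():
        for related, sim in thesaurus.get(term, {}).items():
            out[related] = out.get(related, 0.0) + sim * weight
    return out

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mean(vecs):
    total = Counter()
    for v in vecs:
        total.update(v)  # adds the weights term by term
    return {t: w / len(vecs) for t, w in total.items()}

def kmeans(vecs, seeds, iters=10):
    """K-Means variant: cosine-based assignment, mean centroids, fixed seeds."""
    centroids = [vecs[i] for i in seeds]
    labels = []
    for _ in range(iters):
        labels = [max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
                  for v in vecs]
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vecs, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels

# Toy data (hypothetical): two documents about cars, two about football.
docs = [["mobil", "baru"], ["kendaraan", "mahal"],
        ["sepakbola", "liga"], ["gol", "menang"]]
dictionary_based = {"mobil": {"kendaraan": 0.8}, "kendaraan": {"mobil": 0.8}}
corpus_based = {"sepakbola": {"gol": 0.6}, "gol": {"sepakbola": 0.6}}

thesaurus = merge(dictionary_based, corpus_based)
plain = tfidf(docs)
expanded = [expand(v, thesaurus) for v in plain]
labels = kmeans(expanded, seeds=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

With plain TF-IDF the first two documents share no terms, so their cosine similarity is zero. After the thesaurus expansion they overlap on "mobil" and "kendaraan", which is what lets the clustering step place them together.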