JIKSI (Jurnal Ilmu Komputer dan Sistem Informasi)
Vol 2, No 2 (2014): Jurnal Ilmu Komputer dan Sistem Informasi

K-MEANS CLUSTERING FOR AN INDONESIAN-LANGUAGE QUESTION ANSWERING SYSTEM IN THE HEALTH DOMAIN (CLUSTERING K-MEANS UNTUK SISTEM TANYA JAWAB BAHASA INDONESIA BIDANG KESEHATAN)

Steven Muliadi
Viny Christanti



Article Info

Publish Date
31 Aug 2014

Abstract

A question answering (QA) system answers questions based on collections of unstructured text documents written in natural (human) language. In general, a QA system consists of four stages: question analysis, document selection, passage retrieval, and answer extraction. In this study, we added two processes: document clustering and passage clustering. K-Means is used for clustering. Naive Bayes classification is used for document or passage selection. Passages are built with Dynamic Passage Partitioning. Document selection is done with Lucene. The experiments were conducted using 100 questions over 1000 Indonesian health documents. The test results show that the system without clustering achieves the best accuracy, 63%. The system produces its best results when using the 5 most relevant documents, the 5 highest-scoring passages, and the 10 answers with the closest distance.

Keywords: Clustering K-Means, Dynamic Passage Partitioning, Health, Information Retrieval, Naive Bayes Classification, Question Answering
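
As a rough illustration of the document clustering step mentioned in the abstract, the following is a minimal sketch of K-Means (Lloyd's algorithm) over term-frequency vectors. The abstract does not state the features, distance measure, or number of clusters used in the study, so the term-frequency weighting, Euclidean distance, `k=2`, the toy documents, and the helper names `tf_vector`, `euclidean`, and `kmeans` are all assumptions made for this example and are not taken from the paper.

```python
import math
import random
from collections import Counter


def tf_vector(text, vocab):
    """Term-frequency vector over a fixed vocabulary (assumed feature choice)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids with k randomly chosen input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy documents (not from the study's collection).
docs = [
    "demam batuk pilek flu",
    "flu batuk demam",
    "diabetes gula darah insulin",
    "insulin gula darah diabetes",
]
vocab = sorted({word for doc in docs for word in doc.split()})
vectors = [tf_vector(doc, vocab) for doc in docs]
labels, centroids = kmeans(vectors, k=2)
```

On this toy input the two flu-related documents end up in one cluster and the two diabetes-related documents in the other; which cluster receives index 0 depends on the random initialisation. In a pipeline like the one the abstract outlines, such cluster labels could serve as the classes that a classifier then selects among, though the abstract does not spell out that wiring.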

Copyrights © 2014






Journal Info

Abbrev

jiksi

Publisher

Subject

Computer Science & IT; Mathematics; Other

Description

Jurnal Ilmu Komputer dan Sistem Informasi (JIKSI) is published by the Faculty of Information Technology, Universitas Tarumanagara (FTI Untar), Jakarta, as a medium for publishing the scholarly work of students in the Informatics Engineering and Information Systems study programs at FTI Untar. The scholarly works produced take the form of results of ...