TELKOMNIKA (Telecommunication Computing Electronics and Control)
Vol 14, No 1: March 2016

Improving Multi-Document Summary Method Based on Sentence Distribution

Aminul Wahib (Institut Teknologi Sepuluh Nopember)
Agus Zainal Arifin (Institut Teknologi Sepuluh Nopember)
Diana Purwitasari (Institut Teknologi Sepuluh Nopember)



Article Info

Publish Date
01 Mar 2016

Abstract

Automatic multi-document summarization has been developed by researchers. The method used to select sentences from the source documents determines the quality of the resulting summary. One of the most popular sentence-weighting methods calculates the frequency of occurrence of the words that form each sentence. However, selecting sentences this way can yield sentences that do not optimally represent the content of the source documents, because sentence weights are measured only by the number of word occurrences. This study proposes a new sentence-weighting strategy based on sentence distribution, which selects the most important sentences by attending to the elements of a sentence that are formed as a distribution of words. The sentence distribution method enables the extraction of important sentences in multi-document summarization and serves as a strategy to improve the quality of the summary sentences. Three concepts are used in this study: (1) clustering sentences with similarity-based histogram clustering, (2) ordering clusters by cluster importance, and (3) selecting important sentences by sentence distribution. Experimental results show that the proposed method performs better than the SIDeKiCK and LIGI methods. On ROUGE-1, the proposed method improves by 3% over SIDeKiCK and by 5.1% over LIGI. On ROUGE-2, it improves by 13.7% over SIDeKiCK and by 14.4% over LIGI.
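
For readers unfamiliar with the ROUGE-1 and ROUGE-2 scores cited in the abstract, the following is a minimal sketch of ROUGE-N recall: the fraction of reference n-grams that also appear in the candidate summary, with clipped counts. It is not the authors' evaluation setup. The function names `ngrams` and `rouge_n_recall`, the lowercase whitespace tokenization, and the use of recall (rather than precision or F-score) are all assumptions made for illustration; the abstract does not state which ROUGE configuration was used.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall of a candidate summary against one reference summary.

    Assumption: lowercase whitespace tokenization, with no stemming or
    stopword removal.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match so a reference n-gram is credited at most as many
    # times as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, n=1))  # unigram overlap
    print(rouge_n_recall(candidate, reference, n=2))  # bigram overlap
```

ROUGE-2 is typically much lower than ROUGE-1 for the same pair of summaries, because matching two consecutive words is a stricter requirement than matching single words.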

Copyrights © 2016






Journal Info

Abbrev

TELKOMNIKA

Publisher

Subject

Computer Science & IT

Description

Submitted papers are evaluated by anonymous referees through single-blind peer review for contribution, originality, relevance, and presentation. The Editor shall inform you of the results of the review as soon as possible, hopefully within 10 weeks. Please note that because of the great number of ...