Found 5 Documents
Journal : Journal of ICT Research and Applications

Automatic Tailored Multi-Paper Summarization based on Rhetorical Document Profile and Summary Specification
Masayu Leylia Khodra; Dwi Hendratmo Widyantoro; E. Aminudin Aziz; Bambang Riyanto Trilaksono
Journal of ICT Research and Applications Vol. 6 No. 3 (2012)
Publisher : LPPM ITB

DOI: 10.5614/itbj.ict.2012.6.3.4

Abstract

To help researchers cope with time constraints and low relevance when using scientific articles, automatic tailored multi-paper summarization (TMPS) is proposed. In this paper, we extend Teufel's tailored summary to deal with multiple papers and a more flexible representation of user information needs. Our TMPS extracts a Rhetorical Document Profile (RDP) from each paper and presents a summary based on user information needs. Building Plan Language (BPLAN) is introduced as a formalization of Teufel's building plan and is used to represent the summary specification, which is a more flexible representation of user information needs. Surface repair is embedded within BPLAN to improve the readability of extractive summaries. Our experiment shows that the average performance of the RDP extraction module is 94.46%, which promises high-quality extracts for summary composition. Generality evaluation shows that BPLAN is flexible enough to compose various forms of summary. Subjective evaluation provides evidence that surface repair operators can improve the readability of the resulting summary.
Automatic Title Generation in Scientific Articles for Authorship Assistance: A Summarization Approach
Jan Wira Gotama Putra; Masayu Leylia Khodra
Journal of ICT Research and Applications Vol. 11 No. 3 (2017)
Publisher : LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2017.11.3.3

Abstract

This paper presents a study on automatic title generation for scientific articles considering sentence information types known as rhetorical categories. A title can be seen as a high-compression summary of a document. A rhetorical category is an information type conveyed by the author of a text for each textual unit, for example: background, method, or result of the research. The experiment in this study focused on extracting the research purpose and research method information for inclusion in a computer-generated title. Sentences are classified into rhetorical categories, after which these sentences are filtered using three methods. Three title candidates whose contents reflect the filtered sentences are then generated using a template-based or an adaptive K-nearest neighbor approach. The experiment was conducted using two different dataset domains: computational linguistics and chemistry. Our study obtained a 0.109-0.255 F1-measure score on average for computer-generated titles compared to original titles. In a human evaluation the automatically generated titles were deemed 'relatively acceptable' in the computational linguistics domain and 'not acceptable' in the chemistry domain. It can be concluded that rhetorical categories have unexplored potential to improve the performance of summarization tasks in general.
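As an illustration of what an F1-measure between a computer-generated title and an original title can look like, here is a minimal sketch. It assumes a simple lowercase, whitespace-tokenized word-overlap F1; the abstract does not state how the paper computes its score, so the function name `title_f1` and the tokenization are hypothetical and the reported 0.109-0.255 range is not reproduced here.

```python
from collections import Counter


def title_f1(generated: str, reference: str) -> float:
    """Word-overlap F1 between a generated title and a reference title.

    Assumption: lowercase whitespace tokens; not the paper's exact metric.
    """
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()
    if not gen_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection: a shared word counts at most as often
    # as it occurs in both titles.
    overlap = sum((Counter(gen_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen_tokens)  # share of generated words that match
    recall = overlap / len(ref_tokens)     # share of reference words recovered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = title_f1(
        "automatic title generation",
        "automatic title generation for scientific articles",
    )
    print(round(score, 3))
```

A short generated title that only partly covers the original is penalized through recall, which is one plausible reason low average scores can coexist with titles that human judges still find acceptable.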
Word Embedding for Rhetorical Sentence Categorization on Scientific Articles
Ghoziyah Haitan Rachman; Masayu Leylia Khodra; Dwi Hendratmo Widyantoro
Journal of ICT Research and Applications Vol. 12 No. 2 (2018)
Publisher : LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2018.12.2.5

Abstract

A common task in summarizing scientific articles is employing the rhetorical structure of sentences. Determining the rhetorical category of a sentence is itself a text categorization task. To achieve good performance, some work in text categorization has employed word embedding. This paper presents rhetorical sentence categorization of scientific articles using word embedding to capture semantically similar words. A comparison of Word2Vec and GloVe is shown. First, two experiments are evaluated using five classifiers, namely Naïve Bayes, Linear SVM, IBK, J48, and Maximum Entropy. Then, the best classifier from the first two experiments was employed. This research showed that Word2Vec CBOW performed better than Skip-Gram and GloVe. The best experimental result was from Word2Vec CBOW for 20,155 resource papers from ACL-ARC, features from Teufel and the previous label feature. In this experiment, Linear SVM produced the highest F-measure performance at 43.44%.
Using Graph Pattern Association Rules on Yago Knowledge Base
Wahyudi Wahyudi; Masayu Leylia Khodra; Ary Setijadi Prihatmanto; Carmadi Machbub
Journal of ICT Research and Applications Vol. 13 No. 2 (2019)
Publisher : LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2019.13.2.6

Abstract

The use of graph pattern association rules (GPARs) on the Yago knowledge base is proposed. Extending association rules for itemsets, GPARs can help to discover regularities between entities in a knowledge base. A rule-generated graph pattern (RGGP) algorithm was used for extracting rules from the Yago knowledge base and a GPAR algorithm for creating the association rules. Our research resulted in 1114 association rules, with standard confidence (50.18%) performing better than partial completeness assumption (PCA) confidence (49.82%). In addition, the computation time for standard confidence was better than for PCA confidence.
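To make the two confidence measures named in the abstract above concrete, here is a minimal sketch that assumes the definitions commonly used in knowledge-base rule mining. Standard confidence divides a rule's support by all body matches; PCA confidence divides it only by body matches whose subject has at least one known fact for the head predicate. The rule shape (a single body atom implying a single head atom), the toy triples, and the function name `rule_confidences` are hypothetical. This is not the RGGP or GPAR algorithm from the paper and does not reproduce its figures.

```python
def rule_confidences(triples, body_pred, head_pred):
    """Return (standard, PCA) confidence for body_pred(x, y) => head_pred(x, y).

    triples: iterable of (subject, predicate, object) tuples.
    Assumption: single-atom body and head, for illustration only.
    """
    body = {(s, o) for s, p, o in triples if p == body_pred}
    head = {(s, o) for s, p, o in triples if p == head_pred}
    head_subjects = {s for s, _ in head}

    # Support: pairs for which both the body and the head hold.
    support = len(body & head)

    # Standard confidence treats every body match without a head fact
    # as a counter-example.
    std_denominator = len(body)

    # PCA confidence only counts body matches whose subject has some
    # head fact recorded; subjects with no head fact are ignored.
    pca_denominator = sum(1 for s, _ in body if s in head_subjects)

    standard = support / std_denominator if std_denominator else 0.0
    pca = support / pca_denominator if pca_denominator else 0.0
    return standard, pca


if __name__ == "__main__":
    toy_kb = [
        ("ann", "livesIn", "paris"), ("ann", "wasBornIn", "paris"),
        ("bob", "livesIn", "rome"), ("bob", "wasBornIn", "milan"),
        ("cat", "livesIn", "oslo"),  # no birthplace recorded for cat
        ("dan", "livesIn", "lima"), ("dan", "wasBornIn", "lima"),
    ]
    print(rule_confidences(toy_kb, "livesIn", "wasBornIn"))
```

In the toy data, `cat` counts against the rule under standard confidence but is skipped under PCA confidence because no birthplace is recorded for her. The PCA variant also needs an extra lookup per body match.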
The Evaluation of DyHATR Performance for Dynamic Heterogeneous Graphs
Nasy`an Taufiq Al Ghifari; Gusti Ayu Putri Saptawati; Masayu Leylia Khodra; Benhard Sitohang
Journal of ICT Research and Applications Vol. 17 No. 2 (2023)
Publisher : DRPM - ITB

DOI: 10.5614/itbj.ict.res.appl.2023.17.2.7

Abstract

Dynamic heterogeneous graphs can represent real-world networks. Predicting links in these graphs is more complicated than in static graphs. Until now, research on link prediction has focused on static heterogeneous graphs or dynamic homogeneous graphs. A link prediction technique combining a temporal RNN and hierarchical attention, called DyHATR, has recently emerged. This method is claimed to work on dynamic heterogeneous graphs, based on tests on four publicly available data sets (Twitter, Math-Overflow, Ecomm, and Alibaba). However, further analysis showed that the four data sets did not meet the criteria of dynamic heterogeneous graphs. In the present work, we evaluated the performance of DyHATR on dynamic heterogeneous graphs. We conducted experiments with DyHATR based on the Yelp data set represented as a dynamic heterogeneous graph consisting of homogeneous subgraphs. The results show that DyHATR can be applied to link prediction on dynamic heterogeneous graphs by simultaneously capturing heterogeneous information and evolutionary patterns, and then using them to carry out link prediction. Compared to the baseline method, the accuracy achieved by DyHATR is competitive, although the results can still be improved.