JURTEKSI
Vol 10, No 1 (2023): December 2023

ABSTRACTIVE-BASED AUTOMATIC TEXT SUMMARIZATION ON INDONESIAN NEWS USING GPT-2

Aini Nur Khasanah (Universitas Amikom Yogyakarta)
Mardhiya Hayaty (Universitas Amikom Yogyakarta)



Article Info

Publish Date
06 Dec 2023

Abstract

Automatic text summarization is a challenging research area in natural language processing that aims to obtain important information quickly and precisely. There are two main approaches to text summarization: abstractive and extractive. Abstractive summarization generates new, more natural wording, but it is more difficult to achieve. In previous studies, RNNs and their variants were among the most popular Seq2Seq models for text summarization. However, they have weaknesses in retaining memory: gradients vanish over long sentences, which degrades the quality of summaries for lengthy texts. This research proposes a Transformer model with an attention mechanism that can capture important information, solve parallelization problems, and summarize long texts. The Transformer model we propose is GPT-2. GPT-2 uses decoders to predict the next word; we use the pre-trained model w11wo/indo-gpt2-small and apply it to the Indosum Indonesian dataset. Model performance is evaluated using ROUGE. The average recall scores for R-1, R-2, and R-L were 0.61, 0.51, and 0.57, respectively. The generated summaries can paraphrase sentences, but some still reuse the original words from the text. Future work will increase the amount of training data to improve the generation of newly paraphrased sentences.
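To make the reported metric concrete, the following is a minimal sketch of how ROUGE recall for R-1, R-2, and R-L can be computed. It is not the authors' evaluation code: the abstract does not say which ROUGE implementation or tokenization was used, so the lowercasing, the whitespace tokenization, the function names, and the toy sentences here are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n):
    """ROUGE-N recall: overlapping n-grams divided by n-grams in the reference."""
    # Assumption: lowercase + whitespace tokenization (not stated in the abstract).
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(reference, candidate):
    """ROUGE-L recall: LCS length divided by the reference length."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref:
        return 0.0
    return lcs_length(ref, cand) / len(ref)


# Hypothetical toy pair, not taken from the Indosum dataset.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"

print(f"R-1 recall: {rouge_n_recall(reference, candidate, 1):.2f}")  # 0.83
print(f"R-2 recall: {rouge_n_recall(reference, candidate, 2):.2f}")  # 0.60
print(f"R-L recall: {rouge_l_recall(reference, candidate):.2f}")     # 0.83
```

Recall here measures how much of the reference summary is recovered by the generated one, so a model that copies long spans from the source text can score well without paraphrasing much, which is consistent with the observation that some summaries still reuse the original words.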

Copyrights © 2023






Journal Info

Abbrev

jurteksi

Publisher

Subject

Computer Science & IT

Description

JURTEKSI (Jurnal Teknologi dan Sistem Informasi) is a scientific journal published by STMIK Royal Kisaran. The journal is published twice a year, in December and June. It contains a collection of research in information technology and computer ...