Setia, Cuncun
Unknown Affiliation

An End-to-End NLP Pipeline for Abstractive Summarization and Entity Extraction of Indonesian-Language News Based on Transformer Models
Setia, Cuncun; Rukhviyanti, Novi
Jurnal Informatika: Jurnal Pengembangan IT Vol 11, No 1 (2026)
Publisher : Politeknik Harapan Bersama

DOI: 10.30591/jpit.v11i1.10030

Abstract

The rapid growth of online news content makes it difficult for readers to capture core information quickly and accurately. This research proposes and implements an automated end-to-end pipeline that integrates three main stages: data acquisition, abstractive text summarization, and Named Entity Recognition (NER). The mT5 model is used to generate coherent, concise summaries, while the BERT model is used to extract key entities, including persons, organizations, and locations. The pipeline was evaluated on 100 news articles from the Egindo portal. Experimental results show that the system achieves an average text reduction of 62.47% with a ROUGE-1 F1 score of 0.473. On the NER task, the pipeline reached a Micro-F1 score close to 0.70, outperforming traditional approaches such as TextRank and CRF. These results demonstrate that integrating Transformer-based models within a structured pipeline significantly improves summarization quality and entity extraction accuracy. The study contributes a practical NLP solution for the Indonesian language in the form of a functional prototype that can be applied to online media analysis and media intelligence.
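
The three figures reported in the abstract (text reduction, ROUGE-1 F1, and Micro-F1) can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes whitespace tokenization, a word-count definition of text reduction (the abstract does not say whether reduction is measured in words or characters), and exact-match scoring of entities represented as hypothetical (text, type) pairs. The function names and sample data are illustrative only.

```python
from collections import Counter


def text_reduction(article: str, summary: str) -> float:
    """Percentage of words removed when going from article to summary.

    Assumption: reduction is measured on whitespace-separated words.
    """
    n_article = len(article.split())
    n_summary = len(summary.split())
    return 100.0 * (1 - n_summary / n_article)


def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Clipped unigram overlap: each token counts at most as often
    # as it appears in both texts.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def micro_f1(gold: list, pred: list) -> float:
    """Micro-averaged F1 over per-document sets of (text, type) entities.

    Assumption: an entity is correct only on an exact match of both fields.
    """
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    gold = [{("Budi", "PER"), ("Jakarta", "LOC")}, {("Pertamina", "ORG")}]
    pred = [{("Budi", "PER")}, {("Pertamina", "ORG"), ("Bandung", "LOC")}]
    print(text_reduction("a b c d e f g h i j", "a b c d"))
    print(rouge1_f1("presiden mengunjungi jakarta hari ini",
                    "presiden mengunjungi jakarta"))
    print(micro_f1(gold, pred))
```

Micro-averaging pools true positives, false positives, and false negatives across all documents before computing precision and recall, so articles with many entities weigh more than articles with few.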