The rapid growth of online news content makes it difficult for readers to grasp core information quickly and accurately. This research proposes and implements an automated end-to-end pipeline with three main stages: data acquisition, abstractive text summarization, and Named Entity Recognition (NER). The mT5 model generates coherent, concise summaries, while a BERT model extracts key entities, including persons, organizations, and locations. The pipeline was evaluated on 100 news articles from the Egindo portal. Experimental results show that the system achieves an average text reduction of 62.47% and a ROUGE-1 F1 score of 0.473. For NER, the pipeline reached a Micro-F1 score close to 0.70, outperforming traditional approaches such as TextRank and CRF. These results indicate that integrating Transformer-based models within a structured pipeline significantly improves summarization quality and entity extraction accuracy. The study contributes a practical NLP solution for the Indonesian language in the form of a functional prototype applicable to online media analysis and media intelligence.
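For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how text reduction, ROUGE-1 F1, and Micro-F1 are commonly computed. It is an illustration only: the function names, the whitespace tokenization, and the exact-match scoring of `(type, text)` entity pairs are assumptions, and the study's actual evaluation tooling may differ.

```python
from collections import Counter


def reduction_rate(article: str, summary: str) -> float:
    """Percentage of words removed by the summary (assumes whitespace tokens)."""
    n_article = len(article.split())
    if n_article == 0:
        raise ValueError("article must contain at least one word")
    return 100.0 * (1 - len(summary.split()) / n_article)


def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Counter intersection clips each unigram to its minimum count.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def micro_f1(gold: list[set], predicted: list[set]) -> float:
    """Micro-F1: pool true/false positives and false negatives over all documents."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # entities found and correct
        fp += len(p - g)   # entities predicted but not in the gold set
        fn += len(g - p)   # gold entities that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a 10-word article summarized in 4 words gives a reduction of 60%, and Micro-F1 weights every entity equally regardless of which article it comes from.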
Copyrights © 2026