Purpose – This research addresses information overload on online news portals by developing an automated text summarization system that generates abstractive summaries while preserving essential entities. It also aims to improve the coherence and quality of summaries relative to conventional extractive methods.
Methods/approach – This research uses a quantitative, experimental approach on 100 news articles about the Israel–Iran conflict collected from CNN via RSS. The proposed system integrates the mT5 model for abstractive summarization with the multilingual BERT model for Named Entity Recognition (NER). The stages comprise data acquisition, preprocessing, preparation of reference summaries, automated summarization, entity extraction, and evaluation using reduction rates and ROUGE metrics.
Findings – The system produces summaries with an average reduction rate of 89.83%, so that the summary length is only about 10.17% of the original text. Evaluation yields a ROUGE-1 score of 0.4095, ROUGE-2 of 0.2356, and ROUGE-L of 0.3442. The mT5 pipeline model achieved marginally higher ROUGE-1 and ROUGE-L scores, whereas the baseline mT5 model had a slight advantage on ROUGE-2. By contrast, the extractive TextRank method lagged significantly behind both transformer-based models, particularly in generating fluent and contextually coherent summaries.
Research limitations – The data cover a single conflict domain, and entity classification errors arise from lexical ambiguity and the model's limited contextual understanding; both may affect the generalization and accuracy of the system.
Originality – This research integrates abstractive summarization and entity extraction within a structured pipeline, thereby producing summaries that are not only concise but also more informative and organized.
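
As an illustration of the two evaluation measures named in the abstract, the following minimal Python sketch computes a reduction rate and a ROUGE-1 F1 score. It is not the study's evaluation code: the function names are hypothetical, length is assumed to be measured in whitespace-separated words, and ROUGE-1 is approximated by lowercase unigram overlap without stemming, so its values may differ from those of a standard ROUGE package.

```python
from collections import Counter


def reduction_rate(original: str, summary: str) -> float:
    """Percentage by which the summary is shorter than the original.

    Assumption: length is counted in whitespace-separated words.
    """
    original_len = len(original.split())
    if original_len == 0:
        raise ValueError("original text is empty")
    return (1 - len(summary.split()) / original_len) * 100


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap, no stemming."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a reduction rate of 89.83% corresponds to a summary that is about 10.17% as long as its source, since the two percentages sum to 100.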
Copyrights © 2026