This study evaluates the performance of three Transformer models: Transformer from Scratch, BART (Bidirectional and Auto-Regressive Transformers), and BERT (Bidirectional Encoder Representations from Transformers) on the task of summarizing news documents. The evaluation shows that BERT excels at capturing the bidirectional context of text, with a ROUGE-1 score of 0.2471, ROUGE-2 of 0.1597, and ROUGE-L of 0.1597. BART demonstrates strong denoising ability and produces coherent summaries, with a ROUGE-1 score of 0.5239, ROUGE-2 of 0.3517, and ROUGE-L of 0.3683. Transformer from Scratch, despite requiring large amounts of training data and computational resources, achieves the best results of the three when trained optimally, with a ROUGE-1 score of 0.7021, ROUGE-2 of 0.5652, and ROUGE-L of 0.6383. This evaluation provides insight into the strengths and weaknesses of each model in the context of news document summarization.
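For reference, the ROUGE values above measure n-gram (ROUGE-1, ROUGE-2) and longest-common-subsequence (ROUGE-L) overlap between a generated summary and a human reference, and are typically reported as F1 scores. The following is a minimal sketch of such an evaluation, assuming the open-source rouge-score package; the paper does not specify which implementation it used, and the example texts are hypothetical.

```python
# Minimal sketch: scoring one generated summary against one reference.
# Assumes `pip install rouge-score`; the texts below are illustrative only.
from rouge_score import rouge_scorer

reference = "The central bank raised interest rates to curb inflation."
candidate = "The bank raised rates to fight rising inflation."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)  # signature: score(target, prediction)

for metric, result in scores.items():
    # Each result holds precision, recall, and fmeasure; papers usually report F1.
    print(f"{metric}: F1 = {result.fmeasure:.4f}")
```

In practice, scores like those reported above would be averaged over all document–summary pairs in the test set.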