ILKOMNIKA: Journal of Computer Science and Applied Informatics
Vol 7 No 3 (2025): Volume 7, Number 3, December 2025

A Comparative Study of Extractive and Generative Approaches for Indonesian Meeting Minutes Summarization

Harliana, Harliana
Sismoro, Heri



Article Info

Publish Date
31 Dec 2025

Abstract

This study compares extractive and generative approaches for automatic summarization of Indonesian meeting minutes. Our main scientific contribution is the empirical claim that, under strict zero-shot conditions and without domain adaptation, simple extractive baselines are more reliable than off-the-shelf generative models in preserving both decision content and meeting-context cues (actors/roles). We evaluate three extractive baselines (Lead-3, Random-Extract, TextRank-Simple) against an Indonesian GPT-2 model tested under multiple decoding configurations and an mT5 sequence-to-sequence model in a zero-shot setting. Experiments use 30 manually curated meeting minutes. The dataset size is intentionally limited because meeting minutes are heterogeneous and require carefully constructed reference summaries to ensure evaluation validity; the study is positioned as a controlled diagnostic comparison rather than a training or adaptation effort. Performance is measured using ROUGE-1/2/L, summary-to-reference length ratios, simple audits of gender and professional role mentions, correlations between decoding parameters and ROUGE, and paired t-tests. Results show that extractive methods achieve higher and more stable ROUGE scores than zero-shot generative models. TextRank-Simple and Random-Extract perform best, while all GPT-2 configurations remain substantially lower, and mT5 zero-shot fails to align with references. Decoding parameters exhibit only weak correlations with generative performance, and paired t-tests confirm statistically significant differences (p < 0.05). Overall, extractive approaches remain the most dependable choice without in-domain fine-tuning, while generative models are more suitable with adaptation or hybrid strategies.
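
To make the evaluation setup in the abstract more concrete, the following is a minimal Python sketch of a Lead-3 style baseline together with ROUGE-1 F1 and the summary-to-reference length ratio. It is an illustration under stated assumptions, not the authors' implementation: the function names (split_sentences, lead_n, tokens, rouge1_f1, length_ratio), the regex-based sentence splitter, the lowercased word tokenization, and the toy Indonesian text are all hypothetical choices made for this example.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def lead_n(text, n=3):
    # Lead-3 style baseline: keep the first n sentences unchanged.
    return " ".join(split_sentences(text)[:n])


def tokens(text):
    # Lowercased word tokens (assumption); a real setup may use an Indonesian tokenizer.
    return re.findall(r"\w+", text.lower())


def rouge1_f1(candidate, reference):
    # Unigram overlap with clipped counts, combined into an F1 score.
    cand, ref = Counter(tokens(candidate)), Counter(tokens(reference))
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def length_ratio(candidate, reference):
    # Summary-to-reference length ratio, measured in tokens.
    ref_len = len(tokens(reference))
    return len(tokens(candidate)) / ref_len if ref_len else 0.0


if __name__ == "__main__":
    # Toy minutes and reference, invented purely for illustration.
    minutes = (
        "Rapat dibuka oleh ketua. Tim menyetujui anggaran baru. "
        "Sekretaris mencatat keputusan. Rapat ditutup pukul lima."
    )
    reference = "Tim menyetujui anggaran baru."
    summary = lead_n(minutes)
    print(summary)
    print(round(rouge1_f1(summary, reference), 3))
    print(length_ratio(summary, reference))
```

In this toy case the extractive summary fully covers the reference (high recall) but is almost three times longer, which is the kind of effect a length ratio reported alongside ROUGE is meant to expose.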

Copyright © 2025






Journal Info

Abbrev

ilkomnika

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

ILKOMNIKA: Journal of Computer and Applied Informatics is a peer-reviewed open-access journal. The journal invites scientists and engineers throughout the world to exchange and disseminate theoretical and practice-oriented topics of computer science and applied informatics which covers five (5) ...