The development of transformer-based natural language processing (NLP) has brought significant progress to question answering (QA) systems. This study compares three models, namely BERT, Sequence-to-Sequence (S2S), and the Generative Pre-trained Transformer (GPT), on understanding and answering context-based questions using the SQuAD 2.0 dataset translated into Indonesian. The research follows the SEMMA (Sample, Explore, Modify, Model, Assess) methodology to keep the analysis systematic and efficient. The models were evaluated with the exact match (EM), F1-score, and ROUGE metrics. Results show that BERT performs best, with an EM of 99.57%, an F1-score of 99.57%, ROUGE-1 of 97%, ROUGE-2 of 30%, and ROUGE-L of 97%, outperforming the S2S and GPT models. These results indicate that BERT is more effective at understanding and capturing Indonesian context in QA tasks. The study offers insights into the implementation of Indonesian-language QA and can serve as a reference for developing more accurate and efficient NLP systems.