Muhammad Akmaluddin Az Zamrudi
UNIVERSITAS ISLAM NEGERI MAULANA MALIK IBRAHIM MALANG

Articles


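Evaluation of the Accuracy and Precision of Large Language Models (LLMs) in Generating User Stories for Software (original title: Evaluasi Akurasi dan Presisi Large Language Model (LLM) dalam Generasi User Story untuk Perangkat Lunak)
Maulana Nur Rokhim; Muhammad Akmaluddin Az Zamrudi; Muhammad Ainul Yaqin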
Jurnal Ilmiah Informatika Vol. 10 No. 1 (2025): Jurnal Ilmiah Informatika
Publisher : Department of Science and Technology Ibrahimy University

DOI: 10.35316/jimi.v10i1.48-60

Abstract

Writing effective user stories is essential but time-consuming in software development, especially in large-scale Agile projects. This study evaluates three Large Language Models (LLMs), ChatGPT-4.0, DeepSeek, and Gemini 2.5, on automatic user story generation. The objective is to compare their accuracy and precision and identify the most suitable model for automating requirements documentation. Each model generated user stories from seven test prompts drawn from various industry domains, and the outputs were evaluated with the BLEU-4, ROUGE-L F1, and METEOR metrics. All three models produced structurally valid outputs, but Gemini 2.5 achieved the highest average score (0.386), ahead of DeepSeek (0.355) and ChatGPT (0.348). Gemini 2.5 also demonstrated superior consistency, clarity, and semantic completeness. This research contributes a performance benchmark for LLMs in software requirements generation and highlights the practical benefits of LLM-based automation over manual methods, including speed, consistency, and adaptability. Gemini 2.5 is recommended as the optimal model for generating user stories in software engineering contexts.
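
As an illustration of one of the metrics named in the abstract, the following is a minimal sketch of ROUGE-L F1, which scores a generated user story by the longest common subsequence (LCS) of tokens it shares with a reference story. The abstract does not state which tooling or tokenization the authors used, so the lowercase whitespace tokenization, the equal weighting of precision and recall, and the function names `lcs_length` and `rouge_l_f1` are assumptions made here for illustration only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Dynamic programming that keeps only the previous row of the table.
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 between a generated text and a reference text.

    Assumption: lowercase whitespace tokenization and beta = 1,
    so precision and recall are weighted equally.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of generated tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Hypothetical user stories, not taken from the study's prompts.
reference = "As a customer I want to track my order so that I know when it arrives"
generated = "As a customer I want to see my order status so that I know the delivery date"
score = rouge_l_f1(generated, reference)
```

A score of 1.0 means the generated story matches the reference token for token, and 0.0 means no token is shared. BLEU-4 and METEOR follow a different logic (n-gram precision and alignment with stemming or synonyms, respectively) and are not covered by this sketch.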