I Gusti Agung Putu Mahendra
Politeknik Negeri Bengkalis



Improving FAQ Retrieval for Academic Regulations Using Semantic Embeddings and LLM Question Augmentation
Fajri Profesio Putra; I Gusti Agung Putu Mahendra; Agus Tedyyana; Muhammad Noor
Jurnal Testing dan Implementasi Sistem Informasi Vol. 4 No. 1 (2026): Jurnal Testing dan Implementasi Sistem Informasi
Publisher : Lembaga Riset dan Inovasi Almatani

DOI: 10.55583/jtisi.v4i1.2176

Abstract

Academic regulations in higher education are often documented in lengthy, formal handbooks, making it difficult for students to find relevant information using everyday language. This study developed a semantic FAQ retrieval system for academic regulations using IndoSBERT and question augmentation. The FAQ corpus was constructed from official academic and internship documents, yielding 92 FAQ entries across 33 topical categories. Seed questions were generated from category–keyword pairs and expanded using simple rule-based augmentation and FLAN-T5-based paraphrasing. The resulting dataset was divided using an 80:10:10 train–validation–test split. IndoSBERT was fine-tuned with Multiple Negatives Ranking Loss under three configurations: baseline, baseline with simple augmentation, and baseline with simple plus LLM-based augmentation. Retrieval performance was measured using Recall@1, Recall@3, Recall@5, and Mean Reciprocal Rank (MRR). The best result was achieved by the simple plus LLM augmentation configuration, with Recall@1 of 0.7848, Recall@5 of 0.8987, and MRR of 0.8396. These findings indicate that LLM-based question augmentation improves semantic retrieval robustness while keeping answers grounded in curated academic regulations.
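
For readers unfamiliar with the retrieval metrics named in the abstract, the following is a minimal sketch of how Recall@k and Mean Reciprocal Rank are commonly computed. It is not the authors' evaluation code. It assumes each test question has exactly one correct FAQ entry, and the function names, FAQ ids, and toy rankings are hypothetical illustrations, not data from the study.

```python
from typing import Sequence


def recall_at_k(ranked_ids: Sequence[Sequence[str]],
                gold_ids: Sequence[str],
                k: int) -> float:
    """Fraction of queries whose gold FAQ id appears among the top-k results."""
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        return 0.0
    hits = sum(1 for ranked, gold in zip(ranked_ids, gold_ids)
               if gold in ranked[:k])
    return hits / len(gold_ids)


def mean_reciprocal_rank(ranked_ids: Sequence[Sequence[str]],
                         gold_ids: Sequence[str]) -> float:
    """Average of 1/rank of the gold FAQ id; a query contributes 0 if it is absent."""
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        return 0.0
    total = 0.0
    for ranked, gold in zip(ranked_ids, gold_ids):
        for rank, faq_id in enumerate(ranked, start=1):
            if faq_id == gold:
                total += 1.0 / rank
                break
    return total / len(gold_ids)


# Hypothetical toy data: four queries, each with a ranked list of FAQ ids
# (best match first) and one gold FAQ id. Not taken from the paper.
ranked = [
    ["faq_07", "faq_02", "faq_11"],  # gold at rank 1
    ["faq_03", "faq_05", "faq_01"],  # gold at rank 2
    ["faq_09", "faq_04", "faq_05"],  # gold at rank 3
    ["faq_08", "faq_06", "faq_10"],  # gold not retrieved
]
gold = ["faq_07", "faq_05", "faq_05", "faq_12"]

r_at_1 = recall_at_k(ranked, gold, k=1)   # 1 of 4 queries hit at rank 1
r_at_3 = recall_at_k(ranked, gold, k=3)   # 3 of 4 queries hit within top 3
mrr = mean_reciprocal_rank(ranked, gold)  # (1 + 1/2 + 1/3 + 0) / 4
```

Under the single-gold-answer assumption, Recall@1 is the share of questions answered correctly by the top result, while MRR also gives partial credit when the correct entry appears lower in the ranking.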