The administration of thesis seminar and defense scheduling is often hampered by unstructured PDF formats, which increase manual workload and the risk of errors. This study aims to evaluate and compare the performance of three Gemini Flash model variants, namely Gemini 2.0 Flash-Lite, Gemini 2.0 Flash, and Gemini 2.5 Flash Preview, in automating schedule information extraction using a zero-shot prompting approach. The dataset comprises 87 PDF files of thesis seminar and defense schedules (588 entries) from the 2023/2024 academic year, together with 200 question scenarios executed against two context formats: raw extracted text (TXT) and structured JSON data. Performance was evaluated using Precision, Recall, F1-score, Exact-Match, and per-request inference latency. Experimental results show that Gemini 2.5 Flash Preview achieves average F1-scores above 0.98 in both context formats, with a latency of approximately 3.9 seconds per request. In contrast, the smaller-capacity variants (Gemini 2.0 Flash and Flash-Lite) gain more from the JSON format than from raw text, especially on complex question types such as multi-attribute filtering and list retrieval. Error analysis identified tasks requiring numeric aggregation and superlative-value determination as the primary challenge, accounting for approximately 78% of all extraction failures, particularly for the lightweight models. A paired t-test found no statistically significant difference between the two context formats overall (average F1 difference = 0.0077; p = 0.48). This study therefore recommends explicit numeric prompting or rule-based post-processing when employing lightweight models, in order to substantially improve the accuracy of academic schedule information extraction.
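To make the evaluation protocol concrete, the following is a minimal, hypothetical sketch (not the authors' code) of how a per-scenario F1-score over extracted schedule attributes might be computed and how a paired t-test could compare the TXT and JSON context formats; the toy prediction/gold data and field values are assumptions standing in for the 200 question scenarios.

```python
# Hypothetical sketch: per-scenario F1 over extracted attribute values and a
# paired t-test comparing the TXT and JSON context formats. The toy data below
# is illustrative only and does not reproduce the paper's dataset.
from scipy.stats import ttest_rel

def f1_score(predicted: set, gold: set) -> float:
    """F1 over the set of extracted attribute values for one question scenario."""
    if not predicted or not gold:
        return 1.0 if predicted == gold else 0.0
    tp = len(predicted & gold)
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall) if tp else 0.0

# Toy gold answers and model predictions for three scenarios (assumed values).
gold      = [{"R-301", "08:00"}, {"Dr. A"}, {"3"}]
pred_txt  = [{"R-301", "08:00"}, {"Dr. A"}, {"2"}]   # raw-text context
pred_json = [{"R-301", "08:00"}, {"Dr. A"}, {"3"}]   # structured JSON context

f1_txt  = [f1_score(p, g) for p, g in zip(pred_txt, gold)]
f1_json = [f1_score(p, g) for p, g in zip(pred_json, gold)]

# Paired t-test: the same scenarios are answered under both context formats.
t_stat, p_value = ttest_rel(f1_json, f1_txt)
mean_diff = sum(f1_json) / len(f1_json) - sum(f1_txt) / len(f1_txt)
print(f"mean F1 difference = {mean_diff:.4f}, p = {p_value:.2f}")
```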