Garuda - Garba Rujukan Digital

IAES International Journal of Artificial Intelligence (IJ-AI)

Vol 14, No 6: December 2025

Kosayakova, Aknur
Ildar, Kurmashev
Spada, Luigi La
Zeeshan, Nida
Bakyt, Makhabbat
Khuralay, Moldamurat
Abdirashev, Omirzak

Publish Date
01 Dec 2025

Large language models (LLMs) are widely deployed in settings where both reliability and efficiency matter. We present a calibrated, seed-robust empirical comparison of a fine-tuned encoder model (bidirectional encoder representations from transformers (BERT)-base) and a decoder-only in-context model (generative pre-trained transformer (GPT)-2 small) on three benchmarks: the Stanford question answering dataset v2.0 (SQuAD v2.0) and two general language understanding evaluation (GLUE) tasks, multi-genre natural language inference (MNLI) and Stanford sentiment treebank 2 (SST-2). Beyond accuracy, we assess reliability (expected calibration error, with reliability diagrams and confidence–coverage analysis) and efficiency (latency, memory, and throughput) under matched conditions and three fixed seeds. BERT-base yields higher accuracy and lower calibration error, while GPT-2 narrows the gaps under few-shot prompting but remains more sensitive to prompt design and context length. Efficiency benchmarks show that decoder-only prompting incurs near-linear growth in latency and memory as the number of in-context exemplars k increases, whereas fine-tuned encoders maintain a stable per-example cost. These findings offer practical guidance on when to prefer fine-tuning over prompting and demonstrate that reliability must be evaluated alongside accuracy for risk-aware deployment.
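
The two reliability measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the choice of 10 equal-width bins, and the toy inputs are assumptions made for illustration. Expected calibration error is computed here as the bin-weighted average of |accuracy - mean confidence|, and the confidence–coverage view as the accuracy on the most confident fraction of predictions.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed bin count; the abstract does not state one
) -> float:
    """ECE over equal-width bins: sum of (n_b / N) * |acc_b - conf_b|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin running totals: count, summed confidence, number correct.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hits = [0] * n_bins
    for c, ok in zip(confidences, correct):
        # Map a confidence in [0, 1] to a bin; c == 1.0 falls in the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += c
        hits[b] += int(ok)

    ece = 0.0
    for cnt, conf_sum, hit in zip(counts, conf_sums, hits):
        if cnt:  # empty bins contribute nothing
            ece += (cnt / n) * abs(hit / cnt - conf_sum / cnt)
    return ece


def accuracy_at_coverage(
    confidences: Sequence[float],
    correct: Sequence[bool],
    coverage: float,
) -> float:
    """Accuracy on the most confident `coverage` fraction of predictions."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    if len(confidences) == 0:
        raise ValueError("need at least one prediction")
    ranked = sorted(zip(confidences, correct), key=lambda p: p[0], reverse=True)
    keep = max(1, round(coverage * len(ranked)))
    return sum(int(ok) for _, ok in ranked[:keep]) / keep


# Toy inputs (hypothetical, not results from the paper).
toy_conf = [0.9, 0.9, 0.6, 0.6]
toy_correct = [True, True, True, False]
toy_ece = expected_calibration_error(toy_conf, toy_correct)
toy_acc_half = accuracy_at_coverage(toy_conf, toy_correct, 0.5)
```

Sweeping `coverage` from 1.0 downward traces a confidence–coverage curve; a well-calibrated model should show accuracy rising as low-confidence predictions are withheld.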

Journal Info

IAES International Journal of Artificial Intelligence (IJ-AI)

Abbrev

IJAI

Publisher

Institute of Advanced Engineering and Science

Subject

Computer Science & IT Engineering

Description

IAES International Journal of Artificial Intelligence (IJ-AI) publishes articles in the field of artificial intelligence (AI). The scope covers all areas of artificial intelligence and their applications, including the following topics: neural networks; fuzzy logic; simulated biological evolution algorithms (like ...

Large language models for pattern recognition in text data
