While conventional optical character recognition (OCR) systems can digitize text, they struggle with diverse handwriting styles, noisy inputs, and unstructured layouts, which limits their effectiveness. This study proposes the enhanced large language model whisperer (ELLMW), a vision–language framework for accurate text extraction (TE) from fully handwritten scripts. The methodology integrates advanced preprocessing (noise reduction, binarization, and skew correction), deep learning–based handwriting recognition using a convolutional neural network–long short-term memory (CNN–LSTM) model, and LLM-based post-correction to produce context-aware, structurally coherent output. The system converts scanned images, portable document format (PDF) files, and irregularly formatted answer sheets into machine-readable text while automatically correcting errors in spelling, grammar, and layout. Experimental evaluation on a curated dataset of handwritten examination answer scripts (HEAS) shows that ELLMW achieves 97.8% accuracy, a 1.04% character error rate (CER), and a 3.24% word error rate (WER), outperforming widely used OCR tools including Tesseract, EasyOCR, Google Cloud Vision (GCV), PaddleOCR, ABBYY FineReader, and Transym OCR. These results highlight the model's robustness across varying handwriting styles, noisy backgrounds, and complex document structures.
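The reported character error rate (CER) and word error rate (WER) follow the standard edit-distance definitions: the Levenshtein distance between the recognized text and the ground-truth transcript, normalized by the length of the reference in characters or words, respectively. The sketch below is a minimal illustration of these metrics, not the authors' evaluation code; the example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences via dynamic programming."""
    m, n = len(ref), len(hyp)
    prev = list(range(n + 1))  # distances for an empty reference prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution (or match)
        prev = cur
    return prev[n]

def cer(reference, hypothesis):
    """Character error rate: char-level edit distance / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return edit_distance(ref_words, hyp_words) / len(ref_words)

# Hypothetical OCR output vs. ground truth:
print(cer("abcd", "abxd"))              # one substitution over 4 chars -> 0.25
print(wer("the cat sat", "the cat sit"))
```

A CER of 1.04%, as reported for ELLMW, means roughly one character-level edit per hundred reference characters.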
Copyright © 2026