Venkatesh, Dhivya
Unknown Affiliation

Published: 1 document

ELLMW: an enhanced vision–language model for reliable text extraction from manually composed scripts
Venkatesh, Dhivya; Sivaraj, Brintha Rajakumari
International Journal of Reconfigurable and Embedded Systems (IJRES), Vol. 15, No. 1, March 2026
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/ijres.v15.i1.pp194-203

Abstract

While conventional optical character recognition (OCR) systems can digitize text, they struggle with diverse handwriting styles, noisy inputs, and unstructured layouts, limiting their effectiveness. This study proposes the enhanced large language model whisperer (ELLMW), a vision–language framework for accurate text extraction (TE) from fully handwritten scripts. The methodology integrates advanced preprocessing (noise reduction, binarization, and skew correction), deep learning–based handwriting recognition using a convolutional neural network–long short-term memory (CNN–LSTM) model, and LLM-based post-correction to ensure context-aware and structurally coherent outputs. The system converts scanned images, portable document formats (PDFs), and irregularly formatted answer sheets into machine-readable text, while automatically correcting errors in spelling, grammar, and layout. Experimental evaluation on a curated dataset of handwritten examination answer scripts (HEAS) demonstrates that ELLMW achieves 97.8% accuracy, a 1.04% character error rate (CER), and a 3.24% word error rate (WER), outperforming widely used OCR tools including Tesseract, EasyOCR, Google Cloud Vision (GCV), PaddleOCR, ABBYY FineReader, and Transym OCR. The results highlight the model's robustness across varying handwriting styles, noisy backgrounds, and complex document structures.
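The reported CER and WER figures follow the standard definitions: edit distance between the recognized text and the reference, normalized by the reference length, computed over characters or words respectively. The paper does not specify its evaluation code; the snippet below is a minimal, illustrative sketch of how these two metrics are conventionally computed, using a plain dynamic-programming Levenshtein distance.

```python
def levenshtein(a, b):
    """Edit distance between two sequences via dynamic programming."""
    prev = list(range(len(b) + 1))          # distances for empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]                           # deleting all of a's prefix
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,             # deletion
                           cur[j - 1] + 1,          # insertion
                           prev[j - 1] + (ca != cb)))  # substitution/match
        prev = cur
    return prev[-1]

def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return levenshtein(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return levenshtein(ref_words, hyp_words) / len(ref_words)

# Illustrative example (not from the paper's dataset):
ref = "the cat sat"
hyp = "the cot sat"
print(f"CER = {cer(ref, hyp):.4f}, WER = {wer(ref, hyp):.4f}")
```

A lower CER than WER, as reported for ELLMW (1.04% vs. 3.24%), is typical: a single wrong character counts as one error out of many characters but turns an entire word wrong.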