Prakash, Puneeth
Unknown Affiliation

Published: 2 Documents

Articles
CRNN model for text detection and classification from natural scenes
Prakash, Puneeth; Yeliyur Hanumanthaiah, Sharath Kumar; Bannur Mayigowda, Somashekhar
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 1: March 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i1.pp839-849

Abstract

In the emerging field of computer vision, text recognition in natural settings remains a significant challenge due to variables such as font, text size, and background complexity. This study introduces a method for the automatic detection and classification of cursive text in multiple languages (English, Hindi, Tamil, and Kannada) using a deep convolutional recurrent neural network (CRNN). The architecture combines convolutional neural networks (CNN) and long short-term memory (LSTM) networks for effective spatial and temporal learning. We employed pre-trained CNN models such as VGG-16 and ResNet-18 for feature extraction and evaluated their performance. The method outperformed existing techniques, achieving accuracies of 95.0%, 96.3%, and 96.2% on ICDAR 2015, ICDAR 2017, and a custom dataset (PDT2023), respectively. The findings not only push the boundaries of text detection technology but also offer promising prospects for practical applications.
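The CNN-plus-LSTM pipeline described in this abstract can be sketched roughly as follows. This is a minimal, hypothetical PyTorch sketch, not the paper's implementation: the actual backbone (VGG-16 or ResNet-18), layer sizes, character set, and the name `MiniCRNN` are all illustrative assumptions, since the abstract gives no such details.

```python
import torch
import torch.nn as nn

class MiniCRNN(nn.Module):
    """Illustrative CRNN: a small CNN extracts a feature map, which is read
    left-to-right as a sequence by a bidirectional LSTM. All dimensions are
    assumptions, not taken from the paper."""

    def __init__(self, num_classes=80, hidden=128):
        super().__init__()
        # CNN backbone (a stand-in for the pre-trained VGG-16/ResNet-18
        # feature extractors mentioned in the abstract)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Bidirectional LSTM over the width axis of the feature map
        self.rnn = nn.LSTM(64 * 8, hidden, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(hidden * 2, num_classes)

    def forward(self, x):             # x: (batch, 1, 32, W) grayscale crops
        f = self.cnn(x)               # (batch, 64, 8, W/4)
        f = f.permute(0, 3, 1, 2)     # (batch, W/4, 64, 8): width as time
        f = f.flatten(2)              # (batch, W/4, 512)
        out, _ = self.rnn(f)          # (batch, W/4, 2*hidden)
        return self.fc(out)           # per-timestep character-class logits

model = MiniCRNN()
logits = model(torch.randn(2, 1, 32, 128))
print(logits.shape)  # torch.Size([2, 32, 80]): 32 time steps, 80 classes
```

In a full system the per-timestep logits would typically be trained with a CTC loss and decoded into character strings, which matches the sequence-recognition role the LSTM plays in this architecture.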
A comparative analysis of optical character recognition models for extracting and classifying texts in natural scenes
Prakash, Puneeth; Yeliyur Hanumanthaiah, Sharath Kumar
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 2: April 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i2.pp1290-1301

Abstract

This research introduces the prior-guided dynamic tunable network (PDTNet), an efficient model designed to improve the detection and recognition of text in complex environments. PDTNet's architecture combines advanced preprocessing techniques and deep learning methods to enhance accuracy and reliability. The study comprehensively evaluates various optical character recognition (OCR) models, demonstrating PDTNet's superior adaptability, accuracy, and reliability across different environmental conditions. The results emphasize the need for a context-aware approach to selecting OCR models for specific applications. This research advocates for the development of hybrid OCR systems that leverage multiple models, aiming to achieve higher accuracy and adaptability in practical applications. With a precision of 85%, the proposed model showed an improvement of 1.7% over existing state-of-the-art models. These findings contribute valuable insights into addressing the technical challenges of text extraction and optimizing OCR model selection for real-world scenarios.