Optical character recognition (OCR) converts documents in diverse formats into editable and searchable data. Recognizing Telugu characters with OCR is challenging because of the script’s compound characters. Handwritten Telugu text is harder still, owing to the large number of characters, their visual similarity, and their overlapping forms. To handle overlapping characters, we implemented a segmentation algorithm that separates them efficiently, which in turn improves the model’s accuracy. Feature extraction is a crucial phase in recognizing a large character set, especially characters that are similar in appearance. We therefore employ a lightweight ResNet-34 model, whose residual connections allow the network to be deep without accuracy degrading as depth increases. We achieve a word-level recognition rate of 81.5%. In addition, the model requires fewer parameters than its counterpart, Inception V1, making it computationally efficient.
Copyright © 2024