An Enhanced U-Net-based Approach for Sinhala Document Layout Analysis
Hulathdoowage S.K.D; Kumara B.T.G.S
Journal of Computers and Digital Business Vol. 4 No. 3 (2025)
Publisher: PT. Delitekno Media Madiri

DOI: 10.56427/jcbd.v4i3.767

Abstract

Document layout analysis plays a critical role in the digitization pipeline by identifying, segmenting, and classifying structural elements within documents to support accurate information extraction. This task becomes increasingly challenging when dealing with heterogeneous layouts that contain paragraphs, tables, figures, mathematical expressions, and other visual components. For Sinhala, a low-resource language with limited annotated datasets and specialized models, research in this area remains sparse. To address this gap, this study proposes an enhanced U-Net architecture that integrates convolutional neural networks with vision transformer blocks to improve semantic segmentation performance. The model leverages convolutional layers to capture fine-grained local features while employing transformer components to model long-range dependencies and global contextual relationships across document regions. A manually annotated dataset of 750 Sinhala document images covering 14 distinct element categories was developed to train and evaluate the model. Experimental results demonstrate that the proposed architecture significantly outperforms standard U-Net and attention U-Net variants, achieving 93.06% pixel accuracy, 64.37% mean IoU, and 77.32% mean F1-score. This research represents the first comprehensive document layout analysis framework tailored specifically for Sinhala documents and provides a strong foundation for future digitization, archival, and text processing initiatives within Sri Lankan academic, governmental, and cultural institutions.
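
The three metrics reported in the abstract (pixel accuracy, mean IoU, and mean F1-score) can all be derived from a per-class confusion matrix over pixels. The following is a minimal sketch of those standard definitions, not the authors' evaluation code: the function names, the toy labels, and the choice to skip classes that never occur are assumptions made for illustration. The abstract does not state how the authors averaged over the 14 categories.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (pixel accuracy, mean IoU, mean F1) from a confusion matrix.

    Assumption: classes absent from both labels and predictions are
    skipped when averaging; other conventions exist.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ious, f1s = [], []
    for i in range(n):
        tp = cm[i][i]                                # pixels correctly labeled i
        fn = sum(cm[i]) - tp                         # true i, predicted otherwise
        fp = sum(cm[r][i] for r in range(n)) - tp    # predicted i, true otherwise
        if tp + fp + fn == 0:
            continue                                 # class never appears
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return correct / total, sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example: 8 flattened pixels, 3 hypothetical element classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
pixel_acc, mean_iou, mean_f1 = segmentation_metrics(
    confusion_matrix(y_true, y_pred, num_classes=3)
)
print(f"pixel accuracy={pixel_acc:.4f}, mean IoU={mean_iou:.4f}, mean F1={mean_f1:.4f}")
```

Even in this toy case the mean IoU falls well below the pixel accuracy. Pixel accuracy weights every pixel equally, so frequent classes dominate it, whereas mean IoU and mean F1 weight every class equally and are pulled down by small or rare classes. This may partly explain the gap between the 93.06% pixel accuracy and the 64.37% mean IoU reported above, though the abstract itself does not discuss it.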