Zeindri Abu Umar
Institut Agama Islam Negeri Sultan Amai Gorontalo

Published: 1 document
Articles

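Optimalisasi Teknologi OCR dalam Digitalisasi Manuskrip Hadis: Studi Akurasi dan Tantangan Linguistik Arab Klasik [Optimizing OCR Technology in the Digitization of Hadith Manuscripts: A Study of Accuracy and the Linguistic Challenges of Classical Arabic]
Zeindri Abu Umar; Mahmud Yunus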
Hamidah: Jurnal Ilmu Hadis Vol. 1 No. 1 (2025): Hadith and Digitalization
Publisher: Yayasan Albahriah Jamiah Indonesia

DOI: 10.64691/6vn20070

Abstract

The digitization of hadith manuscripts is a strategic step in preserving and disseminating the Islamic intellectual heritage. However, the process faces significant challenges in recognizing classical Arabic text accurately with Optical Character Recognition (OCR) technology. The complexity of Arabic orthography, the diversity of calligraphic styles, and the diacritics and ligatures typical of classical manuscripts often cause character recognition errors, which reduce the reliability of the digitized text. This study has three aims: to analyze the accuracy of several OCR systems in reading hadith manuscripts, to identify the main linguistic constraints that distort letter and word recognition, and to develop an optimization model that integrates linguistic and computational methods to improve system performance. Using a mixed-methods approach, the study combines quantitative analysis of the output of several popular OCR platforms with qualitative analysis of the linguistic error patterns that appear in classical Arabic manuscripts. The results show that OCR accuracy for hadith manuscripts varies widely, from 68% to 91%, depending on image quality, calligraphic style, and the system’s ability to recognize the morphological forms and punctuation typical of classical Arabic. The most common errors occur in graphemically similar letters, such as “س” and “ش”, and in fully vocalized words or words with complex iʻrāb markings. An optimization model that combines machine learning with Arabic morphological analysis raised accuracy to as much as 94% while speeding up post-correction of the digital text. In conclusion, digitizing hadith manuscripts requires an approach that integrates OCR technology with an understanding of classical Arabic linguistics in order to ensure the validity, efficiency, and sustainability of the digital preservation of religious manuscripts.
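
The abstract does not say how accuracy was measured. A common convention for OCR evaluation is character-level accuracy, defined as one minus the character error rate (edit distance divided by reference length). The Python sketch below illustrates that convention only; it is not the authors' method, and the function names `strip_diacritics`, `edit_distance`, and `char_accuracy` are hypothetical. It also shows how scoring with and without diacritics can give different figures for the same OCR output.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (e.g. Arabic harakat) so only base letters remain."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str, ignore_diacritics: bool = False) -> float:
    """Character accuracy = 1 - (edit distance / reference length), floored at 0.

    Assumption: this is one common definition, not necessarily the one
    used in the study.
    """
    if ignore_diacritics:
        ref, hyp = strip_diacritics(ref), strip_diacritics(hyp)
    if not ref:
        raise ValueError("reference text must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


if __name__ == "__main__":
    # A single sin/shin confusion in a four-letter word.
    print(char_accuracy("سلام", "شلام"))  # 0.75
```

Under this definition, one confusion between “س” and “ش” in a four-letter word already lowers the score to 0.75, which illustrates why graphemically similar letters can weigh heavily on reported accuracy.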