IAES International Journal of Artificial Intelligence (IJ-AI)
Vol 2, No 3: September 2013

Unknown Word Detection via Syntax Analyzer

Soe Lai Phyue (University of Computer Studies, Mandalay)



Article Info

Publish Date
01 Jun 2013

Abstract

A knowledge resource is the central data repository for Natural Language Processing (NLP) applications, and the development of such applications depends largely on the coverage of the available knowledge resources. The multipurpose Myanmar Language Lexico-conceptual Knowledge Resource (ML2KR) and a Myanmar function-tagged corpus were developed as initial resources using a semiautomatic approach. ML2KR consists of a Myanmar WordNet, a Myanmar-English bilingual computational lexicon, and a morphological processor. Myanmar is a morphologically rich, agglutinative language, so Myanmar text usually has to be segmented before further processing. Segmentation faces two main problems: word ambiguity, where a word has more than one meaning, and unknown word occurrence, where a word is not in the lexicon. This paper addresses the unknown word problem. To detect new, unrestricted character patterns of words, a character-based parsing syntax analyzer is built using a Context Free Grammar (CFG). First, unknown words are treated as names by Named Entity Recognition with a forward and backward rule-based approach. If a name does not agree with the syntax analyzer, all possible unknown words are verified in order to update the lexicon and the Myanmar WordNet.

DOI: http://dx.doi.org/10.11591/ij-ai.v2i3.1802
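
To make the notion of "unknown word occurrence" concrete, the following is a minimal sketch of how unmatched spans can surface during lexicon-based segmentation. It is not the paper's CFG-based syntax analyzer or its Named Entity Recognition rules; it swaps in plain greedy longest-match lookup as an illustration. The function name `segment`, the toy lexicon, and the Latin-script input string are all hypothetical and chosen only for readability.

```python
def segment(text, lexicon, max_len=None):
    """Greedy longest-match segmentation against a lexicon.

    Returns a list of (span, is_known) pairs. Characters that match no
    lexicon entry are accumulated into a single unknown-word candidate.
    This is an illustrative assumption, not the method from the paper.
    """
    if max_len is None:
        max_len = max((len(w) for w in lexicon), default=1)
    tokens = []
    unknown = ""
    i = 0
    while i < len(text):
        match = None
        # Try the longest possible span first, then shrink.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            # No lexicon entry starts here: extend the unknown span.
            unknown += text[i]
            i += 1
        else:
            if unknown:
                tokens.append((unknown, False))
                unknown = ""
            tokens.append((match, True))
            i += len(match)
    if unknown:
        tokens.append((unknown, False))
    return tokens


# Hypothetical toy data standing in for unsegmented text and a lexicon.
toy_lexicon = {"the", "cat", "sat"}
tokens = segment("thecatzorpsat", toy_lexicon)
unknown_candidates = [span for span, known in tokens if not known]
print(unknown_candidates)
```

In a pipeline like the one the abstract describes, candidates collected this way would then be checked by further stages (name recognition, then syntactic verification) before being added to the lexicon.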

Copyrights © 2013






Journal Info

Abbrev

IJAI

Subject

Computer Science & IT Engineering

Description

IAES International Journal of Artificial Intelligence (IJ-AI) publishes articles in the field of artificial intelligence (AI). The scope covers all areas of artificial intelligence and their applications, including the following topics: neural networks; fuzzy logic; simulated biological evolution algorithms (like ...