ComEngApp : Computer Engineering and Applications Journal
Vol 3 No 2 (2014)

Words Stemming Based on Structural and Semantic Similarity

Mohammad Hassan Dianati (Shiraz University)
Mohammad Hadi Sadreddini (Shiraz University)
Amir Hossein Rasekh (Shiraz University)
Seyed Mostafa Fakhrahmad (Shiraz University)
Hossein Taghi-Zadeh (Shiraz University)



Article Info

Publish Date
23 Jun 2014

Abstract

Word stemming is an important task in natural language processing and information retrieval. Most existing stemming methods are language-dependent, so each stemmer is applicable only to particular languages. This paper therefore proposes a stemming method that aims to be language-independent. The proposed stemmer uses a bilingual dictionary and first clusters all of the words in the dictionary according to their structural and semantic similarity. The stem of a new word is then found by making use of the previously formed clusters. To evaluate the proposed scheme, stemming is performed on both Persian and English. The results are encouraging and indicate that the proposed method performs well compared with its counterparts.
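
The abstract does not say how structural or semantic similarity is measured, so the following Python sketch is only one possible reading of the pipeline, not the authors' implementation. It assumes that structural similarity is the shared-prefix ratio of two words, that semantic similarity is the Jaccard overlap of their translation sets in the bilingual dictionary, and that the stem is the common prefix of the best-matching cluster. The function names, the thresholds, and the toy dictionary (with placeholder translation tokens) are all hypothetical.

```python
from os.path import commonprefix  # character-wise common prefix of strings


def structural_similarity(a, b):
    """Assumed measure: shared-prefix length relative to the longer word."""
    return len(commonprefix([a, b])) / max(len(a), len(b))


def semantic_similarity(a, b, dictionary):
    """Assumed measure: Jaccard overlap of the two words' translation sets."""
    ta, tb = dictionary[a], dictionary[b]
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster_words(dictionary, struct_min=0.5, sem_min=0.3):
    """Greedily group dictionary words that pass both (hypothetical) thresholds."""
    clusters = []
    # Shorter words first, so each cluster's first member is its shortest form.
    for word in sorted(dictionary, key=len):
        for cluster in clusters:
            rep = cluster[0]
            if (structural_similarity(word, rep) >= struct_min
                    and semantic_similarity(word, rep, dictionary) >= sem_min):
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def stem(word, clusters, struct_min=0.5):
    """Stem a new word using the structurally closest existing cluster."""
    best, best_score = None, 0.0
    for cluster in clusters:
        score = max(structural_similarity(word, member) for member in cluster)
        if score > best_score:
            best, best_score = cluster, score
    if best is None or best_score < struct_min:
        return word  # no sufficiently similar cluster: leave the word unchanged
    return commonprefix(best + [word])


# Toy bilingual dictionary: word -> set of placeholder translation tokens.
toy_dictionary = {
    "connect": {"t_link"},
    "connected": {"t_link"},
    "connection": {"t_link", "t_relation"},
    "contest": {"t_match"},
}

clusters = cluster_words(toy_dictionary)
print(clusters)
print(stem("connecting", clusters))
```

Only the clustering step consults the dictionary here. A new word is matched on structure alone, because a word absent from the dictionary has no translations to compare; whether the paper handles unseen words this way is not stated in the abstract.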

Copyrights © 2014






Journal Info

Abbrev

comengapp

Subject

Computer Science & IT Engineering

Description

ComEngApp-Journal (a collaboration between University of Sriwijaya, Kirklareli University and IAES) is an international forum where scientists and engineers involved in all aspects of computer engineering and technology can publish high-quality, refereed papers. This Journal is an open access journal ...