Wan Nural Jawahir Wan Yussof
Universiti Malaysia Terengganu

Published: 1 document
Articles

A computational analysis of short sentences based on ensemble similarity model
Arifah Che Alhadi; Aziz Deraman; Masita Masila Abdul Jalil; Wan Nural Jawahir Wan Yussof; Rosmayati Mohemad
International Journal of Electrical and Computer Engineering (IJECE) Vol 9, No 6: December 2019
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v9i6.pp5386-5394

Abstract

The rapid development of the Internet, along with the wide use of social media applications, produces a huge volume of unstructured data in short text form, such as tweets, text snippets and instant messages. This form of data rarely contains repeated words. That poses a challenge for sentence similarity analysis, because standard text similarity models rely solely on word occurrence counts and therefore often yield unreliable similarity values. In addition, the use of abbreviations, acronyms, slang, smileys, jargon, symbols and non-standard short forms makes similarity analysis more difficult. To address this, an extended ensemble similarity model is proposed. An experimental study was conducted using datasets of English short sentences. The findings are encouraging: the proposed approach improves the similarity values obtained for short sentences.
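
The limitation described in the abstract can be illustrated with a minimal sketch. The abstract does not state which measures the authors combine, so everything below is an assumption made for illustration only: the function names, the choice of a character-trigram measure as the second component, and the unweighted mean used as the "ensemble" are hypothetical and are not the paper's model.

```python
import math
from collections import Counter


def cosine_word_count(a: str, b: str) -> float:
    """Cosine similarity between raw word-count vectors of two sentences."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only words present in both sentences contribute to the dot product.
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def char_trigram_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of character trigrams (hypothetical second measure)."""
    def grams(s: str) -> set:
        s = s.lower()
        return {s[i:i + 3] for i in range(len(s) - 2)}

    ga, gb = grams(a), grams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def ensemble_similarity(a: str, b: str) -> float:
    """Hypothetical ensemble: unweighted mean of the two measures above."""
    return (cosine_word_count(a, b) + char_trigram_jaccard(a, b)) / 2


if __name__ == "__main__":
    s1 = "see you tomorrow"
    s2 = "c u tmrw"  # same intent, written with non-standard short forms
    print(cosine_word_count(s1, s2))    # no shared words
    print(ensemble_similarity(s1, s2))  # picks up some character overlap
```

For this pair the word-count cosine is zero because the two sentences share no tokens, while the combined score is small but non-zero. This is only meant to show why a model based purely on word occurrence struggles with short, informal text; it says nothing about how well the paper's own model performs.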