Fadzil Hassan, Mohd
Unknown Affiliation

Published: 1 document
Articles

Malay phoneme-based subword news headline generator for low-resource language
Tsann Phua, Yeong; Hooi Yew, Kwang; Fadzil Hassan, Mohd; Yok Wooi, Matthew Teow
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 4: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i4.pp4965-4975

Abstract

The rapid growth of technology has significantly increased the number of news articles available to readers. A news headline plays an essential role in attracting readers. Traditionally, crafting the headline is a manual task at the news desk. The motivation of this paper is to address the issues faced in low-resource languages such as Malay. The main contribution of this paper is a new hybrid model based on extractive and abstractive text summarization with the integration of a geographical linguistics model; a Malay phoneme-based subword embedding has been developed to address the complex morphology of Malay in computational linguistics applications. The experiment involves various sequence-to-sequence (seq2seq) models to generate Malay news headlines. In addition, the out-of-vocabulary (OOV) rate of the models is assessed. In the experiment, the proposed hybrid text summarization model shows a significant improvement over the baseline models, exceeding 11.00 in ROUGE-1, 4.00 in ROUGE-2, and 11.00 in ROUGE-L. The proposed model can reduce the OOV rate to below 15%.
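
The two kinds of figures quoted in the abstract, ROUGE-N and the OOV rate, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `rouge_n_f1` and `oov_rate`, the whitespace tokenization, and the toy Malay sentences are all assumptions made for illustration. Scores here are on a 0 to 1 scale, whereas the abstract's figures appear to be on a 0 to 100 scale.

```python
from collections import Counter


def oov_rate(tokens, vocabulary):
    """Fraction of tokens that do not appear in the vocabulary."""
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in vocabulary)
    return unknown / len(tokens)


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap between two token lists."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated headline and reference headline (toy data).
generated = "harga minyak naik lagi".split()
reference = "harga minyak dunia naik".split()
vocabulary = {"harga", "minyak", "naik"}  # assumed model vocabulary

print(rouge_n_f1(generated, reference, n=1))  # unigram overlap
print(rouge_n_f1(generated, reference, n=2))  # bigram overlap
print(oov_rate(generated, vocabulary))        # share of unknown tokens
```

A subword vocabulary, such as the phoneme-based one the abstract describes, would lower the OOV rate because an unseen word can still be covered by known subword units; the sketch above only counts whole-word misses.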