International Journal of Electrical and Computer Engineering
Vol 14, No 6: December 2024

Homonym and polysemy approaches with morphology extraction in weighting terms for Indonesian to English machine translation

Harjo, Budi
Muljono, Muljono
Abdullah, Rachmad



Article Info

Publish Date
01 Dec 2024

Abstract

Homonymy and polysemy can cause errors when translating from a source language into a target language, for example from Indonesian into English. In Indonesian, homonymy can arise from either a lemma or a morphological process. For example, the word beruang can denote the animal beruang (bear) or the affixed verb form ber+uang (has/have money). Polysemy can also cause translation errors, because a word can carry both a literal meaning and a figurative meaning. For example, in the terms bunga melati (jasmine flower) and bunga hati (lover), bunga does not only mean flower. Machine translation (MT) methods therefore need to capture homonym and polysemy features at the level of both words and phrases. This research proposes homonym and polysemy approaches with morphology extraction in weighting terms for Indonesian to English MT. First, morphology extraction detects sentences that contain words with prefixes, lemmas, and suffixes. Then, word similarity measurement extracts homonyms and polysemes in the form of unigrams and bigrams using bidirectional encoder representations from transformers (BERT) embeddings, named entity recognition (NER), synonym-based term expansion, and semantic similarity. Neural machine translation is used for the translation process itself.
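
The beruang example can be made concrete with a small sketch of affix-based segmentation. This is not the authors' implementation: the affix lists, the tiny lexicon, and the function names `segment` and `is_morphological_homonym` are all hypothetical and chosen only for illustration. The idea shown is that a word is a candidate morphological homonym when it can be analysed both as a bare lemma and as prefix + lemma.

```python
# Toy resources: illustrative assumptions, not the resources used in the paper.
PREFIXES = ("ber", "me", "di", "ter")
SUFFIXES = ("kan", "an", "i")
LEXICON = {"beruang": "bear", "uang": "money", "bunga": "flower", "makan": "eat"}


def segment(word):
    """Return every (prefix, lemma, suffix) analysis whose lemma is in LEXICON."""
    analyses = []
    # The empty string stands for "no prefix" / "no suffix".
    for prefix in ("",) + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("",) + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            lemma = rest[:len(rest) - len(suffix)] if suffix else rest
            if lemma in LEXICON:
                analyses.append((prefix, lemma, suffix))
    return analyses


def is_morphological_homonym(word):
    """A word with more than one valid analysis is a homonym candidate."""
    return len(segment(word)) > 1


if __name__ == "__main__":
    for w in ("beruang", "makan", "bunga"):
        print(w, segment(w), is_morphological_homonym(w))
```

In this sketch beruang receives two analyses (the bare lemma beruang and ber + uang), so it would be flagged for disambiguation, while makan and bunga each receive one. Choosing between the analyses in context is a separate step, which the abstract assigns to the similarity measurement with BERT embeddings, and it is not modelled here.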

Copyrights © 2024






Journal Info

Abbrev

IJECE

Publisher

Subject

Computer Science & IT; Electrical & Electronics Engineering

Description

International Journal of Electrical and Computer Engineering (IJECE, ISSN: 2088-8708, a Scopus-indexed journal, SNIP: 1.001; SJR: 0.296; CiteScore: 0.99; SJR and CiteScore Q2 in both Electrical & Electronics Engineering and Computer Science) is the official publication of the Institute of ...