Building of Informatics, Technology and Science
Vol 6 No 2 (2024): September 2024

Optimization of Hyperparameter Combinations and Corpus Augmentation in Neural Machine Translation from Indonesian to Bengkulu Malay

Soyusiawaty, Dewi (Unknown)



Article Info

Publish Date
30 Sep 2024

Abstract

Neural Machine Translation (NMT) with an attention mechanism has become an effective approach to improving the quality of cross-language translation. However, applying attention-based NMT to regional or minority languages remains challenging, particularly for Bengkulu Malay, a variant of Malay used in Bengkulu Province, Indonesia. This research aims to improve the accuracy of translation from Indonesian to Bengkulu Malay by optimizing hyperparameter combinations in attention-based NMT models. The method consists of experiments with various combinations of batch size, dataset size, and dropout rate. Translation quality is evaluated with the BLEU metric, and corpus augmentation is used to obtain a larger corpus. The experimental results indicate that translation accuracy can be improved by selecting an optimal hyperparameter combination. A larger dataset yields better performance than a smaller one. A batch size of 16 yields better results than batch sizes of 32 and 64, especially with the larger dataset. Additionally, a dropout rate of 0.8 tends to perform better than dropout rates of 0.2 and 0.5. Regarding the number of epochs, performance improves up to a certain point (approximately 30 epochs), after which further training tends to cause overfitting on the training data. This research contributes to the development of machine translation for Bengkulu Malay and other regional languages. The findings are intended to serve as a foundation for further work on machine translation for minority languages and to improve information accessibility for diverse language communities in Indonesia.
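The hyperparameter search described in the abstract can be pictured as a grid over batch size, dropout rate, and dataset size. The following is a minimal sketch under stated assumptions, not the author's implementation: the function names `build_grid` and `select_best`, the `"small"`/`"large"` dataset labels, and the `score_fn` callable (which would train one model and return its BLEU score) are all hypothetical. Only the batch sizes and dropout rates come from the abstract.

```python
from itertools import product

# Values named in the abstract. The dataset-size labels are placeholders,
# because the abstract does not state the actual corpus sizes.
BATCH_SIZES = [16, 32, 64]
DROPOUT_RATES = [0.2, 0.5, 0.8]
DATASET_SIZES = ["small", "large"]


def build_grid():
    """Return every combination of the three hyperparameters as a dict."""
    return [
        {"batch_size": b, "dropout": d, "dataset": s}
        for b, d, s in product(BATCH_SIZES, DROPOUT_RATES, DATASET_SIZES)
    ]


def select_best(grid, score_fn):
    """Return (configuration, score) for the highest-scoring configuration.

    score_fn is a hypothetical callable that trains and evaluates one
    configuration and returns a single number such as a BLEU score.
    """
    scored = [(score_fn(cfg), cfg) for cfg in grid]
    best_score, best_cfg = max(scored, key=lambda pair: pair[0])
    return best_cfg, best_score
```

In practice `score_fn` would wrap the full training and BLEU evaluation loop, so an exhaustive grid of this size means 18 separate training runs per epoch setting.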

Copyrights © 2024






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed to maintain quality. ...