Alfina, Ika
Faculty of Computer Science, Universitas Indonesia

Published: 2 Documents
Javanese part-of-speech tagging using cross-lingual transfer learning Enrique, Gabriel; Alfina, Ika; Yulianti, Evi
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 3: September 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i3.pp3498-3509

Abstract

Large, publicly available POS-tagging datasets do not exist for every language. One such language is Javanese, a local language of Indonesia that is considered low-resource. This research examines the effectiveness of cross-lingual transfer learning for Javanese POS tagging by fine-tuning state-of-the-art Transformer-based models (IndoBERT, mBERT, and XLM-RoBERTa) first on a higher-resource source language (Indonesian, English, Uyghur, Latin, or Hungarian) and then fine-tuning them again on Javanese as the target language. We found that models trained with cross-lingual transfer learning outperform their counterparts trained without it, improving accuracy by 14.3%–15.3% over LSTM-based models and by 0.21%–3.95% over Transformer-based models. Our results show that the most accurate Javanese POS tagger is XLM-RoBERTa fine-tuned in two stages (first on Indonesian as the source language, then on Javanese as the target language), achieving an accuracy of 87.65%.
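The two-stage recipe described in the abstract (fine-tune on a higher-resource source language, then continue on the low-resource target) can be sketched with a deliberately tiny stand-in model — a word→tag frequency table rather than a Transformer. All words, tags, and corpora below are invented placeholders, not the paper's data; the sketch only illustrates why the source-language stage helps on words the target corpus never shows.

```python
# Toy sketch of two-stage cross-lingual transfer for POS tagging.
# Stage 1 "fine-tunes" on a higher-resource source language; stage 2
# continues on the low-resource target. The model is just a word->tag
# frequency table standing in for a Transformer.
from collections import Counter, defaultdict

def train(model, corpus):
    """Accumulate word->tag counts from (word, tag) pairs."""
    for word, tag in corpus:
        model[word][tag] += 1

def predict(model, word, fallback="NOUN"):
    """Tag a word by its most frequent tag, or a fallback if unseen."""
    return model[word].most_common(1)[0][0] if model[word] else fallback

# Invented placeholder data (not the paper's corpora).
source_corpus = [("makan", "VERB"), ("buku", "NOUN"), ("makan", "VERB")]
target_corpus = [("buku", "NOUN")]          # tiny target-language sample
test_set      = [("makan", "VERB"), ("buku", "NOUN")]

def accuracy(model):
    return sum(predict(model, w) == t for w, t in test_set) / len(test_set)

# Target-only baseline: never saw "makan", so it falls back and errs.
baseline = defaultdict(Counter)
train(baseline, target_corpus)

# Two-stage transfer: source-language stage first, then the target stage.
transfer = defaultdict(Counter)
train(transfer, source_corpus)
train(transfer, target_corpus)

print(accuracy(baseline), accuracy(transfer))  # 0.5 1.0
```

The transfer model tags "makan" correctly only because the source stage supplied it; the same intuition motivates warm-starting a Transformer on Indonesian before fine-tuning on Javanese.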
Poetry Generation for Indonesian Pantun: Comparison Between SeqGAN and GPT-2 Emmanuella Anggi Siallagan; Ika Alfina
Jurnal Ilmu Komputer dan Informasi Vol. 16 No. 1 (2023): Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information)
Publisher : Faculty of Computer Science - Universitas Indonesia

DOI: 10.21609/jiki.v16i1.1113

Abstract

A pantun is a traditional Malay poem consisting of four lines: two opening lines and two lines conveying the message. The final word of each line forms an ABAB rhyme pattern. In this work, we compare the performance of Sequence Generative Adversarial Nets (SeqGAN) and Generative Pre-trained Transformer 2 (GPT-2) in automatically generating Indonesian pantun. We also created the first publicly available Indonesian pantun dataset, consisting of 7.8K pantun. We evaluated how well each model produced pantun by its lexical richness and its well-formedness, introducing an evaluation with two aspects: structure and rhyme. GPT-2 outperforms SeqGAN by a margin of 29.40% in forming the structure and by 35.20% in producing rhyming patterns, with a 0.04 difference in the lexical richness of the generated pantun.
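The structure-and-rhyme evaluation described in the abstract can be sketched as a simple well-formedness checker. The four-line test and the rhyme heuristic used here (last vowel plus any trailing consonants of each line's final word) are our own simplifications for illustration, not the paper's actual metric.

```python
# Minimal sketch of pantun well-formedness: four lines with an ABAB rhyme.
import re

def line_ending(line):
    """Crude rhyme unit: last vowel + trailing consonants of the last word."""
    word = line.strip().split()[-1].lower()
    match = re.search(r"[aeiou][^aeiou]*$", word)
    return match.group(0) if match else word

def is_well_formed(lines):
    if len(lines) != 4:                       # structure: exactly four lines
        return False
    a1, b1, a2, b2 = (line_ending(l) for l in lines)
    return a1 == a2 and b1 == b2              # rhyme: ABAB pattern

pantun = [
    "Kalau ada sumur di ladang",
    "Boleh kita menumpang mandi",
    "Kalau ada umur panjang",
    "Boleh kita berjumpa lagi",
]
print(is_well_formed(pantun))  # True: "ang"/"i"/"ang"/"i" gives ABAB
```

A generated pantun failing either check (wrong line count or a broken ABAB ending) would count against the model under this kind of evaluation.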