IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
Vol 19, No 2 (2025): April

Exploring the Impact of Back-Translation on BERT's Performance in Sentiment Analysis of Code-Mixed Language Data

Setiono, Nisrina Hanifa
Sari, Yunita



Article Info

Publish Date
30 Apr 2025

Abstract

Social media, particularly Twitter, has become a key platform for communication and opinion sharing, where code-mixing, the blending of multiple languages within a single sentence, is common. In Indonesia, Indonesian-English code-mixing is widely used, especially in urban areas. Sentiment analysis of code-mixed text is challenging for natural language processing (NLP) because the data are informal and most models are trained on formal text. This study applies back-translation to address these challenges and improve BERT-based sentiment analysis. The method is evaluated on the INDONGLISH dataset, which consists of 5,067 labeled tweets. Results show that applying back-translation directly to raw tweets preserves the original meaning and improves model accuracy. In contrast, when back-translation follows monolingual translation, accuracy declines because of semantic distortion. Repeated translation alters sentence structure and sentiment labels, reducing reliability. These findings indicate that each additional translation step risks lowering sentiment analysis accuracy, particularly for code-mixed datasets, which are highly sensitive to linguistic shifts. Back-translation thus proves to be an effective approach for formalizing data while maintaining contextual integrity, enhancing sentiment analysis performance on code-mixed text.
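The two preprocessing variants compared in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract names neither the translation system nor the pivot language, so the Indonesian → English → Indonesian direction, the function names, and the toy word-level lexicons (standing in for a real machine translation model) are all assumptions made here for demonstration.

```python
from typing import Callable, Dict

# A translator is any function that maps a sentence to a sentence.
Translator = Callable[[str], str]


def make_word_translator(lexicon: Dict[str, str]) -> Translator:
    """Build a toy word-by-word translator (hypothetical stand-in for a real MT system).

    Tokens that are missing from the lexicon are passed through unchanged.
    """
    def translate(text: str) -> str:
        return " ".join(lexicon.get(tok, tok) for tok in text.lower().split())
    return translate


def back_translate(text: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Translate into a pivot language and back again (two translation steps)."""
    return from_pivot(to_pivot(text))


def direct_pipeline(tweet: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Variant 1: back-translate the raw code-mixed tweet directly."""
    return back_translate(tweet, to_pivot, from_pivot)


def monolingual_then_back_pipeline(
    tweet: str,
    to_monolingual: Translator,
    to_pivot: Translator,
    from_pivot: Translator,
) -> str:
    """Variant 2: monolingual translation first, then back-translation (three steps)."""
    return back_translate(to_monolingual(tweet), to_pivot, from_pivot)


# Toy lexicons, assumed purely for illustration.
id_to_en = make_word_translator({"filmnya": "movie", "bagus": "good", "banget": "very"})
en_to_id = make_word_translator(
    {"movie": "film", "good": "bagus", "very": "sangat", "worth": "layak", "it": "itu"}
)
# Maps only the English tokens of a code-mixed tweet into Indonesian.
en_words_to_id = make_word_translator({"worth": "layak", "it": "itu"})

tweet = "Filmnya bagus banget worth it"
formalized = direct_pipeline(tweet, id_to_en, en_to_id)
print(formalized)  # film bagus sangat layak itu
```

In this toy run the informal "banget" comes back as the formal "sangat", which mirrors the formalizing effect the abstract attributes to back-translation. The toy lexicons cannot reproduce the semantic distortion reported for the three-step variant; the sketch only makes the difference in the number of translation steps explicit.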

Copyrights © 2025






Journal Info

Abbrev

ijccs

Subject

Computer Science & IT; Control & Systems Engineering

Description

Indonesian Journal of Computing and Cybernetics Systems (IJCCS), published twice a year, provides a forum for the full range of scholarly study. IJCCS focuses on advanced computational intelligence, including the synergetic integration of neural networks, fuzzy logic and evolutionary computation, so ...