Machine translation (MT) has evolved from simple rule-based systems to advanced neural and large-language-model (LLM) frameworks, enabling increasingly human-like translation performance. However, few studies have evaluated these systems on diplomatic texts. Adopting a mixed-methods design, this study compared the ability of three translation systems, namely Google Translate, DeepL, and ChatGPT, to translate diplomatic texts (excerpts from Thailand's governmental address and The Inaugural Address of President Trump). The quantitative analysis used the BLEU and TER metrics; the qualitative analysis employed discourse analysis and diplomatic register analysis. The quantitative results indicate notable differences in performance across the three machine translation systems. Google Translate obtained a BLEU score of 58.59 and a TER of 22.22%, suggesting that while its output retains general meaning and readability, it tends to be more literal and less refined in its lexical and syntactic choices. DeepL achieved a higher BLEU score of 64.67 and a lower TER of 18.52%, demonstrating stronger fluency, grammatical accuracy, and lexical precision. ChatGPT, with a BLEU score of 98.00 and a TER of 4.00%, showed near-human performance, effectively functioning as the reference baseline. The qualitative analysis found that ChatGPT best satisfies the linguistic, pragmatic, and rhetorical requirements of diplomatic translation, followed by DeepL as a strong secondary option, with Google Translate remaining a useful tool for everyday general-purpose translation but unsuitable for formal governmental or international contexts.
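For readers wishing to reproduce this kind of evaluation, BLEU and TER scores of the sort reported above can be computed with standard open tooling. The following is a minimal sketch using the sacrebleu library; the sentence pair is purely illustrative and is not drawn from the study corpus.

```python
import sacrebleu

# Hypothetical example: one system output and one human reference
# (illustrative sentences only, not from the study's diplomatic corpus).
hypotheses = ["The two nations reaffirmed their commitment to lasting peace."]
references = [["Both nations reaffirmed their commitment to lasting peace."]]

# Corpus-level BLEU: n-gram overlap between system output and reference(s).
bleu = sacrebleu.corpus_bleu(hypotheses, references)

# Corpus-level TER: edit operations needed to turn the output into the reference,
# expressed as a percentage (lower is better).
ter = sacrebleu.corpus_ter(hypotheses, references)

print(f"BLEU: {bleu.score:.2f}")
print(f"TER:  {ter.score:.2f}")
```

In practice, the hypothesis list would hold each system's full set of translated segments and the reference list the corresponding human translations, so the corpus-level scores aggregate over the whole test set rather than a single sentence.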