Journal of Technology Informatics and Engineering
Vol. 4 No. 3 (2025): December

Zero-Shot Learning For Multilingual Document Classification In Low-Resource Languages

Orinos, Nasios
Onola, Quedevo
Chistoff, Ong Ben



Article Info

Publish Date
05 Dec 2025

Abstract

Document classification in low-resource languages remains a critical challenge because annotated datasets, language-specific resources, and linguistic tools are scarce. This study investigates the effectiveness of zero-shot learning (ZSL) for multilingual document classification, focusing on three low-resource Southeast Asian languages: Javanese, Sundanese, and Malay. We adopt a zero-shot cross-lingual transfer approach: English labeled data serves as the source domain, and the models are evaluated on unseen target-language documents without any supervised fine-tuning on the target languages. We employ two state-of-the-art multilingual transformer models, XLM-RoBERTa (XLM-R) and Multilingual T5 (mT5), and evaluate their ability to generalize across linguistically distant languages. Experimental results show that XLM-R achieves higher average accuracy (≈78%) and F1 score (≈0.76) than mT5 (≈74% accuracy, ≈0.72 F1), indicating stronger cross-lingual transferability and stability. Both models exhibit efficient inference speed and manageable computational costs, indicating potential for deployment in resource-constrained environments. These findings provide an early benchmark for zero-shot multilingual document classification in Southeast Asian languages and highlight the feasibility of inclusive NLP systems that bridge the data gap for underrepresented linguistic communities.
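
The per-language evaluation summarized in the abstract can be illustrated with a minimal sketch. It assumes the model's predictions are already available as (language, gold label, predicted label) triples. The function names, the language codes, the toy labels, and the choice of macro-averaged F1 are illustrative assumptions; the abstract does not state which F1 averaging was used, and the toy data below is made up and unrelated to the reported results.

```python
from collections import defaultdict


def accuracy(gold, pred):
    """Fraction of documents whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def evaluate_by_language(examples):
    """Group (language, gold, predicted) triples and score each language separately."""
    grouped = defaultdict(lambda: ([], []))
    for lang, gold, pred in examples:
        grouped[lang][0].append(gold)
        grouped[lang][1].append(pred)
    return {
        lang: {"accuracy": accuracy(g, p), "macro_f1": macro_f1(g, p)}
        for lang, (g, p) in grouped.items()
    }


# Hypothetical predictions for Javanese (jv), Sundanese (su), and Malay (ms).
toy = [
    ("jv", "sports", "sports"),
    ("jv", "politics", "sports"),
    ("su", "sports", "sports"),
    ("su", "politics", "politics"),
    ("ms", "politics", "politics"),
    ("ms", "sports", "politics"),
]
results = evaluate_by_language(toy)
for lang, scores in results.items():
    print(lang, scores)
```

Scoring each target language separately, rather than pooling all documents, keeps a weak language from being hidden behind a strong one when averages such as those in the abstract are reported.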

Copyrights © 2025






Journal Info

Abbrev

jtie

Subject

Computer Science & IT

Description

Power Engineering, Telecommunication Engineering, Computer Engineering, Control and Computer Systems, Electronics, Information Technology, Informatics, Data and Software Engineering, Biomedical ...