BAREKENG: Jurnal Ilmu Matematika dan Terapan

Vol 20 No 3 (2026): BAREKENG: Journal of Mathematics and Its Application

Uswatun Hasanah (Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia)
Agus Mohamad Soleh (Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia)
Cici Suhaeni (Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia)
Anwar Fitrianto (Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia)

Publish Date
08 Apr 2026

Multiclass text classification remains a difficult task, primarily due to semantic ambiguity and differences in input length. This study evaluates RoBERTa and GPT-based models for multiclass text classification, focusing on how prompting strategies and document length affect accuracy and robustness. Experiments were conducted using the OSDG Community Dataset, which contains approximately 15,000 labeled samples. The dataset was partitioned into four subsets based on input length: short, medium, long, and all combined. Three GPT variants (zero-shot, few-shot, and fine-tuned) were compared against a RoBERTa baseline. Fine-tuning was implemented via OpenAI’s supervised API with prompt-response formatting. Performance was assessed through F1-score, precision, recall, and balanced accuracy. Fine-tuned GPT achieved the strongest results in all settings, with a macro F1-score of 0.9204 on the all-combined dataset, representing a 4.61% improvement over RoBERTa. Consistent gains were also observed across short (8.63%), medium (3.83%), and long (20.31%) texts. The largest improvement occurred on long documents, while medium-length inputs provided the most stable performance across models. These findings highlight the effectiveness of task-specific fine-tuning in enhancing GPT’s capability to classify SDG-related texts across diverse input lengths.
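
The abstract reports macro F1-score and balanced accuracy without defining them. As a minimal sketch of the standard definitions (not the authors' evaluation code), both can be computed from parallel lists of true and predicted labels. The function names and the toy labels below are illustrative assumptions and do not come from the study.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    labels = set(y_true) | set(y_pred)
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / support[c] for c in support) / len(support)


# Hypothetical toy labels standing in for SDG class ids (not data from the paper).
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 3, 3]
print(round(macro_f1(y_true, y_pred), 4))
print(round(balanced_accuracy(y_true, y_pred), 4))
```

Both metrics weight every class equally, so a model cannot score well by favoring the most frequent classes, which is why they suit a multiclass comparison like this one.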

Journal Info

BAREKENG: Jurnal Ilmu Matematika dan Terapan

Abbrev

barekeng

Publisher

Universitas Pattimura

Subject

Computer Science & IT; Control & Systems Engineering; Economics, Econometrics & Finance; Energy; Engineering; Mathematics; Mechanical Engineering; Physics; Transportation

Description

BAREKENG: Jurnal Ilmu Matematika dan Terapan is a scientific journal that publishes articles reporting the results of research or study in Pure Mathematics and Applied Mathematics. The focus and scope of BAREKENG: Jurnal Ilmu Matematika dan Terapan are as follows: - Pure ...

EVALUATING ROBERTA AND GPT-BASED MODELS FOR SDG MULTICLASS TEXT CLASSIFICATION ACROSS DIFFERENT DOCUMENT LENGTHS
