Journal of Information Technology and Computer Science
Vol. 5 No. 3: December 2020

Utilizing Indonesian Universal Language Model Fine-tuning for Text Classification

Bunyamin, Hendra (Unknown)



Article Info

Publish Date
25 Jan 2021

Abstract

Inductive transfer learning has had a huge impact on the computer vision field. In particular, computer vision applications, including object detection, classification, and segmentation, are rarely trained from scratch; instead, they are fine-tuned from pretrained models, which are the products of learning from huge datasets. In contrast, state-of-the-art natural language processing models are still generally trained from the ground up. Accordingly, this research investigates the adoption of transfer learning for natural language processing. Specifically, we apply a transfer learning technique called Universal Language Model Fine-tuning (ULMFiT) to an Indonesian news text classification task. The dataset for constructing the language model was collected from several news providers from January to December 2017, whereas the dataset used for the text classification task comes from news articles provided by the Agency for the Assessment and Application of Technology (BPPT). To examine the impact of ULMFiT, we provide a baseline: a vanilla neural network with two hidden layers. Although the performance of ULMFiT on the validation set is lower than that of our baseline, we find that ULMFiT significantly reduces overfitting in the classification task: the difference between training and validation accuracies drops from 4% to nearly zero.
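
The abstract measures overfitting as the difference between training and validation accuracy. A minimal sketch of that measure follows; the function names `accuracy` and `overfitting_gap` and the toy labels are illustrative assumptions, not taken from the paper, and the printed values are not the paper's results.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def overfitting_gap(train_accuracy, validation_accuracy):
    """Return train accuracy minus validation accuracy.

    A large positive gap suggests the model fits the training set
    better than it generalizes; a gap near zero suggests little overfitting.
    """
    return train_accuracy - validation_accuracy


if __name__ == "__main__":
    # Hypothetical toy labels, used only to exercise the functions.
    train_acc = accuracy(["sports", "politics", "tech", "tech"],
                         ["sports", "politics", "tech", "tech"])
    valid_acc = accuracy(["sports", "tech", "tech", "politics"],
                         ["sports", "politics", "tech", "politics"])
    print(f"train={train_acc:.2f} validation={valid_acc:.2f} "
          f"gap={overfitting_gap(train_acc, valid_acc):.2f}")
```

Under this definition, the reported drop from 4% to nearly zero means the two accuracies became almost equal, even though the validation accuracy itself was lower than the baseline's.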

Copyrights © 2020






Journal Info

Abbrev

jitecs

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Engineering

Description

The Journal of Information Technology and Computer Science (JITeCS) is a peer-reviewed open access journal published by the Faculty of Computer Science, Universitas Brawijaya (UB), Indonesia. The journal is an archival journal serving scientists and engineers involved in all aspects of information ...