PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND OFFICIAL STATISTICS
Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St

Automated Indonesian Text Augmentation with Web-Based Application Using Flask Framework

Iftitah Athiyyah Rahma (Politeknik Statistika STIS)
Lya Hulliyyatus Suadaa (Politeknik Statistika STIS)



Article Info

Publish Date
29 Dec 2023

Abstract

In real-world settings, the data and resources available for text classification are limited. One common issue with labelled data is class imbalance. Imbalanced data degrades model performance because the model focuses mainly on the majority label; as a result, accuracy alone cannot describe the true quality of the model. An oversampling approach can address this, and oversampling applied to text is known as text augmentation. However, NLP resources for Indonesian, especially for text augmentation, are still limited. This research therefore develops a web application that augments Indonesian text automatically. The application was built using the prototype method. It was successfully built and lets users run augmentation automatically on all texts in a dataset. Users select their preferred augmentation technique and upload a dataset as input. The output is the same dataset file as the input, with an additional column containing the synthetic text generated by the application. This application can contribute to further research on text augmentation for Indonesian.
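
The input/output contract described in the abstract (a dataset goes in, the same dataset comes out with one extra column of synthetic text) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the augmentation techniques or the column names, so `random_swap`, the `TECHNIQUES` registry, the `augment_csv` function, and the `augmented_text` column are all hypothetical, and the Flask upload layer is omitted so that the example runs with the standard library alone.

```python
import csv
import io
import random


def random_swap(text, n_swaps=1, seed=None):
    """Swap two randomly chosen word positions, n_swaps times.

    Placeholder technique for illustration only; the abstract does not
    say which augmentation techniques the application offers.
    """
    rng = random.Random(seed)
    words = text.split()
    if len(words) < 2:
        return text  # nothing to swap in a one-word text
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


# Hypothetical registry of user-selectable techniques.
TECHNIQUES = {"random_swap": random_swap}


def augment_csv(csv_text, text_column, technique="random_swap", seed=0):
    """Return the input CSV with an 'augmented_text' column appended."""
    augment = TECHNIQUES[technique]
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["augmented_text"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row_index, row in enumerate(reader):
        # Vary the seed per row so rows are not all shuffled identically.
        row["augmented_text"] = augment(row[text_column], seed=seed + row_index)
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "text,label\nsaya suka makan nasi goreng,pos\nbagus,pos\n"
    print(augment_csv(sample, text_column="text"))
```

In a web application, a function like `augment_csv` would sit behind the upload handler: the handler reads the uploaded file and the selected technique, calls the function, and returns the result as a downloadable file.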

Copyrights © 2023






Journal Info

Abbrev

icdsos

Publisher

Subject

Computer Science & IT

Description

International Conference on Data Science and Official Statistics (ICDSOS) 2023 is organized by Politeknik Statistika STIS and Statistics Indonesia (BPS). This international conference in collaboration with Forum Pendidikan Tinggi ...