Sinulingga, Geraldo Benedictus
Unknown Affiliation

Published: 1 document
Articles


Pembangunan Dataset Sintetis Klasifikasi Baku Lapangan Usaha Indonesia 2020 dengan Generative Artificial Intelligence (Development of a Synthetic Dataset for the 2020 Indonesia Standard Industrial Classification Using Generative Artificial Intelligence)
Silmi Kaffah, M. Ihsan; Rahman, Dimas Haafizh; Amnur, Muh. Alfian; Montolalu, Cloudya Qashwah; Siregar, Amir Mumtaz; Sinulingga, Geraldo Benedictus; Ayu Alistin, Zharifah Dhiya; Raihannur, Cut Indah; Putri Arivia, Anggi Marya; Rahmawati, Arih; Nauli Sihombing, Fiona Audia; Salsabiela, Rahmadika Kemala; Bahy, Sabastian Alfons; Suadaa, Lya Hulliyyatus; Choir, Achmad Syahrul
Seminar Nasional Official Statistics Vol 2025 No 1 (2025): Seminar Nasional Official Statistics 2025
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2025i1.2581

Abstract

The limited availability of high-quality datasets is a fundamental challenge in developing machine learning models that automatically classify business descriptions into the Indonesia Standard Industrial Classification (KBLI). This research aims to develop a synthetic KBLI dataset using generative AI, specifically the ChatGPT chatbot with a one-shot prompting technique. The technique is used to generate business descriptions from five-digit KBLI codes, addressing both the scarcity of labeled data and the variability of existing business descriptions. Manual validation of the dataset produced through prompt engineering shows that 93.25% of the business descriptions align with the established KBLI standards. Business descriptions are distributed fairly uniformly across categories, ensuring sufficient representation for each five-digit code. This research contributes a dataset for training machine learning models to automatically classify business descriptions into five-digit KBLI categories.
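
The one-shot prompting and manual validation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the prompt wording, and the example category titles and description are all assumptions introduced here, and the KBLI codes shown should be checked against the official KBLI 2020 listing before use. No chatbot call is made; the sketch only builds the prompt text and computes an alignment percentage from manual review labels.

```python
import re


def build_one_shot_prompt(code, title, example, n_descriptions=5):
    """Build a one-shot prompt: one worked example, then the target category.

    `example` is a (code, title, description) tuple used as the single
    demonstration. The prompt wording is hypothetical, not the paper's.
    """
    if not re.fullmatch(r"\d{5}", code):
        raise ValueError(f"expected a five-digit KBLI code, got {code!r}")
    ex_code, ex_title, ex_description = example
    return (
        "You write short business descriptions for KBLI 2020 categories.\n\n"
        "Example\n"
        f"KBLI code: {ex_code}\n"
        f"Category: {ex_title}\n"
        f"Business description: {ex_description}\n\n"
        "Task\n"
        f"KBLI code: {code}\n"
        f"Category: {title}\n"
        f"Write {n_descriptions} distinct business descriptions, one per line."
    )


def alignment_rate(is_aligned):
    """Percentage of generated descriptions a reviewer marked as aligned."""
    if not is_aligned:
        raise ValueError("no validation labels given")
    return 100.0 * sum(is_aligned) / len(is_aligned)


# Illustrative values only (assumed, not taken from the paper's dataset).
example = ("01111", "Pertanian Jagung", "Menanam dan memanen jagung di lahan sendiri")
prompt = build_one_shot_prompt("01113", "Pertanian Kedelai", example)

# Four manually reviewed descriptions, three judged aligned.
rate = alignment_rate([True, True, True, False])  # 75.0
```

The returned prompt string would be pasted into, or sent to, the chatbot, and the reviewer's true/false judgments on the generated descriptions feed `alignment_rate`, which is the kind of figure the abstract reports as 93.25%.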