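Peramalan Produksi Perikanan Laut di Provinsi Jawa Tengah: Pendekatan Statistik dan Machine Learning [Forecasting Marine Fisheries Production in Central Java Province: Statistical and Machine Learning Approaches]
Authors : Amnur, Muh. Alfian; Pramana, Setia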
Seminar Nasional Official Statistics Vol 2025 No 1 (2025): Seminar Nasional Official Statistics 2025
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2025i1.2417

Abstract

Fisheries production in Central Java Province experiences seasonal fluctuations that affect supply stability and fishermen's income. This study aims to analyze the production trends from 2013 to 2023 and compare the performance of the SARIMA and Random Forest models in forecasting fishery production sold at Fish Auction Sites (TPI). Based on evaluation metrics including MAE, RMSE, and MAPE, the SARIMA(8,1,1)(1,1,0)[12] model demonstrated the best performance with values of 2930.12, 3749.83, and 15.40, respectively. Additionally, the SARIMA model was used to forecast production for January 2024, resulting in an estimated output of 26,210.63 tons. This forecast is expected to assist stakeholders in monitoring fishery production in Central Java Province.
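
The three evaluation metrics named in this abstract (MAE, RMSE, and MAPE) can be illustrated with a minimal sketch. The function name `forecast_errors` and its inputs are assumptions made for illustration; this is the standard textbook form of each metric, not the authors' code, and it does not reproduce the values reported in the abstract.

```python
import math


def forecast_errors(actual, predicted):
    """Return (MAE, RMSE, MAPE in percent) for two equal-length sequences.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Mean absolute error: average size of the forecast error.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error: penalizes large errors more heavily.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Mean absolute percentage error: error relative to the actual value.
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return mae, rmse, mape
```

Calling `forecast_errors(observed, forecast)` on a held-out test period gives one triple per model, which is how two models such as SARIMA and Random Forest could be compared on the same data.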

Pembangunan Dataset Sintetis Klasifikasi Baku Lapangan Usaha Indonesia 2020 dengan Generative Artificial Intelligence [Building a Synthetic Dataset for the 2020 Indonesian Standard Industrial Classification (KBLI) with Generative Artificial Intelligence]
Authors : Silmi Kaffah, M. Ihsan; Rahman, Dimas Haafizh; Amnur, Muh. Alfian; Montolalu, Cloudya Qashwah; Siregar, Amir Mumtaz; Sinulingga, Geraldo Benedictus; Ayu Alistin, Zharifah Dhiya; Raihannur, Cut Indah; Putri Arivia, Anggi Marya; Rahmawati, Arih; Nauli Sihombing, Fiona Audia; Salsabiela, Rahmadika Kemala; Bahy, Sabastian Alfons; Suadaa, Lya Hulliyyatus; Choir, Achmad Syahrul
Seminar Nasional Official Statistics Vol 2025 No 1 (2025): Seminar Nasional Official Statistics 2025
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2025i1.2581

Abstract

The limited availability of quality datasets is a fundamental challenge in developing automatic classification of business descriptions into the Indonesian Standard Industrial Classification (KBLI) using machine learning models. This research aims to develop a synthetic KBLI dataset using generative AI, via the ChatGPT chatbot, with a one-shot prompting technique. This technique is employed to generate business descriptions based on five-digit KBLI codes in order to address the limitations of labeled data and the variability of existing business descriptions. Manual validation of the dataset generated through prompt engineering shows that 93.25% of the business descriptions align with the established KBLI standards. The average number of business descriptions per category demonstrates a fairly uniform distribution, ensuring sufficient representation for each five-digit code. This research makes a significant contribution by providing a dataset for training machine learning models in the automatic classification of business descriptions into the five-digit KBLI categories.
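
As a rough sketch of what one-shot prompting means here, the snippet below assembles a prompt that contains exactly one worked example before the actual request. The function name `build_one_shot_prompt`, the prompt wording, and every argument are hypothetical; the abstract does not give the authors' actual prompt.

```python
def build_one_shot_prompt(example_code, example_title, example_description,
                          target_code, target_title, n_descriptions=5):
    """Return a prompt string with one worked example (one-shot).

    All wording below is an assumption for illustration, not the prompt
    used in the paper.
    """
    lines = [
        "You write short business descriptions for a given KBLI category.",
        "",
        # The single demonstration that makes the prompt "one-shot".
        "Example:",
        f"KBLI code: {example_code}",
        f"Category: {example_title}",
        f"Description: {example_description}",
        "",
        # The actual request, phrased in the same layout as the example.
        "Task:",
        f"KBLI code: {target_code}",
        f"Category: {target_title}",
        f"Write {n_descriptions} distinct business descriptions, one per line.",
    ]
    return "\n".join(lines)
```

The returned string would then be sent to the chatbot once per five-digit code, and the generated descriptions checked manually against the KBLI standard, as the abstract describes.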

Business Description Categorization to the Five-Digit Indonesian Standard Classification of Business Field (KBLI) Using Machine Learning and Transfer Learning
Authors : Amnur, Muh. Alfian; Muhammad Gazali, La Ode; Mumtaz Siregar, Amir; Ariya Jalaksana, Faruq; Nisa Rahayu Ananda Suwendra, Made; Fadila Utami, Nurul; Median Ramadhan, Alif; Krisela Fabrianne, Elisse; Wirata Raja Panjaitan, Eurorea; Aini Izzati, Fitri; Bintang Yuliani Manalu, Jernita; Gilang Hidayat, Muhammad; Hulliyyatus Suadaa, Lya; Yuniarto, Budi; Pramana, Setia
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2025i1.719

Abstract

The Indonesian Standard Classification of Business Fields (KBLI) is essential for economic statistics, yet manual classification of business descriptions to five-digit KBLI codes is time-consuming and prone to inconsistencies. This study aims to develop and compare machine learning (Support Vector Machine and Random Forest) and transfer learning (IndoBERT) models for automating KBLI classification, supported by the preparation of synthetic and real-world datasets for model training. The synthetic data were generated using large language models, validated through human majority voting, and complemented with real-world data from the National Labor Force Survey (Sakernas) and the Micro and Small Industry Survey (IMK). The findings indicate that fine-tuned IndoBERT performed best, with an F1-score of 92.99% and an accuracy of 93.40% on synthetic data, and top-1, top-5, and top-10 accuracies of 32.93%, 54.71%, and 63.24% on real-world data. The deployment of fine-tuned IndoBERT as a RESTful API demonstrates its scalability and efficiency, presenting a reliable solution for large-scale KBLI classification in official statistics.
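
The top-1, top-5, and top-10 accuracies reported in this abstract count a prediction as correct when the true code appears among the model's k highest-ranked codes. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the input layout (one ranked list of codes per description) are assumptions for illustration, not the authors' evaluation code.

```python
def top_k_accuracy(true_labels, ranked_predictions, k):
    """Fraction of items whose true label is among the first k predictions.

    ranked_predictions[i] is a list of candidate labels for item i, ordered
    from most to least likely. Hypothetical helper, not from the paper.
    """
    if len(true_labels) != len(ranked_predictions) or len(true_labels) == 0:
        raise ValueError("inputs must be non-empty and equal in length")
    if k < 1:
        raise ValueError("k must be at least 1")

    # An item is a hit when its true label appears in the top-k slice.
    hits = sum(
        1 for truth, ranking in zip(true_labels, ranked_predictions)
        if truth in ranking[:k]
    )
    return hits / len(true_labels)
```

Because the top-k slice grows with k, the metric can only stay the same or rise as k increases, which matches the ordering of the three reported real-world figures.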