Articles

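Peramalan Produksi Perikanan Laut di Provinsi Jawa Tengah: Pendekatan Statistik dan Machine Learning [Forecasting Marine Fisheries Production in Central Java Province: Statistical and Machine Learning Approaches]
Amnur, Muh. Alfian; Pramana, Setia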
Seminar Nasional Official Statistics Vol 2025 No 1 (2025): Seminar Nasional Official Statistics 2025
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2025i1.2417

Abstract

Fisheries production in Central Java Province experiences seasonal fluctuations that affect supply stability and fishermen's income. This study aims to analyze the production trends from 2013 to 2023 and compare the performance of the SARIMA and Random Forest models in forecasting fishery production sold at Fish Auction Sites (TPI). Based on evaluation metrics including MAE, RMSE, and MAPE, the SARIMA(8,1,1)(1,1,0)[12] model demonstrated the best performance with values of 2930.12, 3749.83, and 15.40, respectively. Additionally, the SARIMA model was used to forecast production for January 2024, resulting in an estimated output of 26,210.63 tons. This forecast is expected to assist stakeholders in monitoring fishery production in Central Java Province.
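The three evaluation metrics named in this abstract can be illustrated with a minimal sketch. The function name `forecast_errors` and the sample figures are hypothetical and are not taken from the study; the formulas are the standard definitions of MAE, RMSE, and MAPE.

```python
import math


def forecast_errors(actual, predicted):
    """Return (MAE, RMSE, MAPE in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    # Mean absolute error: average size of the misses, in the data's units.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error: penalizes large misses more heavily.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Mean absolute percentage error: undefined if any actual value is zero.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return mae, rmse, mape


# Hypothetical monthly production figures (tons), not data from the study.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mae, rmse, mape = forecast_errors(actual, predicted)
print(f"MAE={mae:.2f} RMSE={rmse:.2f} MAPE={mape:.2f}%")
```

MAE and RMSE are expressed in the units of the series, while MAPE is a percentage, which is why the three values reported for one model can differ so much in magnitude.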
An Intelligent Conversational Agent Using Self-Reflective Retrieval-Augmented Generation for Enhanced Large Language Model Support in National Accounts Learning
Farhan, Muhammad; ., Yunofri; Tasriah, Etjih; Hulliyyatus Suadaa, Lya; Pramana, Setia
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2025i1.575

Abstract

BPS Statistics Indonesia plays a strategic role in compiling balance sheet statistics as the foundation for national policy analysis. This role requires a deep understanding of the concepts, definitions, and compilation standards outlined in the System of National Accounts (SNA) manual. However, in practice, comprehending such complex technical documents is not always straightforward. To address this challenge, this study proposes the development of an intelligent conversational agent in the form of a chatbot that implements the Self-Multimodal RAG approach. This approach integrates self-reflection mechanisms to generate more accurate and relevant responses. The evaluation was conducted using the LLM-as-a-Judge framework across four metrics: answer correctness, answer relevancy, context relevancy, and context faithfulness. Experimental results demonstrate that the Self-Reflective RAG achieved a score of 80% on the answer correctness metric, with competitive performance in terms of relevancy and faithfulness. From the chatbot implementation perspective, black-box testing confirmed that all functionalities operated as expected, while system usability testing using the CSUQ instrument yielded a score of 74.704%, indicating that the chatbot is well-accepted by users.
Business Description Categorization to the Five-Digit Indonesian Standard Classification of Business Field (KBLI) Using Machine Learning and Transfer Learning
Amnur, Muh. Alfian; Muhammad Gazali, La Ode; Mumtaz Siregar, Amir; Ariya Jalaksana, Faruq; Nisa Rahayu Ananda Suwendra, Made; Fadila Utami, Nurul; Median Ramadhan, Alif; Krisela Fabrianne, Elisse; Wirata Raja Panjaitan, Eurorea; Aini Izzati, Fitri; Bintang Yuliani Manalu, Jernita; Gilang Hidayat, Muhammad; Hulliyyatus Suadaa, Lya; Yuniarto, Budi; Pramana, Setia
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2025i1.719

Abstract

The Indonesian Standard Classification of Business Fields (KBLI) is essential for economic statistics, yet manual classification of business descriptions to five-digit KBLI codes is time-consuming and prone to inconsistencies. This study aims to develop and compare machine learning (Support Vector Machine and Random Forest) and transfer learning (IndoBERT) models for automating KBLI classification, supported by the preparation of synthetic and real-world datasets for model training. The synthetic data were generated using large language models, validated through human majority voting, and complemented with real-world data from the National Labor Force Survey (Sakernas) and the Micro and Small Industry Survey (IMK). The findings indicate that fine-tuned IndoBERT performed best, with an F1-score of 92.99% and an accuracy of 93.40% on synthetic data, alongside top-1, top-5, and top-10 accuracies of 32.93%, 54.71%, and 63.24% on real-world data. The deployment of fine-tuned IndoBERT as a RESTful API demonstrates its scalability and efficiency, presenting a reliable solution for large-scale KBLI classification in official statistics.
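The top-1, top-5, and top-10 accuracies reported here count a prediction as correct when the true code appears among the model's k highest-ranked candidates. A minimal sketch of that calculation follows; the function name `top_k_accuracy` and the placeholder labels are assumptions for illustration and do not come from the paper or represent real KBLI codes.

```python
def top_k_accuracy(true_labels, ranked_predictions, k):
    """Share of samples whose true label is among the first k ranked guesses."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(true_labels) != len(ranked_predictions) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for truth, ranked in zip(true_labels, ranked_predictions)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Placeholder labels standing in for classification codes (hypothetical data).
true_labels = ["c1", "c2", "c3", "c4"]
ranked_predictions = [
    ["c1", "c9", "c8"],  # correct at rank 1
    ["c7", "c2", "c9"],  # correct at rank 2
    ["c9", "c8", "c7"],  # true label not in the list
    ["c5", "c6", "c4"],  # correct at rank 3
]
top1 = top_k_accuracy(true_labels, ranked_predictions, 1)
top2 = top_k_accuracy(true_labels, ranked_predictions, 2)
top3 = top_k_accuracy(true_labels, ranked_predictions, 3)
print(top1, top2, top3)
```

Because each larger k only adds candidates, top-k accuracy can never decrease as k grows, which matches the rising top-1, top-5, and top-10 figures in the abstract.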
Sentiment Analysis on Overseas Tweets on the Impact of COVID-19 in Indonesia
Simanjuntak, Tigor Nirman; Pramana, Setia
Indonesian Journal of Statistics and Applications Vol 5 No 2 (2021)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v5i2p304-313

Abstract

This study analyzes the sentiment trend of tweets about COVID-19 in Indonesia posted by overseas Twitter accounts, from a big data perspective. The data were obtained from Twitter for April 2020, using the query "Indonesian Corona Virus" on English-language tweets from foreign user accounts. The tweets were retrieved by crawling the text through Twitter's API (Application Programming Interface) using the Python programming language. Twitter was chosen because information spreads quickly and easily through status updates among user accounts. A total of 8,740 tweets were obtained in text format, with a total engagement of 217,316. The data were sorted from the largest to the smallest engagement, then cleaned of unnecessary characters and symbols as well as typos and abbreviations. Sentiment classification into positive, negative, and neutral polarity was carried out with analytical tools, using text mining to extract information. To sharpen the analysis, only the cleaned tweets with at least 100 engagements were selected and then grouped into 30 sub-topics for analysis. The results show that most tweets and sub-topics were dominated by negative sentiment, and that some unexpected sub-topics were discussed by many users.
Development of Automated Environmental Data Collection System and Environment Statistics Dashboard
Paramartha, Dede Yoga; Fitriyani, Ana Lailatul; Pramana, Setia
Indonesian Journal of Statistics and Applications Vol 5 No 2 (2021)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v5i2p314-325

Abstract

Environmental data such as pollutants, temperature, and humidity play a role in the agricultural sector in predicting rainfall conditions. Pollutant data are also commonly used as a proxy for the density of industry and transportation. This creates a need for automated data collection from external websites that can provide data faster than satellite confirmation. Data sourced from IQAir can be used as a benchmark or confirmatory data for weather and environmental statistics in Indonesia. The data are collected by scraping the API available on the website. Scraping is divided into two stages: the first determines the locations in Indonesia, and the second collects statistics such as temperature, humidity, and pollutant data (AQI). The Python module used is scrapy, and crawling has been in effect since May 2020. The data are recorded every three hours for all regions of Indonesia and displayed directly on a Power BI-based dashboard. We also illustrate that AQI data can be used as a proxy for socio-economic activity and as an indicator for monitoring green growth in Indonesia.
Online Marketplace Data to Figure COVID-19 Impact on Micro and Small Retailers in Indonesia
Larasati, Dhiar Niken; Bustaman, Usman; Pramana, Setia
Indonesian Journal of Statistics and Applications Vol 5 No 2 (2021)
Publisher : Statistics and Data Science Program Study, SSMI, IPB University, in collaboration with the Forum Pendidikan Tinggi Statistika Indonesia (FORSTAT) and the Ikatan Statistisi Indonesia (ISI)

DOI: 10.29244/ijsa.v5i2p333-342

Abstract

The COVID-19 outbreak is not only a health crisis but also a social and economic crisis all over the world. In Indonesia, the outbreak has shaken almost all business sectors; however, it seems to bring a silver lining for the e-commerce sector, since the pandemic has fostered online shopping habits. During the pandemic, information on the impact of COVID-19 on the Indonesian economy needs to be updated regularly to support quick policymaking. Big data therefore plays an important role in providing this information relatively fast. This paper describes how big data, specifically marketplace data, can be used to gauge the impact of the COVID-19 outbreak on micro and small retailers in Indonesia. The dataset was collected regularly from a marketplace website in Indonesia from January to June 2020. To see how sales changed during the COVID-19 period, sales before and after the implementation of the social distancing policy are compared. The results show that the online marketplace in Indonesia is dominated by micro retailers, based on the number of products sold in the marketplace. The total revenue of micro retailers increased significantly during the pandemic, whereas the increase in total revenue for medium retailers was lower than that of micro retailers. This is a positive sign for the growth of micro retailers in the online marketplace.
Optimized AIS-Derived Indicators through Application of Port Administrative Boundaries
Nastiar, Gery; Pramana, Setia; Krismawati, Dewi
Jurnal Penelitian Transportasi Laut Vol. 27 No. 2 (2025): Jurnal Penelitian Transportasi Laut
Publisher : Sekretariat Badan Kebijakan Transportasi, Formerly by Puslitbang Transportasi Laut, Sungai, Danau, dan Penyeberangan

DOI: 10.25104/transla.v27i2.2429

Abstract

Defining port boundaries for Automatic Identification System (AIS) data processing is challenging, since docks and berthing facilities are relatively easy to define, while anchorage zones in open waters are often less clearly delineated. This study uses administrative boundaries, known as Daerah Lingkungan Kerja Pelabuhan (Port Working Area, DLKr), to represent port areas as the Area of Interest (AoI) in AIS data processing. With DLKr, port boundaries can be clearly outlined and divided into several zones, enabling more accurate calculation of AIS-derived indicators. The research estimates port calls and port stay durations using the Stationary Marine Broadcast Method (SMBM), focusing on cargo, tanker, and passenger vessels at the Port of Tanjung Priok during 2024. The AIS-derived indicators generated with DLKr as the AoI show good performance, with an RMSE of 55.94 and a MAPE of 5.27%. In addition, the results are supported by the similarity in patterns and distribution of port call values when compared with the official maritime statistics provided by the Ministry of Transportation.
Hybrid Machine Learning to Evaluate the Incidence of Toddler Stunting through Integration of Multi-source Satellite Imagery and Official Statistics in East Nusa Tenggara Province
Suhendra Widi Prayoga; Setia Pramana
IJCONSIST JOURNALS Vol 6 No 1 (2024): September
Publisher : International Journal of Computer, Network Security and Information System

DOI: 10.33005/ijconsist.v6i1.112

Abstract

Stunting is a serious health problem that impacts the quality of life of children under five. In 2023, East Nusa Tenggara recorded the second highest prevalence of stunting in Indonesia, influenced by health, socio-economic and environmental factors. In terms of the environment, remote sensing technology can be utilised to monitor environmental factors that contribute to stunting, such as vegetation conditions, access to clean water, and soil conditions. This study aims to evaluate the incidence of stunting among children under five using a hybrid machine learning approach, combining predictive modeling and cluster analysis. The results indicate that eXtreme Gradient Boosting Regressor (XGBR) is the best model for estimating stunting prevalence, with a Root Mean Squared Error (RMSE) of 3.2076 and an R² value of 0.7223. Meanwhile, for clustering results, K-Means Clustering is identified as the most effective method for grouping districts/cities based on socioeconomic and environmental factors. The clustering process produced two groups, vulnerable (Cluster 1) and highly vulnerable (Cluster 2), with connectivity, Dunn Index, and silhouette coefficient values of 29.290, 0.6931, and 0.4509, respectively. These findings are expected to serve as a basis for policymakers in formulating targeted interventions to reduce stunting rates, particularly in highly vulnerable areas.
Co-Authors ., Yunofri Achmad Fauzi Bagus Firmansyah Addin Maulana Aditama, Farhan Satria Aini Izzati, Fitri Alifatri, La Ode Alistin, Zharifah Dhiya Ayu Amnur, Muh. Alfian Ana Lailatul Fitriyani Ana Lailatul Fitriyani Anang Kurnia Arie Wahyu Wijayanto Arif Handoyo Marsuhandi Ariya Jalaksana, Faruq Arkandana, M. Tharif Astrinariswari Rahmadian Prasetyo Astuti, Erni Tri Bintang Yuliani Manalu, Jernita Busaina, Ladisa Bustaman, Usman Charvia Ismi Zahrani Cholifa Fitri Annisa Dandy Adetiar Al Rizki Dede Yoga Paramartha Dede Yoga Paramartha Deli, Nensi Fitria Dewi Krismawati Dewi Krismawati Dhiar Niken Larasati Diory Paulus Pamanik Erni Tri Astuti Erwin Tanur Fadila Utami, Nurul Fajar Fathur Rachman Fajar Fatur Rachman Farakh Khoirotun Nasida Farhan Y. Hidayat Fitriyani, Ana Lailatul Fitriyyah, Nur Retno Geri Yesa Ermawan Gilang Hidayat, Muhammad Hady Suryono Hanafi, Zulfaning Tyas Hardiyanta, I Komang Y. Hendrawan, Daffa Hidayat, Farhan Y. Hizir Sofyan Hulliyyatus Suadaa, Lya I Komang Y. Hardiyanta I Nyoman Setiawan Imam Habib Pamungkas Jane, Giani Jovita Khairani, Fitri Krisela Fabrianne, Elisse Krismawati, Dewi Ladisa Busaina Larasati, Dhiar Niken Linta Ifada Linta Ifada Maftukhatul Qomariyah Virati Magfirah, Deanty Fatihatul Mariel, Wahyu Calvin Frans Maulana Faris Median Ramadhan, Alif Muhammad Farhan Muhammad Gazali, La Ode Muhammad Nur Aidi Muhammad Tharif Arkandana Mumtaz Siregar, Amir Munaf, Alfatihah Reno Maulani Nuryaningsih Soekri Putri Nasiya Alifah Utami Nastiar, Gery Nazuli, Muhammad Fachry Nensi Fitria Deli Nisa Rahayu Ananda Suwendra, Made Nora Dzulvawan Nurmalasari, Mieke Nurtia Nurtia Nurwijayanti Oktari, Rina S. Panuntun, Satria Bagus Paramartha, Dede Yoga Putro, Dimas Hutomo Rahman, Dimas Haafizh Rahmaniar, Masna Novita Rifqi Ramadhan Rina S. 
Oktari Rini Rahani Rutba, Sita Aliya Safrizal Rahman Safrizal Rahman, Safrizal Salim Satriajati Salwa Rizqina Putri Satria Bagus Panuntun Satria Bagus Panuntun Satria Bagus Panuntun Satria Bagus Panuntun Silalahi, Agatha Simanjuntak, Tigor Nirman Siswantining, Titin SITI MARIYAH Siti Mariyah Soemarso, Ditoprasetyo Rusharsono Suadaa, Lya Hulliyyatus Sugiri Suhendra Widi Prayoga Takdir Tasriah, Etjih Thosan Girisona Suganda Thosan Girisona Suganda Tigor Nirman Simanjuntak Usman Bustaman Usman Bustaman Utami, Nandya Rezky Wahyu Calvin Frans Mariel Wirata Raja Panjaitan, Eurorea Wiwin Srimulyani Yeni Rimadeni Yuniarti Yuniarti Yuniarti Yuniarti Yuniarto, Budi Zen, Rizqi Annisa