Articles

Prediction of Main Transportation Modes using Passive Mobile Positioning Data (Passive MPD) Farhan, Muhammad; Suadaa, Lya Hulliyyatus; Sugiri; Munaf, Alfatihah Reno Maulani Nuryaningsih Soekri Putri; Pramana, Setia
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 9 No 1 (2025): February 2025
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v9i1.6128

Abstract

Indicators of the main mode of transportation used by domestic tourists during tourism trips cannot yet be estimated from Passive MPD, which is recorded based on the location of the BTS that captures the tourists' cellular activity. Previous research on identifying transportation modes from Passive MPD is limited because it relies only on speed and travel-time features. Active MPD, by contrast, is recorded using active, real-time geo-positioning; research on it involves many features, and its data structure is similar to that of Passive MPD. This research therefore studies whether the method used to identify transportation modes in Active MPD can be applied to Passive MPD as an approach to predicting the main mode of transportation. The results show that the Active MPD identification method can be implemented on Passive MPD. The best accuracy, 83.56%, was obtained by the LightGBM model using all features. However, the Multinomial Logistic Regression model, which uses only 10 selected features, is the most effective and efficient model, with an accuracy of 76.43% and a much shorter execution time.
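The speed and travel-time features mentioned in this abstract can be illustrated with a minimal sketch. It assumes each positioning record is a hypothetical `(latitude, longitude, unix_seconds)` tuple giving the BTS location and the event time; the record layout, function names, and coordinates are illustrative assumptions, not the paper's actual pipeline.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def speed_kmh(rec_a, rec_b):
    """Average speed between two records shaped (lat, lon, unix_seconds)."""
    hours = (rec_b[2] - rec_a[2]) / 3600.0
    if hours <= 0:
        raise ValueError("records must be in strictly increasing time order")
    return haversine_km(rec_a[0], rec_a[1], rec_b[0], rec_b[1]) / hours


# Hypothetical pair of tower sightings two hours apart (made-up coordinates).
first = (-6.2, 106.8, 0)
second = (-6.9, 107.6, 7200)
print(round(speed_kmh(first, second), 1), "km/h")
```

Because Passive MPD records the tower location rather than the handset position, a speed derived this way is only a coarse proxy, which is consistent with the abstract's point that speed and travel time alone are limiting.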
Hybrid Machine Learning to Evaluate the Incidence of Toddler Stunting through Integration of Multi-source Satellite Imagery and Official Statistics in East Nusa Tenggara Province Suhendra Widi Prayoga; Setia Pramana
IJCONSIST JOURNALS Vol 6 No 1 (2024): September
Publisher : International Journal of Computer, Network Security and Information System

DOI: 10.33005/ijconsist.v6i1.112

Abstract

Stunting is a serious health problem that impacts the quality of life of children under five. In 2023, East Nusa Tenggara recorded the second highest prevalence of stunting in Indonesia, influenced by health, socio-economic and environmental factors. In terms of the environment, remote sensing technology can be utilised to monitor environmental factors that contribute to stunting, such as vegetation conditions, access to clean water, and soil conditions. This study aims to evaluate the incidence of stunting among children under five using a hybrid machine learning approach, combining predictive modeling and cluster analysis. The results indicate that eXtreme Gradient Boosting Regressor (XGBR) is the best model for estimating stunting prevalence, with a Root Mean Squared Error (RMSE) of 3.2076 and an R² value of 0.7223. For clustering, K-Means Clustering is identified as the most effective method for grouping districts/cities based on socioeconomic and environmental factors. The clustering process produced two groups, vulnerable (Cluster 1) and highly vulnerable (Cluster 2), with connectivity, Dunn Index, and silhouette coefficient values of 29.290, 0.6931, and 0.4509, respectively. These findings are expected to serve as a basis for policymakers in formulating targeted interventions to reduce stunting rates, particularly in highly vulnerable areas.
Coastal Ecosystem Classification Using Satellite-Based Machine Learning Approaches Jane, Giani Jovita; Alifatri, La Ode; Tasriah, Etjih; Pramana, Setia
Jambura Journal of Biomathematics (JJBM) Volume 6, Issue 2: June 2025
Publisher : Department of Mathematics, Universitas Negeri Gorontalo

DOI: 10.37905/jjbm.v6i2.30466

Abstract

As an archipelagic country rich in natural resources, Indonesia has great marine economic potential. To sustain this potential over the long term, the blue economy is needed as a concept for setting development programs and public policy. One way to implement the concept is to compile ocean accounts, whose framework implements the blue economy concept in the form of environmental accounts. Ocean accounts can be regarded as supporting the formation of a country's national policies and programs. Accurate spatial data that reflect current conditions are therefore essential for compiling these accounts. However, collecting such data can be costly and resource-intensive, which makes it a challenge to ensure that current and accurate information is available. In this context, alternative data sources can offer a viable solution. Previous research has shown that machine learning modelling combined with Sentinel-1 and Sentinel-2 satellite imagery can map coastal areas such as intertidal and benthic zones. This study therefore classifies the coastal ecosystems of Karimunjawa National Park using Sentinel-1 and Sentinel-2 imagery, compares the classification results of three machine learning methods, namely Random Forest (RF), Support Vector Classification (SVC), and Extreme Gradient Boosting (XGBoost), and analyzes ecosystem change between 2020 and 2023. The results show that RF gave the best classification results for the benthic zone, reaching 0.77 and 0.78 in F1 score and Matthews Correlation Coefficient (MCC), while the SVC model gave the best results for the intertidal zone, reaching 0.83 in both F1 score and MCC. Furthermore, the area of coral reefs and seagrass beds decreased by 6.524 km² and 1.39 km², respectively, while the area of mangroves, built-up land, and forest showed little change.
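As a reading aid for the F1 score and Matthews Correlation Coefficient (MCC) reported in this abstract, the sketch below computes both from confusion-matrix counts for a single class treated one-vs-rest. The counts are made up, and the abstract does not say how per-class scores were averaged, so this illustrates the metrics only and does not reproduce the paper's evaluation.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical pixel counts for one class (for example, seagrass versus everything else).
tp, tn, fp, fn = 40, 45, 5, 10
print("F1 :", round(f1_score(tp, fp, fn), 4))
print("MCC:", round(mcc(tp, tn, fp, fn), 4))
```

Unlike F1, MCC also uses the true negatives, so it tends to stay informative when one class dominates the scene, which is a plausible reason to report both.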
INCORPORATING COMPLEX SURVEY DESIGN FOR ANALYSING THE DETERMINANT OF WOMEN IN REPRODUCTIVE AGE PARTICIPATION IN FAMILY PLANNING PROGRAM IN INDONESIA Astuti, Erni Tri; Rahani, Rini; Pramana, Setia
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 3 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss3pp1401-1410

Abstract

Data generated from complex surveys are often treated by analysts as unweighted simple random samples. This is unfortunate because units have different probabilities of selection at each stage of a complex survey design. Failing to take this into account can seriously affect parameter and variance estimation. This paper examines the relationship between participation in the family planning program and the socio-demographic status of women of reproductive age in Indonesia, using data from the latest Indonesia Demographic and Health Survey (IDHS). IDHS employs a multi-stage stratified sampling design, so the public-use IDHS datasets include a number of weights to account for this complex sample design. We found that the complex design features of the IDHS increased the variance estimates of the estimated parameters in the logistic regression models by about 1.325 to 1.88 times compared to simple random sampling. Therefore, using variances estimated from unweighted simple random samples would lead to wrong conclusions about the significance of the model parameters. The results also show that all of the socio-demographic variables used as predictors are significant. Women who have a moderate education, are unemployed, are exposed to media, live in a rural community, are wealthy, and have a spouse with a moderate education and a job tend to participate in the family planning program.
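The variance inflation described in this abstract is commonly summarized as a design effect: the ratio of the design-based variance to the variance that simple random sampling would give for the same estimator. The sketch below shows how a ratio in the reported range of 1.325 to 1.88 changes a standard error and a z statistic. The coefficient (0.25) and the SRS standard error (0.10) are hypothetical values chosen for illustration and do not come from the paper.

```python
import math


def design_effect(var_complex, var_srs):
    """Ratio of the design-based variance to the SRS variance of one estimator."""
    if var_srs <= 0:
        raise ValueError("var_srs must be positive")
    return var_complex / var_srs


def inflate_standard_error(se_srs, deff):
    """Standard error after accounting for the design: SE_srs * sqrt(deff)."""
    return se_srs * math.sqrt(deff)


# Hypothetical logistic regression coefficient and its naive (SRS) standard error.
coef, se_srs = 0.25, 0.10
print("naive z:", round(coef / se_srs, 3))
for deff in (1.325, 1.88):  # the two ends of the range quoted in the abstract
    se = inflate_standard_error(se_srs, deff)
    print("deff", deff, "-> SE", round(se, 4), "z", round(coef / se, 3))
```

With these made-up numbers the z statistic stays above 1.96 at the lower end of the range but drops below it at the upper end, which is the kind of changed significance conclusion the abstract warns about.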
Genetic Cluster Analysis of Insulin Resistance Using KNN Imputation and FABIA-CCA Biclustering Soemarso, Ditoprasetyo Rusharsono; Siswantining, Titin; Pramana, Setia
Enthusiastic : International Journal of Applied Statistics and Data Science Volume 5 Issue 1, April 2025
Publisher : Universitas Islam Indonesia

DOI: 10.20885/enthusiastic.vol5.iss1.art10

Abstract

Type 2 diabetes mellitus (T2DM) is a metabolic disorder primarily driven by insulin resistance, involving complex genetic regulation. Understanding the molecular mechanisms underlying insulin resistance is crucial for identifying therapeutic targets. This study compared the performance of two biclustering algorithms, factor analysis for bicluster acquisition (FABIA) and the Cheng and Church algorithm (CCA), in analyzing gene expression data associated with insulin resistance. Using the GSE19420 dataset, simulated missing values were introduced to evaluate the robustness of both methods. Results showed that CCA consistently achieved lower mean squared error (MSE) in reconstructing gene expression patterns, suggesting higher accuracy in capturing co-expression structures. Nevertheless, FABIA effectively detected sparse, biologically relevant clusters. Notably, key genes such as MYO5B, DLG2, AXIN2, and PTK7 were identified within the biclusters, supporting their involvement in insulin signaling and metabolic regulation. These findings underscore the need to select biclustering methods that align with specific analytical goals and offer insights into gene networks involved in insulin resistance.
Analisis Sentimen dan Pemodelan Topik Opini Publik Terkait Data Badan Pusat Statistik Tahun 2024 [Sentiment Analysis and Topic Modeling of Public Opinion on Badan Pusat Statistik Data in 2024] Rahman, Dimas Haafizh; Alistin, Zharifah Dhiya Ayu; Pramana, Setia
Seminar Nasional Official Statistics Vol 2025 No 1 (2025): Seminar Nasional Official Statistics 2025
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2025i1.2365

Abstract

The role of BPS has become increasingly crucial as the demand for data and the number of data sources have grown over time. The quality of BPS data is evaluated through the Data Needs Survey (SKD). The 2024 SKD indicates that 98.16% of consumers are satisfied with the quality of BPS data. However, this evaluation only involved data consumers from BPS PST, and there remains a time gap between the implementation of the survey and the dissemination of its results. Social media platform X, which is popular in Indonesia, allows its users to express their opinions through tweets. This research aims to understand public sentiment, identify the best classification model, and discover the topics the public discussed regarding BPS data, based on tweets from the X platform in 2024. The tweets were labeled and preprocessed before Machine Learning methods were applied to classify public sentiment. The Support Vector Machine (SVM) method, with a weighted average of 0.68, performed best compared to Naïve Bayes, Rocchio Classification, and K-NN in modeling public opinion sentiment. LSA and LDA uncovered topics consisting of public opinions and issues related to BPS data, such as poverty rate manipulation and BPS data as a credible source.
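Peramalan Produksi Perikanan Laut di Provinsi Jawa Tengah: Pendekatan Statistik dan Machine Learning [Forecasting Marine Fisheries Production in Central Java Province: Statistical and Machine Learning Approaches] Amnur, Muh. Alfian; Pramana, Setia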
Seminar Nasional Official Statistics Vol 2025 No 1 (2025): Seminar Nasional Official Statistics 2025
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2025i1.2417

Abstract

Fisheries production in Central Java Province experiences seasonal fluctuations that affect supply stability and fishermen's income. This study aims to analyze the production trends from 2013 to 2023 and compare the performance of the SARIMA and Random Forest models in forecasting fishery production sold at Fish Auction Sites (TPI). Based on evaluation metrics including MAE, RMSE, and MAPE, the SARIMA(8,1,1)(1,1,0)[12] model demonstrated the best performance with values of 2930.12, 3749.83, and 15.40, respectively. Additionally, the SARIMA model was used to forecast production for January 2024, resulting in an estimated output of 26,210.63 tons. This forecast is expected to assist stakeholders in monitoring fishery production in Central Java Province.
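The three evaluation metrics named in this abstract (MAE, RMSE, and MAPE) can be written out as a short sketch. The definitions below are the standard ones, with MAPE expressed as a percentage, which is an assumption about how the reported value of 15.40 is scaled. The monthly figures are made up and are not the study's data.

```python
import math


def mae(actual, forecast):
    """Mean absolute error, in the same units as the data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error; penalizes large misses more than MAE does."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent; needs non-zero actual values."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical monthly production (tons) and forecasts for the same months.
actual = [24000.0, 26500.0, 21000.0]
forecast = [25500.0, 24000.0, 22500.0]
print(round(mae(actual, forecast), 2),
      round(rmse(actual, forecast), 2),
      round(mape(actual, forecast), 2))
```

MAE and RMSE are in the units of the series (tons here), while MAPE is scale-free, which is why the three reported values differ so much in magnitude.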
An Intelligent Conversational Agent Using Self-Reflective Retrieval-Augmented Generation for Enhanced Large Language Model Support in National Accounts Learning Farhan, Muhammad; ., Yunofri; Tasriah, Etjih; Hulliyyatus Suadaa, Lya; Pramana, Setia
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2025i1.575

Abstract

BPS Statistics Indonesia plays a strategic role in compiling balance sheet statistics as the foundation for national policy analysis. This role requires a deep understanding of the concepts, definitions, and compilation standards outlined in the System of National Accounts (SNA) manual. However, in practice, comprehending such complex technical documents is not always straightforward. To address this challenge, this study proposes the development of an intelligent conversational agent in the form of a chatbot that implements the Self-Multimodal RAG approach. This approach integrates self-reflection mechanisms to generate more accurate and relevant responses. The evaluation was conducted using the LLM-as-a-Judge framework across four metrics: answer correctness, answer relevancy, context relevancy, and context faithfulness. Experimental results demonstrate that the Self-Reflective RAG achieved a score of 80% on the answer correctness metric, with competitive performance in terms of relevancy and faithfulness. From the chatbot implementation perspective, black-box testing confirmed that all functionalities operated as expected, while system usability testing using the CSUQ instrument yielded a score of 74.704%, indicating that the chatbot is well-accepted by users.
Business Description Categorization to the Five-Digit Indonesian Standard Classification of Business Field (KBLI) Using Machine Learning and Transfer Learning Amnur, Muh. Alfian; Muhammad Gazali, La Ode; Mumtaz Siregar, Amir; Ariya Jalaksana, Faruq; Nisa Rahayu Ananda Suwendra, Made; Fadila Utami, Nurul; Median Ramadhan, Alif; Krisela Fabrianne, Elisse; Wirata Raja Panjaitan, Eurorea; Aini Izzati, Fitri; Bintang Yuliani Manalu, Jernita; Gilang Hidayat, Muhammad; Hulliyyatus Suadaa, Lya; Yuniarto, Budi; Pramana, Setia
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2025i1.719

Abstract

The Indonesian Standard Classification of Business Fields (KBLI) is essential for economic statistics, yet manual classification of business descriptions to five-digit KBLI codes is time-consuming and prone to inconsistencies. This study aims to develop and compare machine learning (Support Vector Machine and Random Forest) and transfer learning (IndoBERT) models for automating KBLI classification, supported by the preparation of synthetic and real-world datasets for model training. The synthetic data were generated using large language models, validated through human majority voting, and complemented with real-world data from the National Labor Force Survey (Sakernas) and the Micro and Small Industry Survey (IMK). The findings indicate that the fine-tuned IndoBERT performed best, with an F1-score of 92.99% and an accuracy of 93.40% on synthetic data, alongside top-1, top-5, and top-10 accuracies of 32.93%, 54.71%, and 63.24% on real-world data. The deployment of the fine-tuned IndoBERT as a RESTful API demonstrates its scalability and efficiency, presenting a reliable solution for large-scale KBLI classification in official statistics.
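The top-1, top-5, and top-10 accuracies quoted in this abstract count a prediction as correct when the true code appears among the k highest-scored candidates. A minimal sketch of that calculation follows. It assumes the classifier returns a score per candidate code as a dictionary; the codes and scores are invented placeholders, not real KBLI codes or model outputs.

```python
def top_k_accuracy(true_labels, score_rows, k):
    """Share of samples whose true label is among the k highest-scored labels."""
    hits = 0
    for label, scores in zip(true_labels, score_rows):
        # Rank candidate labels by descending score and keep the first k.
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        hits += label in ranked
    return hits / len(true_labels)


# Hypothetical scores for three business descriptions over three placeholder codes.
score_rows = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.2, "B": 0.5, "C": 0.3},
    {"A": 0.5, "B": 0.4, "C": 0.1},
]
true_labels = ["A", "C", "C"]
for k in (1, 2, 3):
    print("top-%d accuracy:" % k, round(top_k_accuracy(true_labels, score_rows, k), 4))
```

Top-k accuracy can only rise as k grows, which matches the increasing sequence of real-world figures in the abstract.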