Contact Name
Lutfi Rahmatuti Maghfiroh
Contact Email
lutfirm@stis.ac.id
Phone
+6281381703898
Journal Mail Official
icdsos@stis.ac.id
Editorial Address
Jalan Otto Iskandardinata 64 C Jakarta
Location
Kota Adm. Jakarta Timur,
DKI Jakarta
INDONESIA
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND OFFICIAL STATISTICS
ISSN : 28099842     EISSN : -     DOI : -
Core Subject : Science,
The International Conference on Data Science and Official Statistics (ICDSOS) 2023 is organized by Politeknik Statistika STIS and Statistics Indonesia (BPS), in collaboration with Forum Pendidikan Tinggi Statistika (FORSTAT), Ikatan Statistisi Indonesia (ISI), the United Nations Economic and Social Commission for Asia and the Pacific (UNESCAP), and the United Nations Statistics Division (UNSD). The ICDSOS will bring together statisticians and data scientists from academia, official statistics, the health sector, and business, both junior and senior professionals, in an inviting hybrid environment on November 24th - 25th, 2023. The theme of this conference is Harnessing Innovation in Data Science and Official Statistics to Address Global Challenges towards the Sustainable Development Goals.
DATA SCIENCE: Machine Learning and Deep Learning; Data Science and Artificial Intelligence (AI); Data Mining and Big Data; Statistical Software; Information System Development for Official Statistics; Remote Sensing to Strengthen Official Statistics; Other relevant data science topics
APPLIED STATISTICS: Applied Multivariate Analysis; Applied Time Series Analysis; Applied Spatial Statistics; Applied Bayesian Statistics; Microeconomics Modelling and Applications; Macroeconomics Modelling and Applications; Econometrics Modelling and Applications; Quantitative Public Policy and Statistical Analysis; Applied Statistics on Demography; Applied Statistics on Population Studies; Applied Statistics on Biostatistics and Public Health; Other relevant applied statistics topics
OFFICIAL STATISTICS: Official Statistics Survey Methodology Developments; Data Collection Improvements; Sustainable Development Goals (SDGs) Indicators Estimation; Small Area Estimation (SAE); Non Response and Imputation Methods; Sampling Error and Non Sampling Error Evaluation; Benchmarking Regional Official Statistics; Other relevant official statistics topics
Arjuna Subject : General - General
Articles 151 Documents
Trajectory of life expectancy and its relation with socio-economic indicators among developing countries in Southeast Asian Madona Yunita Wijaya; Yanne Irene; Iqbal Rachadi
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.307

Abstract

Life expectancy is one of the key global health indicators and plays an important role in health policy measures. The status of a country indirectly influences the life expectancy of a nation. Developing countries have slower economic progress than developed countries, which in turn affects the well-being of the population. Therefore, this study aims to analyze the trend of life expectancy among developing countries in Southeast Asia and assess the influence of socio-economic indicators on life expectancy. A linear mixed-effects model is used to model the association between socioeconomic factors and life expectancy. The results indicate that GDP growth rate, GDP per capita, and unemployment rate have a significant impact on life expectancy and that the impacts depend on gender. Life expectancy among females is generally higher than among males. Predicted male life expectancy in 2025 is lowest in Myanmar, with an average of 64.2 years (95% CI: 60.8-77.1), and highest in Thailand, with an average of 76.2 years (95% CI: 60.7-76.9). Meanwhile, predicted female life expectancy is lowest in Timor Leste, with an average of 71.1 years (95% CI: 67.8-83.9), and highest in Thailand, with an average of 84.3 years (95% CI: 68.7-84.9).
Opportunities and Challenges of Remote Sensing, Geospatial Data, and Machine Learning in Obtaining Accessibility and Location Information for Sustainable Development in Indonesia Terry Devara
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.309

Abstract

As technology advances, so do data collection methods, creating a large, rapid, and diverse stream of data. Statistics Indonesia (BPS) has also been encouraged to take advantage of this and has started to collect geospatial information on respondents and public facilities. To keep up, processing methods need to change, for example by adopting machine learning, to accommodate massive, high-dimensional, and multiform data. This progression also opens up a new opportunity for tackling various statistical data problems such as accessibility and location data. Remote sensing is one of the big data sources that has changed a great deal, as shown by the availability of satellite imagery with high spatial and temporal resolution; together with the BPS geotagging data, it shows great promise for land use classification and geospatial analysis. Even so, there are still some challenges in utilizing remote sensing and other geospatial data. The goals of this review paper are to study the opportunities and challenges in utilizing remote sensing, geospatial data, and machine learning for accessibility and location information. In this paper, we explore the possibilities and limitations of their implementation for SDG indicators that involve accessibility and location, such as indicators 9.1.1, 11.1.1, 11.2.1, 11.3.1, and 11.7.1, including other variables needed for the calculation, such as access to public facilities. Moreover, our experiment using geotagging data shows potential for improving proportion estimation compared with a simple ratio. Our DEGURBA classification, which follows the UN definition and uses machine learning LULC for dasymetric mapping, also provides more insight than the existing data. We conclude that there are great opportunities in applying remote sensing and other geospatial data to monitor accessibility and location to further sustainable development in Indonesia.
Comparison of Kernel Smoothing and Local Polynomial Smoothing Method in Overcoming Age Heaping Nadia Arsyta Putri; Erni Tri Astuti; Lalu Moh Arsal Fadila; Salsabil Syadza Hafizhah
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.312

Abstract

Age data play an important role in every aspect, yet age misreporting is still found. It involves digit preference, which causes a build-up at certain ages. In demography, digit preference is called age heaping and often happens at ages ending in 0 and 5. Age heaping induces poor data quality and data bias that could influence government policy making. Two indicators used to detect age heaping are the Whipple Index (WI) and the Myers Blended Index (MBI). Methods to cope with age heaping include the nonparametric regression approaches Kernel Smoothing and Local Polynomial Smoothing. The objective of this research is to measure and improve the quality of population age data and population mortality data in Sensus Penduduk (SP) 2020, as well as to compare the Kernel Smoothing and Local Polynomial Smoothing methods. The data used in this paper are from SP2020, and the research variables are age of population, age at death, and total population. The results show that the age data for deaths are less accurate than those for the total population and thus need a smoothing process to improve their accuracy. The method with better accuracy is Local Polynomial Smoothing.
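The Whipple Index named in this abstract can be illustrated with a short sketch. It uses the conventional definition (population reported at ages 25, 30, ..., 60, relative to one fifth of the population aged 23 to 62, times 100). The abstract does not say which variant the authors applied, so this choice is an assumption, and the function name and sample counts are hypothetical.

```python
def whipple_index(pop_by_age):
    """Conventional Whipple Index for single-year age counts.

    pop_by_age: sequence where pop_by_age[a] is the count at age a
    (must cover ages 23 through 62). Assumed definition, not
    necessarily the variant used in the paper.
    """
    # Ages ending in 0 or 5 within the 23-62 window: 25, 30, ..., 60
    heaped = sum(pop_by_age[a] for a in range(25, 61, 5))
    # Everyone aged 23 to 62 inclusive
    total = sum(pop_by_age[a] for a in range(23, 63))
    return 100.0 * heaped / (total / 5.0)


# Hypothetical data: no heaping at all (same count at every age)
uniform = [1000] * 100
# Hypothetical data: every age reported as a multiple of 5
fully_heaped = [1000 if a % 5 == 0 else 0 for a in range(100)]

print(whipple_index(uniform))       # 100.0
print(whipple_index(fully_heaped))  # 500.0
```

Under this definition the index ranges from 100 (no preference for ages ending in 0 or 5) to 500 (all ages reported with those digits).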
High-resolution-gridded rainfall dataset derived from surface observation by adjustment of satellite rainfall product Achmad Rifani; Muhammad Rezza Ferdiansyah
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.314

Abstract

A high-resolution gridded rainfall dataset is essential for many purposes, such as analysis of extreme weather conditions, natural-disaster mitigation, or use as an input to a hydrological model. Satellite-based rainfall products (e.g., Global Satellite Mapping of Precipitation, GSMaP) can solve the spatial and temporal issues, although their rainfall intensity is often under- or overestimated. This research aims to provide a high-resolution rainfall dataset by adjusting the 0.1 deg GSMaP rainfall data to the surface rainfall data from several observation points in the greater Jakarta area (Jabodetabek) during January 2020, when several floods occurred in Jakarta. The adjustment process includes calculating the bias of the satellite estimate at the nearest observation point and interpolating the error back to the 0.01 deg grid using a radial basis function (RBF) to obtain the correction factor at every grid point; the GSMaP data are then adjusted by the correction factor. The result reveals a more realistic rainfall spatial distribution than regular interpolation of the observation data. The validation of the adjusted rainfall estimates at the verification points also shows a reduction in domain-wide RMSE of 30 to 80%.
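The adjustment steps described in this abstract (station bias, RBF interpolation of the error, correction of the gridded field) might look roughly like the sketch below. Several choices are assumptions that the abstract does not confirm: the bias is treated as additive (observed minus satellite), the kernel is Gaussian with a hypothetical shape parameter `eps`, the satellite field is assumed to be already resampled to the fine grid, and negative results are clipped to zero. All function names and numbers are illustrative.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gaussian(r, eps):
    """Gaussian radial basis function (an assumed kernel choice)."""
    return math.exp(-(eps * r) ** 2)


def fit_rbf(points, values, eps=1.0):
    """Return a function that interpolates `values` given at `points`."""
    kernel = [[gaussian(math.dist(p, q), eps) for q in points] for p in points]
    weights = solve(kernel, values)

    def predict(x):
        return sum(w * gaussian(math.dist(x, p), eps)
                   for w, p in zip(weights, points))
    return predict


def adjust(satellite_grid, stations, eps=1.0):
    """Correct gridded satellite rainfall with station observations.

    satellite_grid: dict mapping (lon, lat) -> satellite rainfall
    stations: list of ((lon, lat), observed, satellite_at_station)
    """
    points = [xy for xy, _, _ in stations]
    bias = [obs - sat for _, obs, sat in stations]  # additive bias (assumed)
    correction = fit_rbf(points, bias, eps)
    # Clip at zero because rainfall cannot be negative (assumed safeguard)
    return {xy: max(0.0, value + correction(xy))
            for xy, value in satellite_grid.items()}


# Hypothetical stations and grid cells, coordinates in degrees
stations = [((0.0, 0.0), 10.0, 6.0), ((1.0, 0.0), 4.0, 5.0), ((0.0, 1.0), 8.0, 8.0)]
grid = {(0.0, 0.0): 6.0, (1.0, 0.0): 5.0, (0.0, 1.0): 8.0, (0.5, 0.5): 6.5}
adjusted = adjust(grid, stations)
print(adjusted)
```

Because RBF interpolation is exact at its data points, the adjusted field reproduces the observed rainfall at each station, while cells between stations receive a smoothly varying correction.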
Can Paddy Growing Phase Produce an Accurate Forecast of Paddy Harvested Area in Indonesia? Analysis of the Area Sampling Frame Results Kadir Ruslan; Octavia Rizky Prasetyo
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.316

Abstract

Our study aims to evaluate the accuracy of forecasts based on the paddy growing phase obtained from the Area Sampling Frame (ASF) Survey and, as a comparison, proposes an alternative forecast method that takes into account the seasonal pattern and hierarchical structure of the national paddy harvested area estimates obtained from the ASF to improve accuracy. To do so, we calculated the MAPE by comparing the realized paddy harvested area from January to September 2022 with the forecasts produced from the areas in the generative, late vegetative, and early vegetative phases. We also implemented a hierarchical forecasting method on monthly harvested area data from January 2018 to August 2022 for all provinces. Specifically, we applied the bottom-up method for reconciliation and the rolling window method to produce three-consecutive-month forecasts for the period January to September 2022. We found that the prediction based on the paddy growing phase is moderately accurate. The combination of the bottom-up reconciliation method and the SARIMA model produces much better accuracy for the national paddy harvested area, as shown by a lower MAPE. Our findings suggest that the hierarchical forecasting method could be an alternative, for predicting harvested area from the ASF results, to the prediction obtained from standing crops.
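Two of the building blocks named in this abstract, the MAPE and bottom-up reconciliation, are simple enough to sketch. The example below assumes the provincial forecasts already exist (the SARIMA fitting and rolling window are not reproduced), and the province names and figures are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def bottom_up(province_forecasts):
    """Bottom-up reconciliation: the national forecast for each month
    is the sum of the provincial forecasts for that month."""
    return [sum(month) for month in zip(*province_forecasts.values())]


# Hypothetical three-month forecasts of harvested area per province
province_forecasts = {
    "Province A": [100.0, 120.0, 90.0],
    "Province B": [50.0, 60.0, 40.0],
}
national_forecast = bottom_up(province_forecasts)  # [150.0, 180.0, 130.0]
national_actual = [160.0, 175.0, 130.0]            # hypothetical realization

print(national_forecast)
print(round(mape(national_actual, national_forecast), 2))  # 3.04
```

Summing the lowest-level forecasts guarantees that the national figure is consistent with the provincial ones, which is the property reconciliation is meant to enforce.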
Does Farm Size Matter for Food Security Among Agricultural Households? Analysis of Indonesia’s Agricultural Integrated Survey Results Kadir Ruslan; Octavia Rizky Prasetyo
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.318

Abstract

Most agricultural households in Indonesia are small-scale farmers, making them prone to food insecurity. Until recently, no study had assessed the impact of farm size and sociodemographic characteristics on the food insecurity status of agricultural households using a nationwide agricultural household survey in Indonesia. Our study aims to address this gap by utilizing the results of the first Indonesian Agricultural Integrated Survey conducted by BPS in 2021. Applying the Rasch Model, Multinomial Logistic Regression, and Ordinary Least Squares Regression, we found that larger farm size lowers the likelihood of experiencing moderate or severe levels of food insecurity among agricultural households. Our study also found that agricultural households with a higher probability of being food insecure are characterized by having more household members, relying only on agricultural activities for their livelihood, having household heads with lower educational attainment, and being led by female farmers.
Automated Indonesian Text Augmentation with Web-Based Application Using Flask Framework Iftitah Athiyyah Rahma; Lya Hulliyyatus Suadaa
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.324

Abstract

In the real world, the data and resources available for text classification are limited. One issue with labelled data is class imbalance. Imbalanced data affect the performance and accuracy of a model because the model focuses only on data with the majority label. Therefore, the measured model accuracy cannot describe the true quality of the model. To overcome this, an oversampling approach is carried out. Text-based oversampling is known as text augmentation. However, NLP resources for Indonesian, especially for text augmentation, are still limited. Therefore, this research develops a web application to augment Indonesian text automatically. The application was built using the prototype method. The application was successfully built and allows users to perform augmentation automatically for all texts in a dataset. Users can select their preferred augmentation technique and are required to upload a dataset as input. The output of the application is the same dataset file as the input, with an additional column containing synthetic text augmented by the application. This application can contribute to further research on text augmentation for Indonesian.
Harnessing Blockchain in BPS Microdata Dissemination Florencia Satwika Genah; Dea Venditama
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.325

Abstract

To achieve the missions of BPS-Statistics Indonesia, the dissemination of statistical products must be carried out well. One of the statistical products of BPS-Statistics Indonesia is microdata, whose dissemination should follow feasible best practices. In a fast-paced, ever-changing world that relies heavily on the evolution of technology, efforts to disseminate microdata must also keep pace with technology to meet the current needs of the digital society; otherwise they will become obsolete over time. One of the important issues is the limitation of the existing system in tracking microdata to ensure its authenticity and integrity in cases where users have purchased the microdata from BPS-Statistics Indonesia. To address this traceability issue, a solution based on Blockchain technology is considered. A design is proposed to incorporate Blockchain into the existing mechanism of BPS-Statistics Indonesia microdata dissemination. To this end, a system architecture and a schema for smart contract utilization are proposed to reinforce microdata tracking.
Development of Paddy Yield Gap Between Java and Outside Java: Does It Have a Contribution to Paddy Yield Improvement from 2018 to 2021? Kadir Ruslan; Octavia Rizky Prasetyo
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.330

Abstract

Increasing the paddy yield is crucial for Indonesia to maintain its national rice sufficiency amid the consistent depletion of wetland paddy areas. In this regard, the yield disparities between regions are challenging, particularly between Java and outside Java. Our study aims to examine the development of the paddy yield gap between the two regions from 2018 to 2021 and its contribution to paddy yield improvement during the period. Using the results of the National Crop-cutting Survey, we found that while the paddy yield in Java outperformed the paddy yield outside Java, the yield difference between the two regions narrowed from around 26 per cent in 2018 to 22 per cent in 2021 due to the increase of the yield outside Java. The results of the Blinder-Oaxaca decomposition suggested that the narrowing gap has a significant contribution to the national paddy yield increase from 2018 to 2021. Our finding confirms that narrowing the yield gap between the two regions by increasing the yield outside Java is crucial to improving paddy yield in Indonesia. Our study also pointed out that improvement in irrigation systems, fertilizer use, and fertilizer assistance are important factors in maintaining the paddy yield and narrowing the gap.
Comparison of Naive Bayes, K-Nearest Neighbor, and Support Vector Machine Classification Methods in Semi-Supervised Learning for Sentiment Analysis of Kereta Cepat Jakarta Bandung (KCJB) Muhammad Farhan; Renata De La Rosa Manik; Hana Raihanatul Jannah; Lya Hulliyyatus Suadaa
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2023 No. 1 (2023): Proceedings of 2023 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2023i1.332

Abstract

Transportation technology has developed very rapidly in the 21st century; one example is the high-speed train. Currently, the Indonesian government is building the Kereta Cepat Jakarta-Bandung (KCJB) project in collaboration with China. The construction of this high-speed train project has attracted various comments and opinions from the public on Twitter and other social media. This research aims to compare the Naïve Bayes, K-Nearest Neighbor (K-NN), and Support Vector Machine (SVM) classification methods in classifying sentiment in tweets about high-speed trains obtained by scraping Twitter. The comparison was carried out using semi-supervised learning, and the results showed that the semi-supervised SVM model had the best performance, with an average accuracy of 86%, followed by the semi-supervised Naïve Bayes and semi-supervised K-NN models, with average accuracies of 81% and 58% respectively. Overall, the predictions from the three models indicate that there are more tweets with negative sentiment than tweets with positive or neutral sentiment.
