Contact Name
Lutfi Rahmatuti Maghfiroh
Contact Email
lutfirm@stis.ac.id
Phone
+6281381703898
Journal Mail Official
icdsos@stis.ac.id
Editorial Address
Jalan Otto Iskandardinata 64 C Jakarta
Location
Kota Adm. Jakarta Timur,
DKI Jakarta
INDONESIA
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND OFFICIAL STATISTICS
ISSN : 28099842     EISSN : -     DOI : -
Core Subject : Science,
The International Conference on Data Science and Official Statistics (ICDSOS) 2023 is organized by Politeknik Statistika STIS and Statistics Indonesia (BPS), in collaboration with Forum Pendidikan Tinggi Statistika (FORSTAT), Ikatan Statistisi Indonesia (ISI), the United Nations Economic and Social Commission for Asia and the Pacific (UNESCAP), and the United Nations Statistics Division (UNSD). ICDSOS will bring together statisticians and data scientists from academia, official statistics, the health sector, and business, both junior and senior professionals, in an inviting hybrid environment on November 24th - 25th, 2023. The theme of the conference is "Harnessing Innovation in Data Science and Official Statistics to Address Global Challenges towards the Sustainable Development Goals".

DATA SCIENCE: Machine Learning and Deep Learning; Data Science and Artificial Intelligence (AI); Data Mining and Big Data; Statistical Software; Information System Development for Official Statistics; Remote Sensing to Strengthen Official Statistics; Other relevant data science topics.

APPLIED STATISTICS: Applied Multivariate Analysis; Applied Time Series Analysis; Applied Spatial Statistics; Applied Bayesian Statistics; Microeconomics Modelling and Applications; Macroeconomics Modelling and Applications; Econometrics Modelling and Applications; Quantitative Public Policy and Statistical Analysis; Applied Statistics on Demography; Applied Statistics on Population Studies; Applied Statistics on Biostatistics and Public Health; Other relevant applied statistics topics.

OFFICIAL STATISTICS: Official Statistics Survey Methodology Developments; Data Collection Improvements; Sustainable Development Goals (SDGs) Indicators Estimation; Small Area Estimation (SAE); Non Response and Imputation Methods; Sampling Error and Non Sampling Error Evaluation; Benchmarking Regional Official Statistics; Other relevant official statistics topics.
Arjuna Subject : Umum - Umum
Articles 151 Documents
Classification of Paddy Growth Phase with Machine Learning Algorithms to Handle Imbalanced Multi-Class Big Data Hady Suryono; Heri Kuswanto; Nur Iriawan
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.45

Abstract

The global Sustainable Development Goals (SDGs) adopted by countries around the world have significant implications for national development planning in Indonesia in the period 2015 to 2030. The agricultural sector is one of the most important sectors in the world and contributes substantially to achieving the goals. Accurate paddy production data must be available to measure the level of food security. This can be achieved by monitoring the growth phase of paddy and classifying its growth phase accurately and precisely. The paddy growth phase has 6 classes, and the number of members per class is usually unequal (imbalanced data). This study describes the results of classifying paddy growth phases with imbalanced data in Bojonegoro Regency, East Java, in 2019 using machine learning algorithms on the Google Earth Engine (GEE) platform. Classification is performed with Classification and Regression Tree (CART), Support Vector Machine (SVM), and Random Forest. An oversampling technique is used to deal with the problem of imbalanced data. The 2019 Area Sampling Frame survey conducted by BPS was used as the label source for training the classification models. The results showed that the Random Forest algorithm trained on the oversampled dataset achieved an overall accuracy (OA) of 82.30% and a kappa statistic of 0.76, outperforming the SVM and CART algorithms.
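The abstract above relies on three ideas that are easy to show in miniature: random oversampling of minority classes, overall accuracy, and the kappa statistic. The following is a minimal sketch in plain Python, not the authors' Google Earth Engine workflow. The function names and the 3-class confusion matrix are hypothetical and chosen only for illustration, and the oversampling shown is simple duplication with replacement, which may differ from the technique used in the paper.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples (with replacement) until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement is the overall accuracy (share of the diagonal).
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical, made-up data (not from the paper).
features = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
classes = ["a", "a", "a", "a", "b", "c"]
balanced_x, balanced_y = random_oversample(features, classes)
class_counts = Counter(balanced_y)

example_confusion = [[50, 5, 5], [4, 30, 6], [2, 3, 15]]
oa, kappa = overall_accuracy_and_kappa(example_confusion)
```

Kappa corrects the overall accuracy for agreement expected by chance, which is why it is usually reported alongside OA when classes are imbalanced.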
Knowledge-based Utilization in Organizational IT Support. A Case Study at BPS-Statistics Indonesia Herlambang Permadi; Dana Indra Sensuse
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.48

Abstract

Employees experience many IT problems in carrying out daily government activities. These problems often disrupt government activities in providing services to the community. This study analyzes the IT problems that are often found in organizations and their impacts. A total of 43 people participated in a survey to identify which problems are often experienced and what impact they have had. The survey started with 7 IT service groups and produced 37 IT problems. The result is an implementation of a knowledge-based system that can help employees solve IT problems on their own in their work environment.
Estimation of Education Indicators in East Java Using Multivariate Fay-Herriot Model Novia Permatasari; Azka Ubaidillah
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.51

Abstract

Education is an important aspect of improving human resources. Education indicator data at a low administrative level are needed as a basis for education planning in that region. The sample size problem that arises when providing data at a low administrative level can be overcome by indirect estimation, namely Small Area Estimation (SAE). SAE is able to increase the effectiveness of the survey sample size by borrowing strength from neighbouring areas and information from auxiliary variables related to the variables of interest. We conduct a simulation study to compare the multivariate model with the univariate model, and we implement the multivariate model to estimate three education indicators obtained from the National Socio-Economic Survey by Statistics Indonesia. The simulation results are in line with previous studies, where the multivariate Fay-Herriot model with p variables has a smaller mean squared error (MSE) than the univariate model. The model implementation to estimate the Crude Participation Rate (APK), School Participation Rate (APS), and Pure Participation Rate (APM) also shows that the multivariate model produces a smaller relative root mean squared error (RRMSE) than the direct estimates. It can be concluded that the multivariate model is able to produce more efficient estimates than direct estimation and the univariate model.
Topic Modelling in Knowledge Management Documents BPS Statistics Indonesia Muhammad Yunus Hendrawan; Nucke Widowati Kusumo Projo
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.52

Abstract

Knowledge management is an important activity in improving the performance of an organization. BPS Statistics Indonesia has recently implemented such a system to improve the quality and efficiency of business processes. The purposes of this research are: 1) implementing topic modelling on the BPS Knowledge Management System to identify groups of document topics; 2) providing recommendations on the best topic modelling method; 3) building a topic modelling web service for BPS that includes a data preprocessing function and a topic group recommendation function. This study applies the Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) topic modelling methods to determine the best grouping technique for knowledge management systems in BPS Statistics Indonesia. The results show that the LDA model using Mallet is the best model, with 25 topic groups and a coherence score of 0.4803. The performance results suggest that LDA is the best modelling method. The LDA model is then successfully implemented in a RESTful web service that provides preprocessing and topic recommendation functions for documents entered into the BPS Knowledge Management System.
What We Know from Telemedicine Data in Indonesia? Study case using Alodokter, Dokter.id, and Honestdocs Faza Nur Fuadina; Nucke Widowati Kusumo Projo; Siti Mariyah
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.53

Abstract

Internet and technology development has reached various aspects of life in Indonesia, including the health sector through e-health. Telemedicine, as a form of e-health, was still rarely used among Indonesians because it is less established than e-commerce, which is more closely tied to the economic sector. The COVID-19 pandemic limited people's movement to get health care, but it also led people in Indonesia to use telemedicine. This research aims to analyze telemedicine utilization in Indonesia and examine the health phenomena captured in the data. It uses descriptive analysis and text mining, with the Named Entity Recognition (NER) and Latent Dirichlet Allocation (LDA) methods, to determine the utilization of telemedicine. In addition, a literature review is used to identify the potential use of telemedicine data in collecting health statistics in Indonesia. The results show that telemedicine has been widely used in Indonesia. The clinical teleconsultation data and article titles on telemedicine yield various health topics. Therefore, telemedicine data can potentially be used as a source for collecting health statistics.
A Simple Approach using Statistical-based Machine Learning to Predict the Weapon System Operational Readiness Arwin Datumaya Wahyudi Sumari; Dimas Shella Charlinawati; Yuri Ariyanto
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.58

Abstract

Weapon system operational readiness is a critical requirement for ensuring combat readiness and thereby guaranteeing the sustainability of state defense over time. Weapon systems are operated only by the military, and their readiness is programmed every year based on factors such as the allocated budget, the weapon system strength, and its circulation. Usually, weapon system readiness is programmed based on planners' experience, which is passed down over time. In this research, we propose a simple approach that uses a statistical-based machine learning method, linear regression, to help the planner predict weapon system operational readiness given its affecting factors, such as scheduled and unscheduled maintenance. We used a dataset of randomized primary data covering 5 years, from 2016 to 2020, to predict 2021. To assess the performance of the model, two measurements are used: Mean Absolute Percentage Error (MAPE) to measure its accuracy and goodness, and R-squared (R2) to measure how much the independent variable, the weapon system circulation, influences the dependent variable, the weapon system readiness. The measurement results show that the models, in general, achieve a MAPE of 1.99%, which is interpreted as a very accurate prediction, with an accuracy of 98.02%. In addition, the system achieves an R2 of 84.15%, which means that the independent variables together have a strong influence on the dependent variable. The higher the value of R2, the better the model. Our research concludes that linear regression is an appropriate machine learning model for predicting weapon system operational readiness.
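The two measures named in the abstract above, MAPE and R-squared, can be illustrated with a one-variable least-squares fit. This is a minimal sketch under stated assumptions: the function names are hypothetical, the numbers are made up, and the paper's own model may use more than one independent variable.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Actual values must be non-zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical, made-up series (not the paper's data).
circulation = [1.0, 2.0, 3.0, 4.0]
readiness = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(circulation, readiness)
fitted = [slope * c + intercept for c in circulation]
```

A lower MAPE and a higher R-squared both indicate a better fit, but they answer different questions: MAPE describes the typical size of the prediction error, while R-squared describes how much of the variation the model explains.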
Estimation of Air Pollutants using Time Series Model at Coalfield Site of India Arti Choudhary; Pradeep Kumar
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.59

Abstract

Assessing air pollutants and air quality is an intricate task because of their dynamic nature, unpredictability, and high variability in space and time. In this study, a time series moving average (MA) model is employed to estimate air pollutants (PM2.5, PM10, NO2, NOX, O3, SO2 and CO) over a coalfield site in India. The estimated O3, with Adj. R2 = 0.958, was identified as the most accurate estimation. The results for the estimated PM2.5 (Adj. R2 = 0.950) and NO2 (Adj. R2 = 0.949) were almost similar to those for O3. The estimated CO, with Adj. R2 = 0.887, had the lowest fit among all the estimated pollutants but was still estimated very well. The results of the study demonstrate that the MA model permits precise estimation of daily pollutant concentrations for different sites in India.
The Effect of Human Capital Inequality on Income Inequality: Evidence from Indonesia: An Application of Generalized Method of Moment Estimation Hafizh Meyzar Aqil; Dwi Wahyuniati
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.63

Abstract

Education inequality in Indonesia shows a downward trend, which indicates that education is becoming more equally distributed from year to year. This phenomenon should lead to a reduction in income inequality. However, income inequality in Indonesia has increased compared to 9 years ago. This study examines human capital inequality in the provinces of Indonesia and analyzes the effect of human capital inequality on income inequality. The Gini coefficient is used to measure both human capital inequality and income inequality. The annual panel data cover 34 provinces in Indonesia from 2015 to 2019. The analysis uses dynamic panel data regression with the Generalized Method of Moments (GMM) Arellano-Bond approach. The results indicate that income inequality with a lag of 1 year, literacy rate, and trade openness have a negative and significant effect on income inequality. Furthermore, human capital inequality and the average years of schooling have a positive and significant effect on income inequality. To reduce income inequality, policymakers are therefore advised to minimize human capital inequality, especially in the education sector, by paying attention to conditions in priority provinces.
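The abstract above measures inequality with the Gini coefficient. As a minimal sketch, the coefficient for a list of individual values can be computed from the sorted values as shown here. This is the generic sample formula and an assumption on our part; the paper's human capital Gini may be computed differently, for example from schooling attainment groups. The function name and inputs are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfect equality,
    and (n - 1) / n is the maximum for n values."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Made-up examples: everyone equal, and one person holding everything.
equal_case = gini([1, 1, 1, 1])
concentrated_case = gini([0, 0, 0, 10])
```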
Do Tourist Attraction Objects Implement Health Protocols? Analysis of Tourist Attraction Object in East Java Province Using Google Maps Review Disya Pratistaning Ratriatmaja; Nucke Widowati Kusumo Projo
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.64

Abstract

The COVID-19 pandemic has impacted the tourism sector, particularly the Tourist Attraction Object (TAO) in Indonesia. This research aims to: analyse the implementation of health protocols and facility conditions at TAO; analyse the change in visitor sentiment and rating towards TAO before and during the COVID-19 pandemic; analyse the strength of the relationship between ratings and visitor review sentiment on TAO; and analyse the possibility of using web scraping data to complement tourism data from BPS Statistics Indonesia. Using Google Maps reviews, this research applies the Multinomial Naïve Bayes (MNB), Term Frequency-Inverse Document Frequency (TF-IDF), pseudo-labelling, and word association methods. The results show that the health protocol has been implemented at TAO in East Java province, the available facilities are good, and there is no change in TAO reviews during the pandemic. The Stuart-Kendall Tau-c value shows a weak positive relationship between rating and review sentiment. Calculations based on the Haversine, Jaro-Winkler, and Levenshtein measures indicate that web scraping data can complement tourism data for BPS-Statistics Indonesia.
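Two of the measures named in the abstract above, Haversine distance and Levenshtein distance, are commonly used to decide whether a scraped place record and an official record refer to the same object: one compares coordinates, the other compares names. The sketch below shows both in plain Python. The function names, the Earth radius default, and the example inputs are illustrative assumptions, and the abstract does not state how the authors combined these measures.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute if different
            ))
        previous = current
    return previous[-1]


# Illustrative inputs only.
half_equator_km = haversine_km(0.0, 0.0, 0.0, 180.0)
name_distance = levenshtein("kitten", "sitting")
```

In a record-matching setting, a small coordinate distance together with a small name distance is a reasonable signal that two records describe the same attraction, although the thresholds would need to be chosen for the data at hand.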
Wages of Workers Spatial Analysis in Indonesia Region 2019 Maghfirah Maghfirah; Omas Bulan Samosir
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2021 No. 1 (2021): Proceedings of 2021 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2021i1.66

Abstract

Wage inequality among workers in Indonesia is one of the main problems that the government needs to address. The determination of regional minimum wages by local governments has not been able to solve the problem of inequality. On a larger scale, wage inequality can affect the stability of the national economy. Research on the spatial analysis of workers' wages is therefore important as a basis for appropriate government policy. In this study, we analyze the spatial dependence and relationship between a region and the wages of its workers and identify the factors that affect the wages of workers in a region. The results reveal spatial dependence among districts, along with spatial clusters and spatial outliers, detected through global and local spatial autocorrelation. Two spatial autoregressive models were applied: the spatial autoregressive lag model (SAL) and the spatial autoregressive error model (SEM). SAL confirmed 4 independent variables that are significant at the 10 percent level and have a positive relationship: education, age, internet, and sex ratio. SEM confirmed 5 independent variables that are significant at the 10 percent level and have a positive relationship: education, age, technology, internet, and sex ratio. As a policy implication, since regional wage inequality is still a major issue, better coordination and cooperation within and between regions is called for.
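Global spatial autocorrelation, mentioned in the abstract above, is often summarized with Moran's I. The abstract does not name the statistic the authors used, so treating it as Moran's I is an assumption here. The sketch below computes it for a small made-up example with a binary neighbour matrix. The function name, the four-region chain layout, and the values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I. weights[i][j] is the spatial weight between
    regions i and j (0 when they are not neighbours)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighbouring pairs.
    numerator = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    denominator = sum(d * d for d in deviations)
    return (n / weight_sum) * (numerator / denominator)


# Four hypothetical regions in a row; each is a neighbour of the next one.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_gradient = morans_i([1.0, 2.0, 3.0, 4.0], chain)    # similar neighbours
checkerboard = morans_i([1.0, -1.0, 1.0, -1.0], chain)     # dissimilar neighbours
```

Positive values indicate that neighbouring regions tend to have similar values (clustering), and negative values indicate that neighbours tend to differ.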
