Found 2 Documents

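Penerapan Data Pipeline untuk Meningkatkan Efisiensi Penghitungan Indeks Perkembangan Harga (IPH) di Indonesia [Implementing a Data Pipeline to Improve the Efficiency of Calculating the Price Change Index (IPH) in Indonesia]
Sandyawan, Ignatius; Rimawati, Yeni; Suarjaya, I Made Oka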
Seminar Nasional Official Statistics Vol 2024 No 1 (2024): Seminar Nasional Official Statistics 2024
Publisher : Politeknik Statistika STIS

DOI: 10.34123/semnasoffstat.v2024i1.2045

Abstract

The Price Change Index (PCI) is an indicator for monitoring market price fluctuations. In 2022, Statistics Indonesia (BPS) worked with the Ministry of Home Affairs (Kemendagri) as data steward and the Ministry of Trade (Kemendag) as data collector to calculate the PCI for 20 primary commodities across all cities in Indonesia. Weekly price data were collected by city trade departments and transferred to BPS for processing. Until late 2023, this process was done in Microsoft Excel, could take up to three days, and was prone to errors. This research implements a data pipeline for PCI calculation that automates tasks such as data cleaning, index calculation, and visualization. The pipeline reduced calculation time to just 16 minutes while producing PCI values consistent with those obtained manually. The implementation has significantly improved time efficiency, minimized errors, and optimized resource use.
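The cleaning and index-calculation steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state the PCI formula or the cleaning rules, so everything here is an assumption: the function names `clean_prices` and `percent_change` are hypothetical, the cleaning rule (drop missing or non-positive prices) is a placeholder, and the index is approximated as the percent change of the mean price against a baseline period. The sample rows are made up for illustration.

```python
from statistics import mean


def clean_prices(rows):
    """Keep only rows whose price parses as a positive number.

    Hypothetical cleaning rule; the paper's actual rules are not given.
    """
    cleaned = []
    for city, commodity, price in rows:
        try:
            value = float(price)
        except (TypeError, ValueError):
            continue  # skip missing or unparseable prices
        if value > 0:
            cleaned.append((city, commodity, value))
    return cleaned


def percent_change(current_rows, baseline_rows):
    """Percent change of mean price per (city, commodity) against a baseline.

    Assumed stand-in for the index calculation, not the official formula.
    """
    def averages(rows):
        grouped = {}
        for city, commodity, price in rows:
            grouped.setdefault((city, commodity), []).append(price)
        return {key: mean(values) for key, values in grouped.items()}

    current = averages(clean_prices(current_rows))
    baseline = averages(clean_prices(baseline_rows))
    # Only keys present in both periods can be compared.
    return {
        key: (current[key] - baseline[key]) / baseline[key] * 100
        for key in current.keys() & baseline.keys()
    }


# Made-up sample data: (city, commodity, raw price as entered)
baseline = [("Denpasar", "rice", "12000"), ("Denpasar", "rice", "12400"),
            ("Denpasar", "rice", None)]
current = [("Denpasar", "rice", "12810"), ("Denpasar", "rice", "n/a")]
result = percent_change(current, baseline)
```

Keeping cleaning and calculation as separate functions mirrors the staged structure the abstract describes and lets each stage be checked against the manually computed values independently.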
A Hybrid Method for Standardising Civil Registration and Vital Statistics (CRVS) Location Data
Sandyawan, Ignatius; Rimawati, Yeni; Rismansyah, Ari
Proceedings of The International Conference on Data Science and Official Statistics Vol. 2025 No. 1 (2025): Proceedings of 2025 International Conference on Data Science and Official St
Publisher : Politeknik Statistika STIS

DOI: 10.34123/icdsos.v2025i1.618

Abstract

Civil Registration and Vital Statistics (CRVS) systems in archipelagic contexts like Indonesia face persistent challenges in location data standardisation due to free-text entries that vary in spelling, formatting, and granularity. This study introduces a multi-stage hybrid framework that systematically converts these unstructured entries into official administrative codes using deterministic matching, fuzzy probabilistic matching, and geocoding. This study processed 841,126 birth and death records using Python (Pandas, RapidFuzz, Geopy). Cumulatively, all stages achieved a combined match rate of 85.44% for births and 67.12% for deaths. The layered pipeline ensured speed, precision, and coverage for real-world CRVS data. The findings demonstrate enhanced geographic precision in vital statistics, enabling more reliable public health and demographic applications. Future improvements may include transformer-based embeddings, active learning for ambiguous records, and uncertainty-aware geocoding techniques. This framework establishes a scalable, robust pathway for elevating the granularity and reliability of geolocated vital event data.
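The layered matching idea in the abstract (try an exact match first, then fall back to fuzzy matching, and pass what remains to geocoding) can be sketched roughly as follows. This is not the authors' implementation: the study used RapidFuzz and Geopy, whereas this sketch swaps in the standard-library `difflib` so it runs without dependencies, and it leaves the geocoding stage as an "unmatched" result. The `REFERENCE` table, its codes, the `normalise` rule, and the 0.8 cutoff are all hypothetical.

```python
import difflib

# Hypothetical reference table: normalised name -> administrative code.
# The codes are placeholders, not official administrative codes.
REFERENCE = {
    "denpasar": "A01",
    "badung": "A02",
    "gianyar": "A03",
}


def normalise(text):
    """Lowercase, replace simple punctuation with spaces, collapse whitespace."""
    return " ".join(text.lower().replace(".", " ").replace(",", " ").split())


def match_location(raw, cutoff=0.8):
    """Return (code, stage) for a free-text location entry.

    Stage 1: deterministic lookup on the normalised string.
    Stage 2: fuzzy lookup with difflib (stand-in for RapidFuzz).
    Anything left is returned as unmatched; in the described framework
    such records would go on to a geocoding stage.
    """
    name = normalise(raw)
    if name in REFERENCE:
        return REFERENCE[name], "deterministic"
    close = difflib.get_close_matches(name, list(REFERENCE), n=1, cutoff=cutoff)
    if close:
        return REFERENCE[close[0]], "fuzzy"
    return None, "unmatched"


exact = match_location("  DENPASAR ")
typo = match_location("Gianjar")
unknown = match_location("Unknown Place")
```

Running the cheap deterministic stage first means the slower fuzzy comparison only sees entries that could not be resolved exactly, which is the usual motivation for ordering the stages this way.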