International Journal Software Engineering and Computer Science (IJSECS)
Vol. 6 No. 1 (2026): APRIL 2026

ETL Pipeline with DTO Normalization for IPOS Data Integration in Spring Boot

Nugroho, Adhi Septian
Susetyo, Yeremia Alfa



Article Info

Publish Date
10 Apr 2026

Abstract

IPOS point-of-sale software, widely used by Indonesian small and medium retail enterprises (UMKM), exports transaction data as Excel files with no enforced schema. The exports contain format-variable, multi-row receipt blocks with heterogeneous date representations, locale-dependent numeric formats, and embedded unit strings, all of which resist conventional relational import. Transforming these unstructured exports into a relational database requires a structured architectural approach capable of handling format variability, type inconsistency, and record duplication. This study designs and implements a Spring Boot-based ETL (Extract, Transform, Load) service that applies the Data Transfer Object (DTO) pattern through ten purpose-specific DTO classes covering each pipeline phase, structured within a four-layer Model-View-Controller (MVC) architecture (Controller-Service-Repository-Entity). The Extractor employs a streaming Excel reader with dynamic column-layout detection based on header keywords, producing raw String-typed ExtractedReceipt and ExtractedItem DTOs. The Transformer applies six normalization steps via four utility classes: StringNormalizer, DateParser (seven date-format patterns), NumberParser (Indonesian and Western currency formats), and a HashSet-based duplicate detector. These steps convert raw strings into typed ValidatedReceipt and ValidatedItem DTOs with explicit error logging. The Loader inserts records in batches of 1,000 and uses pre-loaded duplicate sets for O(1) lookup. The pipeline operates asynchronously, returning a jobId immediately while processing continues on a background thread. Functional evaluation across ten scenarios yielded a 100% pass rate, covering valid files, invalid file types, date-format heterogeneity, embedded-unit quantity strings, Indonesian numeric formats, cross-file and intra-file duplicate detection, grand-total reconciliation tolerance, and product-variation tracking. Performance observations show that files of 200–500 receipts complete within 5–15 seconds. These results indicate that a DTO-centric, explicitly mapped ETL pipeline over Spring Boot MVC provides a maintainable, auditable, and production-ready solution for UMKM retail data integration.
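
The normalization steps named in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation. The class name NormalizationSketch, the method names, the three date patterns (the abstract mentions seven but does not list them), the separator heuristic in parseAmount, and the use of a receipt number as the duplicate key are all assumptions made for illustration only.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the kind of normalization the abstract describes.
final class NormalizationSketch {

    // Assumed patterns; STRICT resolution rejects impossible dates such as 31/02/2025.
    private static final List<DateTimeFormatter> DATE_FORMATS = List.of(
            strict("dd/MM/uuuu"), strict("dd-MM-uuuu"), strict("uuuu-MM-dd"));

    private static final Pattern LEADING_NUMBER = Pattern.compile("^\\s*(-?[0-9][0-9.,]*)");

    private static DateTimeFormatter strict(String pattern) {
        return DateTimeFormatter.ofPattern(pattern).withResolverStyle(ResolverStyle.STRICT);
    }

    // Tries each pattern in order; an empty result lets the caller log the bad value.
    static Optional<LocalDate> parseDate(String raw) {
        if (raw == null) return Optional.empty();
        String text = raw.trim();
        for (DateTimeFormatter format : DATE_FORMATS) {
            try {
                return Optional.of(LocalDate.parse(text, format));
            } catch (DateTimeParseException ignored) {
                // fall through to the next candidate pattern
            }
        }
        return Optional.empty();
    }

    // Accepts "1.250.000,50" (Indonesian) and "1,250,000.50" (Western).
    // Throws NumberFormatException when no digits remain after stripping.
    static BigDecimal parseAmount(String raw) {
        String s = raw.replaceAll("[^0-9.,-]", "");
        int lastDot = s.lastIndexOf('.');
        int lastComma = s.lastIndexOf(',');
        char decimal = '\0'; // '\0' means "no decimal separator present"
        if (lastDot >= 0 && lastComma >= 0) {
            // Both kinds present: the one appearing last is the decimal separator.
            decimal = lastDot > lastComma ? '.' : ',';
        } else {
            int pos = Math.max(lastDot, lastComma);
            // Assumption: a lone separator that repeats, or that is followed by
            // exactly three digits, is a thousands separator ("15.000" is 15000).
            boolean grouping = pos >= 0
                    && (s.length() - pos - 1 == 3 || s.indexOf(s.charAt(pos)) != pos);
            if (pos >= 0 && !grouping) decimal = s.charAt(pos);
        }
        StringBuilder normalized = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (Character.isDigit(c) || c == '-') normalized.append(c);
            else if (c == decimal) normalized.append('.');
        }
        return new BigDecimal(normalized.toString());
    }

    // Extracts the leading number from an embedded-unit string such as "2 PCS".
    static BigDecimal parseQuantity(String raw) {
        Matcher m = LEADING_NUMBER.matcher(raw);
        if (!m.find()) throw new NumberFormatException("no leading number in: " + raw);
        return parseAmount(m.group(1));
    }

    // Set.add returns false when the key is already present, giving O(1) average lookup.
    static boolean isFirstOccurrence(Set<String> seenKeys, String receiptNumber) {
        return seenKeys.add(receiptNumber.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();
        System.out.println(parseDate("31/12/2025"));
        System.out.println(parseAmount("Rp 1.250.000,50"));
        System.out.println(parseQuantity("1,5 KG"));
        System.out.println(isFirstOccurrence(seen, "INV-001"));
        System.out.println(isFirstOccurrence(seen, " inv-001 "));
    }
}
```

Returning an Optional from parseDate, and throwing from the numeric parsers, keeps failures visible to the caller, which fits the explicit error logging the abstract mentions. A production parser would need a documented rule for ambiguous inputs such as "1.500", which this sketch reads as one thousand five hundred.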

Copyrights © 2026

Journal Info

Abbrev

ijsecs

Publisher

Subject

Computer Science & IT

Description

IJSECS is committed to bridging the theory and practice of information technology and computer science. From innovative ideas to specific algorithms and full system implementations, IJSECS publishes original, peer-reviewed, high-quality articles in the areas of information technology and computer ...