Sistemasi: Jurnal Sistem Informasi
Vol 15, No 5 (2026): Sistemasi: Jurnal Sistem Informasi

Optimization of IndoBERT-Lite Fine-Tuning for Spam Detection in Digital Customer Services

Farouq Mulya Al Simabua (Universitas Pembangunan Jaya)
Lathifah Alfat (Universitas Pembangunan Jaya)



Article Info

Publish Date
26 May 2026

Abstract

Automated text moderation systems on public service platforms are often exploited by manipulative spam messages from brokers offering illegal financial services. Previous text classification studies have frequently prioritized high accuracy while overlooking data leakage caused by repetitive spam templates: when near-identical messages appear in both the training and test sets, reported scores are inflated and severe overfitting goes undetected. This study aims to design and optimize a Natural Language Processing (NLP) model based on the IndoBERT-Lite architecture to distinguish organic user complaints from manipulative broker-generated comments. The proposed methodology centers on aggressive data deduplication, reducing 55,156 raw records to a roughly balanced dataset of 4,626 unique samples (57.1% organic and 42.9% spam). Training was optimized with Gradient Accumulation and Early Stopping to promote genuine generalization. The evaluation results show that the optimized model mitigated the initial overfitting problem, achieving 98% accuracy and a 98% F1-score on unseen test data. These findings provide a reliable, leakage-free automated moderation solution for internal digital customer service systems.
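The deduplication step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the authors' actual normalization rules, so everything here is an assumption for illustration: the helper names (`normalize`, `deduplicate`, `split`, `leaked_keys`), the masking of URLs and digits, the 80/20 split, and the sample messages are all hypothetical. The idea is to map templated spam variants to one key, keep a single record per key, and only then split, so that no template can appear on both sides of the split.

```python
import random
import re
import unicodedata


def normalize(text: str) -> str:
    """Collapse superficial variation so templated messages share one key.

    These rules are assumed for illustration, not taken from the paper.
    """
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"https?://\S+", "<url>", text)  # mask links
    text = re.sub(r"\d+", "<num>", text)           # mask phone numbers, amounts
    text = re.sub(r"[^\w<>\s]", " ", text)         # drop punctuation
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(records):
    """Keep the first (text, label) pair seen for each normalized key."""
    seen = set()
    unique = []
    for text, label in records:
        key = normalize(text)
        if key and key not in seen:
            seen.add(key)
            unique.append((text, label))
    return unique


def split(records, test_ratio=0.2, seed=42):
    """Shuffle reproducibly, then cut into train and test portions."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def leaked_keys(train, test):
    """Return normalized keys that occur in both train and test."""
    train_keys = {normalize(text) for text, _ in train}
    return {normalize(text) for text, _ in test} & train_keys


# Hypothetical sample messages, invented for this example.
raw = [
    ("Pinjaman cepat cair, hubungi 081234567890!", "spam"),
    ("PINJAMAN CEPAT CAIR hubungi 089876543210", "spam"),
    ("Kartu saya terblokir sejak kemarin, mohon bantuannya.", "organic"),
    ("Kartu saya terblokir sejak kemarin, mohon bantuannya.", "organic"),
    ("Aplikasi tidak bisa dibuka setelah update.", "organic"),
]

unique = deduplicate(raw)
train, test = split(unique)
print(len(raw), "raw ->", len(unique), "unique;",
      "leaked templates:", len(leaked_keys(train, test)))
```

Deduplicating before the split is the design choice that matters here: splitting first and deduplicating each side separately would still allow the same template to land in both sets.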

Copyright © 2026






Journal Info

Abbrev

stmsi

Publisher

Subject

Computer Science & IT; Electrical & Electronics Engineering

Description

Sistemasi is a scientific journal in the field of computer science, published by the Information Systems study program of Universitas Islam Indragiri, Tembilahan, Riau. Sistemasi is published three times a year, in January, May, and September. The general focus and scope of Sistemasi is the field of Information Systems, ...