Garuda - Garba Rujukan Digital

Sistemasi: Jurnal Sistem Informasi

Vol 15, No 5 (2026): Sistemasi: Jurnal Sistem Informasi

Farouq Mulya Al Simabua (Universitas Pembangunan Jaya)
Lathifah Alfat (Universitas Pembangunan Jaya)

Publish Date
26 May 2026

Automated text moderation systems on public service platforms are often exploited by manipulative spam messages from brokers offering illegal financial services. Previous text classification studies have frequently prioritized high accuracy metrics while overlooking the impact of data leakage caused by repetitive spam templates, a methodological flaw that can lead to severe model overfitting. This study aims to design and optimize a Natural Language Processing (NLP) model using the IndoBERT-Lite architecture to distinguish between organic user complaints and manipulative broker-generated comments. The proposed methodology focuses on extreme data deduplication, reducing 55,156 raw records into a balanced dataset containing 4,626 unique samples (57.1% organic and 42.9% spam). The training process was optimized using Gradient Accumulation and Early Stopping to ensure genuine model generalization capability. The evaluation results demonstrate that the optimized model successfully mitigated the initial overfitting problem, achieving both accuracy and F1-score values of 98% on unseen test data. These findings provide a reliable and data leakage–free automated moderation solution for internal digital customer service systems.

Citation Download

EndNote, Reference Manager, ProCite

Latex, Jabref

Check in Google Scholar

Journal Info

Sistemasi: Jurnal Sistem Informasi

Website

Abbrev

stmsi

Publisher

Universitas Islam Indragiri

Subject

Computer Science & IT Electrical & Electronics Engineering

Description

Sistemasi adalah nama terbitan jurnal ilmiah dalam bidang ilmu sains komputer program studi Sistem Informasi Universitas Islam Indragiri, Tembilahan Riau. Jurnal Sistemasi Terbit 3x setahun yaitu bulan Januari, Mei dan September,Focus dan Scope Umum dari Sistemasi yaitu Bidang Sistem Informasi, ...

Article Info

Abstract

Optimization of IndoBERT-Lite Fine-Tuning for Spam Detection in Digital Customer Services

Article Info

Abstract