Garuda - Garba Rujukan Digital

INTENSIF: Jurnal Ilmiah Penelitian dan Penerapan Teknologi Sistem Informasi

Vol 10 No 1 (2026)

Rianto, Rianto
Humanika, Eko Setyo
Untoro, Iwan Hartadi Tri

Publish Date
31 Jan 2026

Background: The distinction between standard and non-standard Indonesian sentences is traditionally well-defined, yet the ubiquity of digital communication has increasingly blurred these boundaries. This convergence introduces significant lexical ambiguity in formal contexts, which degrades the performance of automated text classification systems. Objective: This study aims to enhance the robustness of Support Vector Machine (SVM) classification by addressing these linguistic irregularities through TF-IDF vectorization and a targeted directional augmentation strategy. Methods: A corpus of 5,394 labeled sentences was processed under a strict anti-leak grouping strategy to prevent semantic leakage between the training, validation, and test sets. To resolve decision boundary overlaps often missed by the baseline model, manual directional augmentation was applied, specifically targeting ambiguous sentence structures to enrich the training distribution and its linguistic diversity. Results: The experiments demonstrated that directional augmentation refined the model's decision margins. The augmented approach improved generalization across unseen groups, raising validation accuracy from 96.11% to 97.39% and test accuracy from the baseline's 94.39% to 96.16%. Conclusion: These findings indicate that structurally enriching the dataset mitigates overfitting and improves sensitivity. However, given the scalability constraints of manual intervention, future research should prioritize automated augmentation techniques and contextual embeddings to better handle deeper linguistic nuances.
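The abstract does not spell out how the anti-leak grouping was implemented. As a rough illustration only, the sketch below shows one common way to do a group-aware split in plain Python: every sentence carries a group identifier, and whole groups are assigned to a single split so that related sentences never straddle training, validation, and test data. The function name `group_split`, the split fractions, and the toy group IDs are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def group_split(group_ids, val_frac=0.15, test_frac=0.15, seed=0):
    """Split sample indices into train/val/test so no group spans two splits.

    Hypothetical helper: the paper does not describe its exact procedure.
    """
    # Collect the sample indices that belong to each group.
    by_group = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        by_group[gid].append(idx)

    # Sort first so the shuffle is reproducible for a given seed.
    groups = sorted(by_group)
    random.Random(seed).shuffle(groups)

    n_total = len(group_ids)
    splits = {"train": [], "val": [], "test": []}
    targets = [("test", test_frac * n_total), ("val", val_frac * n_total)]

    for gid in groups:
        # Fill test, then validation, up to their target sizes.
        for name, target in targets:
            if len(splits[name]) < target:
                splits[name].extend(by_group[gid])
                break
        else:
            # Everything left over goes to training.
            splits["train"].extend(by_group[gid])
    return splits


# Toy example: sentences sharing a group ID are assumed to be variants
# of the same source sentence (an assumption made for illustration).
group_ids = ["g1", "g1", "g2", "g3", "g3", "g3", "g4", "g5", "g5", "g6"]
splits = group_split(group_ids, val_frac=0.2, test_frac=0.2, seed=42)


def groups_of(name):
    """Return the set of group IDs present in the named split."""
    return {group_ids[i] for i in splits[name]}
```

Because whole groups are assigned at once, the realized split sizes only approximate the requested fractions. With scikit-learn available, `GroupShuffleSplit` offers comparable behavior and could feed a TF-IDF plus linear SVM pipeline, though whether the authors used it is not stated.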

Journal Info

INTENSIF: Jurnal Ilmiah Penelitian dan Penerapan Teknologi Sistem Informasi

Abbrev

intensif

Publisher

Universitas Nusantara PGRI Kediri

Subject

Computer Science & IT Decision Sciences, Operations Research & Management

Description

INTENSIF Journal is a publication venue for research in various fields related to information systems. These fields include Information System, Software Engineering, Data Mining, Data Warehouse, Computer Networking, Artificial Intelligence, e-Business, e-Government, Big Data, Application ...

Article Info

Enhancing SVM-Based Classification Performance on Indonesian Sentences through TF-IDF and Directional Augmentation
