Garuda - Garba Rujukan Digital

GSE-journal

Vol. 3 No. 3 (2025): November 2025

Nafi Annury, Muhammad (Unknown)
Sutrisno, Djoko (Unknown)

Publish Date
29 Nov 2025

Authorship attribution (AA), a core task in computational linguistics, seeks to identify the author of a text from its stylistic patterns. Many existing methods face a trade-off between classification accuracy and computational cost, especially when applied to large datasets. This study systematically evaluates word-level string kernels as an efficient and accurate approach to AA. We investigate three string kernels (Spectrum, Presence Bits, and Intersection) paired with three machine learning classifiers (Support Vector Machine, Random Forest, and XGBoost). The models were tested on three distinct feature sets designed to isolate the stylistic contribution of noun phrases alongside word n-grams. The best configuration, a Support Vector Machine with a Spectrum kernel over a feature set of word n-grams and noun phrases, achieves approximately 95% classification accuracy on the test set. This result underscores the role of phrase-level syntactic information in capturing an author's unique voice. The word-level approach also shows a four- to six-fold reduction in model training time compared to a strong character-level baseline, while maintaining superior or competitive accuracy. We conclude that word-level string kernels offer a practical framework for authorship attribution that balances high performance with computational efficiency. The method's scalability makes it suitable for real-world applications, including digital forensics, plagiarism detection, and large-scale textual analysis.
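As a rough illustration of the three kernels named in the abstract, the sketch below computes them over word n-gram counts using their commonly cited definitions: Spectrum as the sum of count products, Presence Bits as the number of shared distinct n-grams, and Intersection as the sum of minimum counts. This is not the authors' implementation. The function names, the whitespace tokenization, and the choice of bigrams are assumptions made for the example, and the noun-phrase features are left out.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Count word n-grams in a lowercased, whitespace-tokenized text (assumed tokenization)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def spectrum_kernel(a, b):
    """Sum over shared n-grams of the product of their counts."""
    return sum(count * b[gram] for gram, count in a.items() if gram in b)


def presence_bits_kernel(a, b):
    """Number of distinct n-grams that occur in both texts."""
    return sum(1 for gram in a if gram in b)


def intersection_kernel(a, b):
    """Sum over shared n-grams of the smaller of the two counts."""
    return sum(min(count, b[gram]) for gram, count in a.items() if gram in b)


# Toy texts (hypothetical): the only shared bigram is ("the", "cat"), twice in each.
doc_a = word_ngrams("the cat sat on the mat the cat")
doc_b = word_ngrams("the cat ran the cat hid")

print(spectrum_kernel(doc_a, doc_b))       # 2 * 2 = 4
print(presence_bits_kernel(doc_a, doc_b))  # 1 shared distinct bigram
print(intersection_kernel(doc_a, doc_b))   # min(2, 2) = 2
```

Evaluating one of these functions on every pair of documents gives a Gram matrix, which a kernel classifier such as an SVM can take as a precomputed kernel.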

Journal Info

Journal: GSE-journal
Abbrev: gse
Publisher: PT Mutiara Intelektual Indonesia
Subject: Humanities, Education, Social Sciences, Other

Description

Global Synthesis in Education (GSE) is an interdisciplinary publication dedicated to publishing original research and written works on education for international audiences of educational researchers. The Global Synthesis in Education Journal aims to provide a scholarly forum for understanding the ...
