Lingua Technica: Journal of Digital Literary Studies
Vol. 2 No. 1 (2026): Literature and computation: mapping, modeling, and mediation

Text mining and semantic modeling of literary corpora: a machine learning–based study of Indonesian fiction

Rinda Widya Ikomah
Zohaib Hassan Sain



Article Info

Publish Date
30 Jan 2026

Abstract

Background: The large-scale digitization of Indonesian literary works has produced extensive textual corpora that challenge conventional close-reading approaches and call for systematic, data-driven methods capable of capturing thematic, semantic, and affective patterns in fiction. Objective: This study aims to examine how text mining and semantic modeling can reveal lexical salience, intertextual relations, and narrative emotion in Indonesian fiction across different thematic orientations. Method: Using a quantitative corpus-based design, the study analyzes 36 Indonesian literary texts published between 1980 and 2022 through TF–IDF–based lexical analysis, document-level semantic embeddings with cosine similarity and clustering, and sentence-level sentiment analysis. Results: The findings show distinct lexical signatures that differentiate thematic clusters, coherent semantic groupings reflecting intertextual proximity, and sentiment trajectories dominated by neutral-to-negative polarity with strategically placed affective peaks across narrative progression. Implication: These results demonstrate that computational methods can empirically support literary analysis without displacing interpretive criticism. Novelty: The study integrates lexical, semantic, and affective modeling within a unified framework for Indonesian fiction, offering a scalable and replicable approach to digital literary studies.
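The lexical and similarity steps named in the Method can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify tokenization, the IDF weighting variant, or the embedding model, so the code below assumes whitespace tokenization, a smoothed IDF, and TF–IDF vectors as a stand-in for the document-level semantic embeddings. The three toy strings and the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `top_terms`) are hypothetical and are not drawn from the 36-text corpus.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split. A real study of Indonesian
    # fiction would likely need proper tokenization and stemming.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is the relative frequency within the document.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_terms(vector, k=3):
    """The k highest-weighted terms, ties broken alphabetically."""
    return sorted(vector, key=lambda t: (-vector[t], t))[:k]


# Hypothetical toy documents: two share a sea/fishing vocabulary, one is urban.
docs = [
    "laut ombak nelayan laut perahu",
    "nelayan perahu laut badai",
    "kota jalan gedung kota malam",
]
vectors = tfidf_vectors(docs)
sim_01 = cosine(vectors[0], vectors[1])  # shared vocabulary
sim_02 = cosine(vectors[0], vectors[2])  # no shared vocabulary
```

In this sketch, `top_terms` plays the role of a lexical signature for a single text, while the pairwise `cosine` values are the kind of input a clustering step would group by proximity. Dense semantic embeddings would replace the TF–IDF vectors without changing the similarity computation.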

Copyrights © 2026






Journal Info

Abbrev

lingtech

Publisher

Asosiasi Relawan dan Pengelola Jurnal LPTNU (ARJUNU)

Subject

Education; Language, Linguistics, Communication & Media; Other

Description

Lingua Technica: Journal of Digital Literary Studies is a peer-reviewed international journal published by Asosiasi Relawan dan Pengelola Jurnal LPTNU (ARJUNU). This journal is dedicated to exploring the intersection of language, literature, and technology within the realm of digital ...