Mahanani, Prananing
Unknown Affiliation

Published: 1 document
Articles

Novel Genre Classification based on Synopsis using the Random Forest Algorithm
Mahanani, Prananing; Fibriani, Charitas (SCOPUS ID=57192643331)
Sistemasi: Jurnal Sistem Informasi, Vol 15, No 1 (2026)
Publisher: Program Studi Sistem Informasi Fakultas Teknik dan Ilmu Komputer

DOI: 10.32520/stmsi.v15i1.5815

Abstract

Novel genre classification based on synopses presents a significant challenge in text processing, as each genre exhibits distinct lexical characteristics. This study evaluates the performance of the Random Forest algorithm in classifying novel genres when the class distribution is imbalanced. The research stages are text preprocessing (case folding, tokenization, stopword removal, and stemming), feature extraction using Term Frequency–Inverse Document Frequency (TF-IDF), and model training with Random Forest. In addition, the data were balanced manually by adding samples to minority classes through simple oversampling. The model was evaluated using accuracy and confusion matrix analysis. The results indicate that Random Forest is able to identify most genres with moderate accuracy, particularly classes with larger data volumes. The initial model achieved an accuracy of 42.11%, which increased to 46.67% after data balancing, a gain of 4.56 percentage points. Misclassification occurred primarily in genres with few samples that share vocabulary with dominant genres. These findings demonstrate that Random Forest can still be applied to synopsis-based novel genre classification without relying fully on balancing techniques. However, performance remains uneven across classes, so per-genre analysis is needed for a more comprehensive evaluation.
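
The two steps in the abstract that are easiest to misread, simple oversampling and per-genre evaluation, can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' code. The function names, genre labels, and toy data are hypothetical, and the classifier itself is left out: `y_pred` stands in for the output of any model, such as a Random Forest trained on TF-IDF features.

```python
import random
from collections import Counter, defaultdict


def oversample(texts, labels, seed=0):
    """Simple random oversampling: duplicate randomly chosen samples of
    each smaller class until every class is as large as the biggest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_texts.extend(items + extra)
        out_labels.extend([label] * target)
    return out_texts, out_labels


def confusion_matrix(y_true, y_pred):
    """Nested counts: matrix[true_label][predicted_label]."""
    matrix = defaultdict(Counter)
    for true, pred in zip(y_true, y_pred):
        matrix[true][pred] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Share of each true genre that was predicted correctly."""
    matrix = confusion_matrix(y_true, y_pred)
    return {label: row[label] / sum(row.values()) for label, row in matrix.items()}


# Hypothetical imbalanced training labels: three romance, one horror.
train_texts = ["synopsis a", "synopsis b", "synopsis c", "synopsis d"]
train_labels = ["romance", "romance", "romance", "horror"]
balanced_texts, balanced_labels = oversample(train_texts, train_labels)
print(Counter(balanced_labels))

# Hypothetical predictions from a model that favors the dominant genre.
y_true = ["romance"] * 6 + ["horror"] * 2 + ["fantasy"] * 2
y_pred = ["romance"] * 6 + ["romance", "horror"] + ["romance", "romance"]
print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
```

In this toy case the overall accuracy looks acceptable while one minority genre is never predicted correctly, which is the kind of unevenness that per-genre analysis exposes. The abstract does not say whether balancing was done before or after the train/test split. Oversampling only the training portion, as assumed here, keeps duplicated synopses out of the test set.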