Scientific Journal of Informatics

Vol. 9 No. 1 (2024): Jurnal Ilmiah Informatika

Nadya Elfareta Azarin (Universitas Halu Oleo)
Rizal Adi Saputra (Universitas Halu Oleo)
Subardin Subardin (Universitas Halu Oleo)

Publish Date
17 May 2025

In today's digital era, a novel's popularity is often measured by reader response and sales. This research develops a model that predicts novel popularity from text features, with the goal of identifying the key factors that contribute to popularity and giving authors and publishers insight into what influences reader acceptance. The method is Random Forest, a machine learning algorithm suited to both classification and regression. The proposed approach integrates text features, such as keyword extraction and sentiment analysis, into a Random Forest framework. The dataset contains information about novels, including title, genre, number of pages, and text fields such as a summary or description. The data is preprocessed to handle missing values and duplicates. Features are extracted by tokenizing and stemming the text and converting it into TF-IDF vectors. A Random Forest model was built on these features, and its parameters were tuned through cross-validation. The experimental results show that the model predicts novel popularity with a satisfactory level of accuracy. Text features, such as keyword frequency and sentiment, contributed significantly to the model's predictive ability. These findings help authors and publishers understand reader preferences and a novel's potential for success.

Journal Info

Abbrev

JIMI

Publisher

Universitas Ibrahimy

Subject

Computer Science & IT

Description

Topics cover the following areas (but are not limited to):
1. Information Technology (IT)
   a. Software engineering
   b. Game
   c. Information Retrieval
   d. Computer network
   e. Telecommunication
   f. Internet
   g. Wireless technology
   h. Network security
   i. Multimedia technology
   j. Mobile Computing
   k. ...

Novel Popularity Prediction Based on Text Features Using the Random Forest Method (Prediksi Popularitas Novel Berbasis Fitur-Fitur Teks Menggunakan Metode Random Forest)