Nadya Elfareta Azarin
Universitas Halu Oleo

Published: 1 document
Articles

Found 1 Documents

Prediksi Popularitas Novel Berbasis Fitur-Fitur Teks Menggunakan Metode Random Forest (Predicting Novel Popularity from Text Features Using the Random Forest Method)
Nadya Elfareta Azarin; Rizal Adi Saputra; Subardin Subardin
Jurnal Ilmiah Informatika Vol. 9 No. 1 (2024): Jurnal Ilmiah Informatika
Publisher : Department of Science and Technology Ibrahimy University

DOI: 10.35316/jimi.v9i1.57-62

Abstract

In today's digital era, a novel's popularity is often measured by reader response and sales. This research aims to develop a model that predicts novel popularity from text features and identifies the key factors behind that popularity, giving authors and publishers insight into what influences reader acceptance. The method used is Random Forest, a machine learning algorithm that handles both classification and regression well. The proposed method integrates text features, such as keyword extraction and sentiment analysis, in a Random Forest framework to predict popularity with high accuracy. The dataset consists of information about novels, including title, genre, number of pages, and text fields such as a summary or description. The data is preprocessed to address issues such as missing values and duplicates. Features are extracted by applying tokenization and stemming and then converting the text into TF-IDF vectors. A Random Forest model was built on these features, and its parameters were optimized through cross-validation. The experimental results show that the Random Forest model can predict the popularity of novels with a satisfactory level of accuracy. Text features, such as keyword frequency and sentiment analysis, contributed significantly to the model's predictive ability.
These findings provide valuable insight to authors and publishers in understanding reader preferences and the potential success of a novel.