Recursive Journal of Informatics
Vol. 4 No. 1 (2026): March 2026

Implementation of Content-Based Filtering in Book Recommender Systems Using K-Nearest Neighbor Model with Singular Value Decomposition and Word2Vec

Naufal Afif Sadewa (Universitas Negeri Semarang)
Subhan Subhan (Universitas Negeri Semarang)

Article Info

Publish Date
31 Mar 2026

Abstract

Books are a medium for understanding topics such as science, history, and culture. Digital technology has made books easier to access, but choosing the right book among thousands of options remains a challenge. Book recommender systems help users find relevant books efficiently. One approach is Content-Based Filtering, which uses the content information of books to generate recommendations.

Purpose: This research aims to develop a book recommender system that implements Content-Based Filtering using a K-Nearest Neighbor model with a combination of Singular Value Decomposition and Word2Vec, so that it recommends relevant books according to each user's preferences.

Methods/Study design/approach: The method involves several stages. First, data preprocessing removes noise while retaining important information. Next, feature extraction with the Term Frequency-Inverse Document Frequency (TF-IDF) method represents each book as a vector. These vectors are then reduced in dimension using Singular Value Decomposition to lower complexity and capture the most significant structure in the data. In a separate stage, book features are extracted using Word2Vec, which produces semantic vector representations of words. The vectors from Singular Value Decomposition and Word2Vec are then combined to form more informative features. Finally, a K-Nearest Neighbor model with the cosine similarity metric computes the similarity between books from the combined features to generate relevant recommendations. Tests were conducted on the GoodReads Best Books dataset taken from Kaggle.
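The stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the four-book corpus is invented, the two-dimensional `toy_word_vectors` are hand-written stand-ins for trained Word2Vec embeddings, each book's Word2Vec feature is assumed to be the mean of its word vectors, the two feature sets are assumed to be combined by concatenation, and the nearest-neighbor search is brute force. The abstract does not specify any of these details.

```python
import math
from collections import Counter

import numpy as np


def tfidf_matrix(docs):
    """TF-IDF with smoothed IDF and L2-normalised rows (one row per book)."""
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    X = np.zeros((n, len(vocab)))
    for r, d in enumerate(docs):
        for w, count in Counter(d).items():
            X[r, index[w]] = count * (math.log((1 + n) / (1 + df[w])) + 1.0)
    return l2_normalize(X)


def l2_normalize(M):
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.where(norms == 0, 1.0, norms)


def svd_reduce(X, n_components):
    """Keep the top singular directions: rows of U * s are the reduced features."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def mean_word_vectors(docs, word_vectors, dim):
    """Average the word vectors of each book (assumed pooling strategy)."""
    out = np.zeros((len(docs), dim))
    for r, d in enumerate(docs):
        vecs = [word_vectors[w] for w in d if w in word_vectors]
        if vecs:
            out[r] = np.mean(vecs, axis=0)
    return out


def recommend(features, query_index, k):
    """Return the k books with the highest cosine similarity to the query book."""
    F = l2_normalize(features)
    sims = F @ F[query_index]
    sims[query_index] = -np.inf  # never recommend the query book itself
    return [int(i) for i in np.argsort(-sims)[:k]]


# Invented, already-preprocessed token lists (hypothetical data).
docs = [
    ["space", "rocket", "planet", "astronaut"],
    ["space", "planet", "galaxy", "star"],
    ["cooking", "recipe", "kitchen", "pasta"],
    ["recipe", "kitchen", "baking", "bread"],
]

# Hand-written stand-ins for trained Word2Vec embeddings (hypothetical values).
toy_word_vectors = {
    "space": [0.9, 0.1], "rocket": [0.8, 0.2], "planet": [0.9, 0.0],
    "astronaut": [0.7, 0.1], "galaxy": [0.8, 0.1], "star": [0.9, 0.2],
    "cooking": [0.1, 0.9], "recipe": [0.0, 0.8], "kitchen": [0.2, 0.9],
    "pasta": [0.1, 0.7], "baking": [0.1, 0.8], "bread": [0.2, 0.8],
}

svd_features = svd_reduce(tfidf_matrix(docs), n_components=2)
w2v_features = mean_word_vectors(docs, toy_word_vectors, dim=2)
features = np.hstack([svd_features, w2v_features])  # assumed: concatenation

print(recommend(features, query_index=0, k=1))
```

On a real dataset the embeddings would come from a trained Word2Vec model and the decomposition from a truncated SVD solver suited to sparse matrices; the full `np.linalg.svd` call here is only practical for small examples.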
Result/Findings: The evaluation shows that the proposed model provides recommendations with good relevance, achieving a Mean Average Precision of 0.9637 and a Normalized Discounted Cumulative Gain of 0.9515 with the parameter k set to 5.

Novelty/Originality/Value: The novelty of this research is the combination of Singular Value Decomposition vectors and Word2Vec vectors to produce a more informative feature representation that exploits statistical relationships between words while capturing their semantic meaning.
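For readers unfamiliar with the two evaluation metrics named in the results, the sketch below shows one common way to compute them for a ranked recommendation list. The definitions are assumptions: average precision is normalised by the number of relevant items found within the cutoff, and the discounted gain uses a base-2 logarithm. The abstract does not state which variants the authors used, and the relevance lists are invented.

```python
import math


def average_precision_at_k(relevance, k):
    """relevance: 0/1 flags for the ranked recommendations, best rank first."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / hits if hits else 0.0


def mean_average_precision(all_relevance, k):
    """Mean of the per-query average precision values."""
    return sum(average_precision_at_k(r, k) for r in all_relevance) / len(all_relevance)


def dcg_at_k(relevance, k):
    """Discounted cumulative gain with a log2 rank discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevance[:k], start=1))


def ndcg_at_k(relevance, k):
    """DCG divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal else 0.0


# Invented relevance judgements for two query books (hypothetical data).
judgements = [[1, 0, 1, 1, 0], [1, 1, 1, 1, 1]]
print(mean_average_precision(judgements, k=5))
print(ndcg_at_k(judgements[0], k=5))
```

Both metrics reach 1.0 only when every relevant item is ranked ahead of every irrelevant one, which is why they are sensitive to ordering and not just to the number of hits.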

Copyright © 2026

Journal Info

Abbrev

rji

Publisher

Subject

Computer Science & IT

Description

Recursive Journal of Informatics is published by the Department of Computer Science, Universitas Negeri Semarang. It is a journal of Information Systems and Information Technology that includes scholarly writings on pure research and applied research in the field of information systems and information ...