Journal of Applied Data Sciences
Vol 7, No 1: January 2026

SiMoI: New Method to Solve the Sparsity Problem in Collaborative Filtering

Kurniawan, Hendra
Lestari, Sri
Saleh, Sushanty
Satrio, Rafli Banu



Article Info

Publish Date
19 Dec 2025

Abstract

Data sparsity is a major challenge in collaborative filtering recommendation systems: most entries of the user-item matrix are missing. When a substantial portion of the data is unavailable, estimation is hindered and prediction accuracy declines because little usable information remains. To address this issue, this study introduces SiMoI (Similarity, Mode, and Minimum Imputation), a novel method designed to adapt to high levels of sparsity. SiMoI combines user similarity with imputation strategies based on mode and minimum values. By drawing on subsets of the most informative users and items, the method fills missing entries efficiently while keeping predictions stable. The method was evaluated on both real and synthetic datasets of varying size and degree of sparsity, including an extreme scenario with 93.7% missing data. Experimental results show that SiMoI consistently produces more accurate predictions than the baseline methods. Under high-sparsity conditions, SiMoI achieved an RMSE as low as 0.823, outperforming KNNI (0.947) and MEAN (1.021). SiMoI also remained resilient across different data scales and sparsity distributions, which indicates flexibility and scalability in diverse contexts. These findings suggest that SiMoI is an effective and stable approach to sparsity and has strong potential for use in user-based recommendation systems, particularly in real-world scenarios where data availability is frequently limited.
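
The abstract names the three ingredients of SiMoI (user similarity, mode imputation, minimum imputation) but does not spell out how they are combined. The Python sketch below is therefore only one plausible reading, not the authors' algorithm: each missing rating is filled with the mode of the ratings given by the k most similar users, and falls back to the minimum observed rating for that item when no similar user has rated it. The function names, the choice of cosine similarity, the tie-breaking rule, and the parameter k are all assumptions made for illustration. The sketch also shows how the sparsity level and the RMSE reported in the abstract are conventionally computed.

```python
from collections import Counter

import numpy as np


def sparsity(ratings):
    """Fraction of missing (NaN) entries in a user-item matrix."""
    return float(np.isnan(ratings).mean())


def cosine_similarity(a, b):
    """Cosine similarity over co-rated items; 0.0 when there is no overlap."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if not both.any():
        return 0.0
    x, y = a[both], b[both]
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0


def impute_similarity_mode_min(ratings, k=2):
    """Hypothetical similarity + mode + minimum imputation (assumed design).

    Expects a matrix with NaN for missing ratings and at least one
    observed rating overall.
    """
    ratings = np.asarray(ratings, dtype=float)
    filled = ratings.copy()
    n_users = ratings.shape[0]
    global_min = np.nanmin(ratings)

    # Pairwise user-user similarity (diagonal stays 0, so a user is
    # never selected as their own neighbour).
    sims = np.zeros((n_users, n_users))
    for u in range(n_users):
        for v in range(u + 1, n_users):
            sims[u, v] = sims[v, u] = cosine_similarity(ratings[u], ratings[v])

    for u, i in zip(*np.where(np.isnan(ratings))):
        # Up to k most similar users (positive similarity) who rated item i.
        neighbours = [
            v for v in np.argsort(-sims[u])
            if sims[u, v] > 0 and not np.isnan(ratings[v, i])
        ][:k]
        if neighbours:
            # Mode of the neighbours' ratings; ties go to the smallest value.
            counts = Counter(float(ratings[v, i]) for v in neighbours)
            best = max(counts.values())
            filled[u, i] = min(r for r, c in counts.items() if c == best)
        else:
            # Minimum fallback: lowest observed rating for the item,
            # or the global minimum if nobody rated the item at all.
            column = ratings[:, i]
            filled[u, i] = global_min if np.isnan(column).all() else np.nanmin(column)
    return filled


def rmse(predicted, actual, mask):
    """Root mean squared error over the entries selected by a boolean mask."""
    diff = (predicted - actual)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy example: user 0 has not rated the third item.
R = np.array([[5, 3, np.nan],
              [5, 3, 1],
              [5, 3, 1]], dtype=float)
filled = impute_similarity_mode_min(R, k=2)
```

In an evaluation like the one described, a subset of known ratings would be hidden, imputed, and then compared with the held-out values through rmse; the toy matrix here only demonstrates the mechanics.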

Copyright © 2026






Journal Info

Abbrev

JADS

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...