TIN: TERAPAN INFORMATIKA NUSANTARA
Vol 6 No 8 (2026): January 2026

Designing an Evaluation Scheme for News Recommendation Systems Using Precision, Recall, and F1-Score Metrics

Phan, Irwan Kurnia
Yuricha, Yuricha



Article Info

Publish Date
22 Jan 2026

Abstract

Evaluation of news recommendation systems remains poorly standardized, even though these systems play an important role in addressing information overload in the digital era. This study develops a comprehensive evaluation scheme for content-based news recommendation systems built on five primary metrics: Precision, Recall, F1-Score, Hit Rate, and Mean Reciprocal Rank (MRR). The study uses the HuffPost News Category Dataset, which contains 209,527 news articles across 41 categories. Evaluation is conducted by simulating user feedback across three approaches: a random baseline as a comparison reference, content-based filtering with TF-IDF, and Approximate Nearest Neighbor (ANN) search based on Faiss. The final evaluation uses 10,000 randomly selected articles. TF-IDF achieves Precision@10 of 20.20%, Recall@10 of 0.57%, F1-Score@10 of 1.10%, and Hit Rate@10 of 69%, while ANN yields Precision@10 of 11.50%, Recall@10 of 0.33%, F1-Score@10 of 0.63%, and Hit Rate@10 of 43%. The Hit Rate@10 results mean that TF-IDF returns at least one relevant article in 69% of queries, compared with 43% for ANN and only 27% for the random baseline. TF-IDF outperforms ANN by a factor of 1.76 in Precision@10 (20.20% vs 11.50%) and by a factor of 1.73 in Recall@10 (0.57% vs 0.33%). Computational cost is nearly identical: TF-IDF takes 0.0100 seconds per recommendation versus 0.0104 seconds for ANN, a ratio of only 1.04. The primary contribution of this research is a structured evaluation scheme based on five complementary metrics that can be applied to a range of news recommendation systems and that provides a framework for objective comparison among methods.
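
The five metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `metrics_at_k` and `evaluate` are hypothetical, relevance is assumed to be supplied as a set of item IDs per query (the abstract does not state how relevance is defined), and Precision@k is assumed to divide by k even when fewer than k items are returned.

```python
from typing import Dict, Iterable, Sequence, Set, Tuple


def metrics_at_k(recommended: Sequence[str], relevant: Set[str], k: int = 10) -> Dict[str, float]:
    """Compute Precision, Recall, F1, Hit Rate and Reciprocal Rank at cutoff k for one query."""
    top_k = list(recommended)[:k]
    hits = sum(1 for item in top_k if item in relevant)

    # Assumption: Precision@k divides by k, Recall@k by the number of relevant items.
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0

    # Hit Rate is 1 when at least one relevant item appears in the top k.
    hit = 1.0 if hits > 0 else 0.0

    # Reciprocal rank of the first relevant item (0 if none is found in the top k).
    rr = 0.0
    for rank, item in enumerate(top_k, start=1):
        if item in relevant:
            rr = 1.0 / rank
            break

    return {"precision": precision, "recall": recall, "f1": f1, "hit_rate": hit, "mrr": rr}


def evaluate(queries: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 10) -> Dict[str, float]:
    """Average the per-query metrics; averaging reciprocal ranks gives MRR."""
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0, "hit_rate": 0.0, "mrr": 0.0}
    n = 0
    for recommended, relevant in queries:
        scores = metrics_at_k(recommended, relevant, k)
        for name, value in scores.items():
            totals[name] += value
        n += 1
    return {name: (value / n if n else 0.0) for name, value in totals.items()}


if __name__ == "__main__":
    # Toy data (made up for illustration, not taken from the study).
    toy_queries = [
        (["a", "b", "c", "d"], {"b", "d", "z"}),
        (["e", "f", "g", "h"], {"x", "y"}),
    ]
    print(evaluate(toy_queries, k=4))
```

Because F1 is averaged per query in this sketch, the mean F1 need not equal the harmonic mean of the mean Precision and mean Recall, which is one plausible reason an aggregate F1-Score can differ slightly from a value recomputed from aggregate Precision and Recall.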

Copyrights © 2026