Improving the efficiency and relevance of news search is a pressing need in the digital era. This study develops a system that ranks news titles against a keyword query by combining Term Frequency-Inverse Document Frequency (TF-IDF) weighting with cosine similarity. The dataset consists of 2,507 news titles from the past year, taken from four of the most popular news sites in Indonesia: Kompas.com, Detik.com, CNNIndonesia.com, and Tempo.com. The pipeline comprises web scraping, pre-processing (case folding, tokenizing, stopword removal, and stemming), term weighting with TF-IDF, similarity calculation with cosine similarity, and performance evaluation using accuracy, precision, recall, and F1-score. Tests on three different queries show that the system performs very well, with an average accuracy of 99.75%, precision of 96.67%, recall of 100%, and F1-score of 98.33%. These results indicate that the combination of TF-IDF and cosine similarity is effective for retrieving news titles relevant to the entered query.
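The ranking step described above can be illustrated with a minimal sketch in Python. This is not the study's implementation. It assumes a smoothed IDF variant, a tiny placeholder stopword list, and no stemming, and the names `preprocess`, `build_idf`, `tfidf_vector`, `cosine`, and `rank_titles`, as well as the sample titles, are hypothetical.

```python
import math
import re
from collections import Counter

# Placeholder stopword list for illustration only (assumption);
# the study's actual list is not given in the abstract.
STOPWORDS = {"di", "dan", "yang", "ke", "dari"}


def preprocess(text):
    """Case folding, tokenizing, and stopword removal (stemming is omitted here)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_idf(docs_tokens):
    """Smoothed IDF, log((1 + N) / (1 + df)) + 1; one common variant (assumption)."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms not seen in the corpus are dropped."""
    tf = Counter(tokens)
    return {t: count * idf[t] for t, count in tf.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_titles(query, titles):
    """Return (score, title) pairs sorted from most to least similar to the query."""
    docs = [preprocess(t) for t in titles]
    idf = build_idf(docs)
    query_vec = tfidf_vector(preprocess(query), idf)
    scored = [(cosine(query_vec, tfidf_vector(d, idf)), t) for d, t in zip(docs, titles)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up example titles, not taken from the study's dataset.
titles = [
    "Harga beras naik di pasar Jakarta",
    "Timnas Indonesia menang lawan Vietnam",
    "Pemerintah impor beras dari Thailand",
]
ranked = rank_titles("harga beras", titles)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```

In this toy run, the title that shares both query terms ranks first, the title that shares only one term ranks second, and the unrelated title scores zero. A fuller pipeline would add an Indonesian stemmer and a complete stopword list before weighting.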
Copyright © 2025