Technological advances in the era of the Fourth Industrial Revolution (Industry 4.0) have significantly changed how society accesses and consumes information, particularly through online news portals. This study analyzes how relevant news headlines are to their article content on Indonesian online news platforms, using text mining techniques and text similarity checking. To improve the accuracy of the relevance assessment, it applies two deep learning models: Long Short-Term Memory (LSTM) and IndoBERT. The data were collected from three leading Indonesian news portals (detik.com, kompas.com, and suara.com) and comprise 52,242 articles from the entertainment and national news categories, published between July 1 and September 30, 2024. Each record includes the headline, category, publication date, author, article URL, and news content. The research process consists of several stages: data collection through web scraping, data pre-processing (cleaning the category, author, and content columns), content summarization, text similarity calculation, and data labeling into three classes: relevan (relevant), berlebihan (exaggerated), and nonrelevan (not relevant). In the evaluation, IndoBERT outperforms LSTM, reaching a training accuracy of 0.9048 and a training loss of 0.2514, with a validation accuracy of 0.8604 and a validation loss of 0.4039. These findings indicate that IndoBERT is effective at assessing the coherence between news headlines and article content.
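The similarity calculation and labeling stages can be illustrated with a minimal sketch. The abstract does not state which similarity measure or thresholds were used, so everything here is an assumption made for illustration: a bag-of-words cosine similarity, the hypothetical cutoffs `high=0.5` and `low=0.2`, the mapping of the middle score band to berlebihan, and the function names `cosine_similarity` and `label_headline`. Only the three class names come from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-count vectors of two texts.

    Assumption: the study's actual similarity measure is not specified;
    bag-of-words cosine similarity is used here only as a stand-in.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as sharing nothing
    return dot / (norm_a * norm_b)


def label_headline(headline, summary, high=0.5, low=0.2):
    """Map a headline/summary similarity score to one of the three classes.

    Assumption: the thresholds `high` and `low`, and the rule that the
    middle band means "berlebihan", are hypothetical and not from the study.
    """
    score = cosine_similarity(headline, summary)
    if score >= high:
        return "relevan"
    if score >= low:
        return "berlebihan"
    return "nonrelevan"
```

For example, `label_headline(headline, summary)` would be applied to each article's headline and its generated summary to produce the class used as the training label. In practice the thresholds would need to be calibrated against manually checked examples.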