Most people now search the internet for news and information. The growth of the internet and social media has led to the emergence of hundreds of online news portals covering very diverse topics. Searching headlines manually is ineffective and time-consuming. In this study, news headlines were modeled using Latent Dirichlet Allocation (LDA). Prior to applying the LDA model, supporting processes such as tokenization, lemmatization, tf-idf weighting and non-negative matrix factorization were also applied. The results showed that LDA can model the news topics well, with a log-likelihood score of -13615.912 and a perplexity score of 378.958. In addition to LDA, topics were also modeled as clusters by applying k-means clustering. Using the elbow method, the ideal number of clusters for k-means was 5, with a silhouette score of 0.62.
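As an illustration only, the workflow summarized above could be sketched with scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not the study's implementation: the toy headlines, the function names `fit_lda` and `kmeans_scan`, the topic count of 3, and the range of k values are all hypothetical, and lemmatization and non-negative matrix factorization are omitted. The sketch also assumes LDA is fitted on raw term counts while tf-idf features are used for k-means.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import silhouette_score

# Hypothetical toy headlines; the study's corpus is not reproduced here.
headlines = [
    "government announces new budget plan",
    "parliament debates budget and tax reform",
    "minister defends tax policy in parliament",
    "team wins football championship final",
    "football coach praises team after final",
    "striker scores twice in championship match",
    "new smartphone released with faster chip",
    "tech company unveils smartphone and laptop",
    "chip maker reports record laptop sales",
]


def fit_lda(texts, n_topics=3, seed=0):
    """Fit LDA on term counts; return model, doc-topic matrix, score, perplexity."""
    # Assumption: LDA is fitted on raw counts rather than tf-idf weights.
    counts = CountVectorizer(stop_words="english").fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    doc_topics = lda.fit_transform(counts)
    # score() is an approximate log-likelihood; perplexity() is derived from it.
    return lda, doc_topics, lda.score(counts), lda.perplexity(counts)


def kmeans_scan(texts, k_values=(2, 3, 4), seed=0):
    """For each k, return (inertia, silhouette) of k-means on tf-idf features."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    results = {}
    for k in k_values:
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(tfidf)
        # Inertia feeds the elbow plot; silhouette measures cluster separation.
        results[k] = (km.inertia_, silhouette_score(tfidf, km.labels_))
    return results


if __name__ == "__main__":
    _, _, log_likelihood, perplexity = fit_lda(headlines)
    print(f"log-likelihood={log_likelihood:.3f} perplexity={perplexity:.3f}")
    for k, (inertia, sil) in kmeans_scan(headlines).items():
        print(f"k={k} inertia={inertia:.3f} silhouette={sil:.3f}")
```

Plotting inertia against k and looking for the point where the curve flattens is the elbow heuristic; the silhouette score then gives a separate check on the chosen k. The values produced on this toy data are not comparable to the figures reported in the study.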
Copyright © 2020