Keyword extraction is an important technique in natural language processing (NLP) for summarizing the essence of a document, such as a scientific journal abstract. This study analyzes the effectiveness of two keyword extraction methods, Term Frequency-Inverse Document Frequency (TF-IDF) and KeyBERT, in identifying significant keywords from a collection of scientific journal abstracts. The dataset consists of scientific journal abstracts accompanied by manually assigned keywords that serve as the reference for evaluation. TF-IDF relies on word frequency within a document relative to the corpus, while KeyBERT uses cosine similarity between BERT transformer embeddings of candidate terms and the document to determine the most meaningful keywords. The results show that both methods achieve a moderate level of similarity to the manual keywords, with semantic similarity scores of 0.578 for KeyBERT and 0.469 for TF-IDF. These findings indicate significant potential for combining both methods with machine learning and deep learning models in topic classification systems, particularly in information retrieval and text mining.
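To illustrate the contrast between the two approaches described above, the following minimal sketch extracts keywords from a single abstract with each method. It assumes the scikit-learn and keybert packages are available; the example text and the choice of top_n = 5 are illustrative, not taken from the study's setup.

```python
# Sketch: TF-IDF vs. KeyBERT keyword extraction on one abstract (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from keybert import KeyBERT

abstracts = [
    "Keyword extraction identifies terms that summarize the content of a document, "
    "such as a scientific journal abstract, for information retrieval and text mining.",
    # ... remaining abstracts in the corpus would go here
]

# TF-IDF: score terms by their frequency in the document, discounted by corpus frequency.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(abstracts)
terms = vectorizer.get_feature_names_out()
scores = tfidf[0].toarray().ravel()
tfidf_keywords = [terms[i] for i in scores.argsort()[::-1][:5]]

# KeyBERT: rank candidate terms by cosine similarity between their BERT-based
# embeddings and the embedding of the whole document.
kw_model = KeyBERT()  # uses a default sentence-transformers model
keybert_keywords = [kw for kw, _ in kw_model.extract_keywords(abstracts[0], top_n=5)]

print("TF-IDF :", tfidf_keywords)
print("KeyBERT:", keybert_keywords)
```

In a setup like the one studied here, the extracted keywords from each method would then be compared against the manually assigned keywords to compute a semantic similarity score.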