The rapid growth of scientific publications in Indonesia has created a need for text analysis methods that can automatically cluster articles by content similarity and research theme. This study implements a combination of Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the K-Means algorithm to group scientific journal abstracts in the field of informatics. The research data consist of 1,200 scientific journal abstracts collected manually from the official SINTA (Science and Technology Index) portal for the 2023–2024 publication period, covering various levels of national journal accreditation. The study employs an unsupervised machine learning approach consisting of several stages, including text preprocessing, TF-IDF weighting, K-Means clustering, and evaluation using the Silhouette Score and the Davies–Bouldin Index (DBI). The TF-IDF weighting step retained the 3,000 most informative terms, dominated by keywords such as data, method, result, and system, reflecting the research characteristics of the field of informatics. The clustering process generated four main clusters with a Silhouette Score of 0.0121 and a DBI of 8.3996. A Silhouette Score close to zero and a high DBI indicate that the clusters overlap considerably, so the model captures only initial thematic similarities among abstracts. The Word Cloud visualization revealed variations in research topic focus across clusters, including algorithm testing, data model development, system applications, and methodological implementation. This study contributes to the development of a national framework for scientific text analysis that can be used for research topic mapping, inter-institutional collaboration, and data-driven research policy formulation.
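The pipeline described above (TF-IDF weighting, K-Means clustering, then evaluation with the Silhouette Score and DBI) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the use of scikit-learn, the toy abstracts, the English stop-word list, the function name `cluster_abstracts`, and the random seed are all assumptions. Only the 3,000-term vocabulary cap and the choice of metrics come from the abstract.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import davies_bouldin_score, silhouette_score

# Hypothetical toy corpus; the study itself used 1,200 SINTA abstracts.
abstracts = [
    "Classification algorithm accuracy testing with support vector machine",
    "Testing a decision tree algorithm for classification accuracy",
    "Comparison of classification algorithm performance on benchmark data",
    "Web based information system design for school administration",
    "Design of a mobile information system application for a clinic",
    "Information system application development using a waterfall method",
    "Database model development for a data warehouse",
    "Data model design for a relational database schema",
]


def cluster_abstracts(texts, n_clusters, max_features=3000, seed=42):
    """Vectorize texts with TF-IDF, cluster with K-Means, and score the result."""
    # Preprocessing is reduced here to lowercasing and English stop-word
    # removal (an assumption; the study's preprocessing steps are not listed).
    vectorizer = TfidfVectorizer(
        lowercase=True, stop_words="english", max_features=max_features
    )
    tfidf = vectorizer.fit_transform(texts)  # sparse matrix: documents x terms

    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(tfidf)

    # Silhouette Score: range [-1, 1], higher means better-separated clusters.
    sil = silhouette_score(tfidf, labels)
    # Davies-Bouldin Index: >= 0, lower is better; it needs a dense array.
    dbi = davies_bouldin_score(tfidf.toarray(), labels)
    return labels, sil, dbi, vectorizer


# Three clusters suit this toy corpus; the study reported four.
labels, sil, dbi, vectorizer = cluster_abstracts(abstracts, n_clusters=3)
print("cluster labels:", labels.tolist())
print(f"silhouette={sil:.4f}  davies_bouldin={dbi:.4f}")
```

When reading the two metrics together, a Silhouette Score near 1 with a DBI near 0 would indicate compact, well-separated clusters. Short, sparse TF-IDF vectors often yield much weaker scores, which is consistent with the values reported in the abstract.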
Copyrights © 2026