Jurnal Komtika (Komputasi dan Informatika)
Vol. 10 No. 1 (2026)

Clustering Scientific Journal Abstracts Using Term Frequency-Inverse Document Frequency and K-Means

Nadia Wati Aprianti (Universitas Ahmad Dahlan)
Herman Yuliansyah (Universitas Ahmad Dahlan)
Muhammad Kunta Biddinika (Universitas Ahmad Dahlan)



Article Info

Publish Date
30 May 2026

Abstract

The rapid growth of scientific publications in Indonesia has created a need for text analysis methods that can automatically cluster articles by content similarity and research theme. This study implements a combination of Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the K-Means algorithm to group scientific journal abstracts in the field of informatics. The research data consist of 1,200 scientific journal abstracts collected manually from the official SINTA (Science and Technology Index) portal for the 2023–2024 publication period, covering various levels of national journal accreditation. The study uses an unsupervised machine learning approach with four stages: text preprocessing, TF-IDF weighting, clustering with K-Means, and evaluation with the Silhouette Score and the Davies-Bouldin Index (DBI). The TF-IDF weighting process yielded the 3,000 most informative terms, dominated by keywords such as data, method, result, and system, reflecting the research characteristics of the field of informatics. The clustering process generated four main clusters with a Silhouette Score of 0.0121 and a DBI value of 8.3996, indicating that the model was able to identify initial thematic similarities among abstracts. The Word Cloud visualization revealed variations in research topic focus across clusters, including algorithm testing, data model development, system applications, and methodological implementation. This study contributes to the development of a national framework for scientific text analysis that can be used for research topic mapping, inter-institutional collaboration, and data-driven research policy formulation.
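The stages named in the abstract (TF-IDF weighting, K-Means clustering, and evaluation with the Silhouette Score and DBI) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the whitespace tokeniser, the smoothed IDF formula, the evenly spaced centroid seeding, the four toy documents, and all function names (`tfidf`, `kmeans`, `silhouette`, `davies_bouldin`) are assumptions made for illustration. The sketch also does not cap the vocabulary at 3,000 terms.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, rows): one L2-normalised TF-IDF vector per document."""
    tokens = [d.lower().split() for d in docs]           # naive tokeniser (assumption)
    vocab = sorted({t for doc in tokens for t in doc})
    df = Counter(t for doc in tokens for t in set(doc))  # document frequency
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}  # smoothed IDF
    rows = []
    for doc in tokens:
        tf = Counter(doc)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic, evenly spaced seeding (assumption)."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new == labels:            # assignments stable: converged
            break
        labels = new
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:              # an empty cluster keeps its previous centroid
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if j != i and labels[j] == labels[i]]
        if not own:
            scores.append(0.0)       # singleton clusters score 0 by convention
            continue
        a = mean(own)                # mean distance within the point's own cluster
        b = min(mean(dist(p, q) for q, l in zip(points, labels) if l == c)
                for c in clusters if c != labels[i])  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin Index over the non-empty clusters."""
    clusters = sorted(set(labels))
    scatter = {c: mean(dist(p, centroids[c])
                       for p, l in zip(points, labels) if l == c)
               for c in clusters}
    return mean(
        max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters if j != i)
        for i in clusters)


# Toy corpus (hypothetical), standing in for the 1,200 abstracts.
docs = [
    "neural network image classification",
    "deep neural network image recognition",
    "database query index optimization",
    "sql database query performance",
]
vocab, X = tfidf(docs)
labels, centroids = kmeans(X, k=2)
sil = silhouette(X, labels)
dbi = davies_bouldin(X, labels, centroids)
print(labels, round(sil, 4), round(dbi, 4))
```

When reading the two metrics, note that the Silhouette Score lies in [-1, 1], with values near 0 indicating overlapping clusters, and that lower DBI values indicate more compact, better separated clusters.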

Copyright © 2026






Journal Info

Abbrev

komtika

Subject

Computer Science & IT Engineering

Description

Aims: Jurnal Komtika (Komputasi dan Informatika) is a scientific journal published by the Faculty of Engineering, Universitas Muhammadiyah Magelang, and accredited by the Ministry for Research, Technology, and Higher Education (RISTEKDIKTI) (No. 200/M/KPT/2020). It is a medium for researchers, ...