Effective administration of incoming mail is crucial to improving performance and service delivery in government agencies. Manual processing, however, is often inefficient: the volume of mail keeps growing and its content is diverse, which complicates archiving, data retrieval, and decision-making. A method that can group incoming mail automatically is therefore needed, and K-Means clustering is one data mining technique that can be used for this purpose. This study groups incoming mail at the Medan City Communications and Informatics Office based on content similarity. The research proceeded in three stages: text preprocessing (cleaning, tokenization, stopword removal, and stemming), term weighting with the TF-IDF method, and clustering with the K-Means algorithm. Data were processed in Python on the Google Colaboratory (Google Colab) platform. The results show that the incoming mail can be grouped into three clusters. The first cluster (3.9%) contains letters related to planning and the preparation of strategic documents; the second (85.9%) consists of personnel administration letters, specifically those concerning appointment to functional positions; and the third (10.2%) contains letters related to operational and routine agency activities. Personnel administration therefore dominates the incoming mail. Applying K-Means clustering can thus help group incoming letters systematically and support more effective and efficient archive management.
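The pipeline described above (preprocessing, TF-IDF weighting, K-Means clustering) can be illustrated with a minimal Python sketch. It is not the study's implementation: the sample letters, the stopword list, and the function names `preprocess`, `tfidf`, and `kmeans` are hypothetical, stemming is omitted because it requires a language-specific stemmer, and the smoothed IDF formula and farthest-point initialization are assumptions chosen to keep the example deterministic and free of external libraries.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list and toy letters; the study's real data is not reproduced here.
STOPWORDS = {"the", "of", "for", "and", "to", "a", "on"}
letters = [
    "Request for strategic plan document preparation",
    "Strategic plan document review meeting",
    "Appointment to functional position of civil servant",
    "Proposal for appointment to functional position",
    "Invitation to routine coordination activity",
    "Schedule of routine coordination activity",
]

def preprocess(text):
    """Clean (lowercase, letters only), tokenize, and remove stopwords.
    Stemming is deliberately left out of this sketch."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def tfidf(docs_tokens):
    """Return (vocabulary, L2-normalized TF-IDF vectors), using a smoothed IDF (an assumption)."""
    vocab = sorted({t for doc in docs_tokens for t in doc})
    n = len(docs_tokens)
    df = Counter(t for doc in docs_tokens for t in set(doc))  # document frequency
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs_tokens:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard against empty documents
        vectors.append([v / norm for v in vec])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

tokens = [preprocess(text) for text in letters]
vocab, vectors = tfidf(tokens)
labels = kmeans(vectors, k=3)

# Report each cluster's share of the letters, as the abstract does for its three clusters.
for cluster, count in sorted(Counter(labels).items()):
    print(f"cluster {cluster}: {count / len(labels):.1%} of letters")
```

On the toy data, letters that share a theme land in the same cluster, and each cluster's share is printed in the same form as the percentages reported above. In practice a library implementation and a stemmer for the letters' language would likely replace these hand-written functions.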
Copyrights © 2026