This study analyzes mental health expressions in Indonesian-language YouTube comments using text mining and the K-Means clustering algorithm. Social media is increasingly used to express psychological conditions, producing large volumes of unstructured text that are difficult to analyze manually. To address this, the study applies text preprocessing (case folding, tokenization, stopword removal, and stemming), followed by Term Frequency–Inverse Document Frequency (TF-IDF) weighting to convert the text into numerical vectors. The vectors are clustered with K-Means, and the number of clusters is selected using the Elbow Method and the Silhouette Coefficient. Both criteria point to k = 3, the value at which the Silhouette Coefficient is highest, indicating good cluster quality. The 2,411 YouTube comments were grouped into three clusters representing different types of mental health expression: complaint expressions, personal experience narratives, and general responses. The study contributes a clustering model for social media comments that supports the analysis of mental health expressions in the Indonesian digital context. The results show that K-Means can identify meaningful patterns in large-scale textual data without labeled datasets, which makes it useful for data-driven mental health analysis.
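The pipeline described above can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The six comments and the tiny stopword list are invented for illustration, stemming is omitted, the smoothed IDF formula is one common variant, and the function names (`preprocess`, `tfidf`, `kmeans`, `silhouette`) are hypothetical.

```python
import math
import random
import re
from collections import Counter

# Hypothetical toy stopword list; a real Indonesian list would be far larger.
STOPWORDS = {"yang", "dan", "di", "ini", "itu", "aku", "saya", "ke"}

def preprocess(text):
    """Case folding, tokenization, stopword removal (stemming omitted)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def tfidf(docs):
    """Return L2-normalized TF-IDF vectors and the sorted vocabulary."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    # Smoothed IDF (an assumption; the source does not state the variant).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        v = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors, vocab

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm; returns labels and inertia (within-cluster SSE)."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(X, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(x, centroids[j])) for x in X]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [x for x, l in zip(X, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(dist(x, centroids[l]) ** 2 for x, l in zip(X, labels))
    return labels, inertia

def silhouette(X, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for i, x in enumerate(X):
        same = [dist(x, y) for j, y in enumerate(X) if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist(x, y) for y, l in zip(X, labels) if l == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Invented example comments, not taken from the study's dataset.
comments = [
    "Aku capek dan stres banget akhir-akhir ini",
    "Stres dan cemas terus, capek rasanya",
    "Dulu saya pernah depresi lalu pergi ke psikolog",
    "Saya pernah ke psikolog waktu depresi tahun lalu",
    "Terima kasih videonya sangat bermanfaat",
    "Videonya bagus, terima kasih infonya",
]

X, vocab = tfidf([preprocess(c) for c in comments])
for k in (2, 3, 4):
    labels, inertia = kmeans(X, k)
    print(k, round(inertia, 3), round(silhouette(X, labels), 3))
```

Under this sketch, the Elbow Method corresponds to looking for the k after which inertia stops dropping sharply, and the Silhouette Coefficient to picking the k with the highest mean score. On a real corpus, several random initializations per k would normally be compared, since a single K-Means run can settle in a poor local optimum.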
Copyright © 2026