The widespread adoption of Learning Management Systems (LMS) in digital education has generated large volumes of student feedback in the form of unstructured free-text data, making manual analysis increasingly impractical. This study aims to identify the dominant themes emerging from student feedback on LMS platforms and to compare the performance of different Transformer-based embedding models in topic modeling tasks. The proposed approach employs BERTopic with three embedding models, namely IndoBERT, DistilBERT, and Sentence-BERT (SBERT). Student feedback data were collected from an institutional LMS and processed through text preprocessing, embedding generation, and topic modeling stages. Model performance was evaluated using multiple coherence metrics (c_v, c_npmi, u_mass, and c_uci), topic diversity, and the proportion of outlier documents. The results indicate that IndoBERT achieves the highest coherence scores, particularly on c_v and c_npmi, suggesting superior semantic consistency of the generated topics. DistilBERT produces the lowest proportion of outliers but yields a more limited number of topics, while SBERT demonstrates balanced performance between topic quality and thematic diversity. These findings highlight that the choice of embedding model significantly influences the quality of topic modeling outcomes for Indonesian-language student feedback data.
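Two of the evaluation metrics mentioned above, topic diversity and the proportion of outlier documents, can be sketched in plain Python. This is an illustrative sketch only: the topic word lists and document assignments below are hypothetical, and topic diversity is computed in the common form of the fraction of unique words among the top-k words of all topics, while the outlier proportion counts documents assigned to BERTopic's outlier topic (labeled -1).

```python
def topic_diversity(topics, top_k=10):
    """Fraction of unique words across the top-k words of every topic.

    A value of 1.0 means no word is shared between topics; lower values
    indicate redundant, overlapping topics.
    """
    words = [w for topic in topics for w in topic[:top_k]]
    return len(set(words)) / len(words)


def outlier_proportion(assignments, outlier_label=-1):
    """Share of documents assigned to the outlier topic (BERTopic uses -1)."""
    return sum(1 for a in assignments if a == outlier_label) / len(assignments)


# Hypothetical top words for three topics (Indonesian feedback terms)
# and hypothetical document-to-topic assignments.
topics = [
    ["kuis", "tugas", "deadline", "nilai"],
    ["video", "materi", "akses", "kuis"],
    ["dosen", "forum", "diskusi", "respon"],
]
assignments = [0, 1, 2, 0, -1, 1, 2, -1, 0, 1]

print(topic_diversity(topics, top_k=4))   # "kuis" repeats: 11 unique / 12 words
print(outlier_proportion(assignments))    # 2 outliers out of 10 documents
```

A higher topic diversity together with high coherence suggests topics that are both internally consistent and mutually distinct, which is the trade-off the abstract describes between DistilBERT, SBERT, and IndoBERT.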
Copyright © 2025