The rapid growth of the game industry makes it difficult for players to find games that match their preferences, and conventional recommendation methods are often unsatisfactory because they lack personalization. This study designs and builds a web-based game recommendation system that uses content-based filtering with a fine-tuned Transformer embedding model, all-MPNet-base-v2, to analyze the textual content of games in depth. The methodology comprised data collection from the Steam API (43,900 games), text preprocessing with TF-IDF for keyword extraction, and, most notably, fine-tuning all-MPNet-base-v2 through Knowledge Distillation with jina-embedding-v3 as the teacher model. A novel game series identification feature based on fuzzy string matching was also implemented. The resulting embedding vectors were indexed with LanceDB and served through a Flask web application. The main contributions are the domain-specific adaptation of MPNet via Knowledge Distillation and the series identification feature. In quantitative evaluation, the fine-tuned model outperformed the baseline, reaching an MRR@10 of 0.5857, a MAP@10 of 0.5149, and a Hit Rate@3 of 0.90. User Acceptance Testing (UAT) with 15 respondents yielded an acceptance score of 92.89%. Limitations include the Steam-only dataset, potential information loss from TF-IDF keyword extraction, and the small UAT sample size. The results indicate that fine-tuned Transformer embeddings in a content-based framework, adapted through Knowledge Distillation, can produce accurate and well-received game recommendations, and that context-aware features such as series identification improve them further.
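For readers unfamiliar with the ranking metrics reported above, the following is a minimal sketch of how MRR@k, MAP@k, and Hit Rate@k are commonly computed. It is not the evaluation code used in this study: the function names and the toy queries are hypothetical, and the MAP normalization (dividing by min(number of relevant items, k)) is one common convention that may differ from the one the study applied.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Return 1/rank of the first relevant item within the top k, or 0.0 if none."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Average of precision values taken at each relevant position in the top k.

    Assumption: normalized by min(len(relevant), k); other conventions exist.
    """
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 3) -> float:
    """Return 1.0 if at least one relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def evaluate(queries: List[Tuple[Sequence[str], Set[str]]]) -> dict:
    """Average the per-query metrics over all (ranked list, relevant set) pairs."""
    n = len(queries)
    return {
        "MRR@10": sum(reciprocal_rank_at_k(r, rel, 10) for r, rel in queries) / n,
        "MAP@10": sum(average_precision_at_k(r, rel, 10) for r, rel in queries) / n,
        "HitRate@3": sum(hit_at_k(r, rel, 3) for r, rel in queries) / n,
    }


# Toy data for illustration only (not taken from the study).
toy_queries = [
    (["a", "b", "c", "d"], {"b", "d"}),
    (["x", "y", "z"], {"x"}),
    (["p", "q", "r", "s"], {"s"}),
]
toy_scores = evaluate(toy_queries)
```

Each metric is averaged over queries, so a Hit Rate@3 of 0.90 would mean that, for 90% of the evaluated queries, at least one relevant game appeared among the first three recommendations.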
Copyrights © 2025