Background: Previous systems for translating the Lampung dialect of nyo into Indonesian achieved Bilingual Evaluation Understudy (BLEU) scores below 40%, primarily because of difficulties in processing affixed words. Objective: This research aims to stem affixed words in the Lampung dialect of nyo in order to improve the performance of the translation system. Methods: We developed an n-gram stemming approach that reduces affixed words to their base forms by measuring n-gram similarity with the Dice coefficient. When the similarity exceeds a specified threshold, the system identifies the corresponding base word. Results: Using a dataset of 700 words from the Lampung dialect of nyo, we constructed a comprehensive stemmer covering all affix variations. The optimal threshold was found to be 0.5, at which bigrams achieved an accuracy of 93.86% and trigrams 89.14%. These accuracy levels demonstrate the method's effectiveness in identifying base word forms, which bears directly on translation quality. Conclusion: N-gram stemming with a 0.5 threshold effectively handles the morphology of the Lampung dialect of nyo and shows potential for improving translation accuracy. This work presents the first comprehensive stemming system designed specifically for the Lampung dialect of nyo, contributing to the development of natural language processing tools for underrepresented regional languages in Indonesia.
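The matching step described in the Methods can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: n-grams are taken over characters without padding, each word is represented by its set of unique n-grams, the Dice coefficient is computed as 2|X ∩ Y| / (|X| + |Y|), and a word whose best score does not exceed the threshold is returned unchanged. The function names and the tiny lexicon are hypothetical; the example words are common Indonesian words chosen for illustration and are not drawn from the 700-word Lampung dataset.

```python
def ngrams(word, n=2):
    """Return the set of character n-grams of a word (no padding; an assumption)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice(a, b, n=2):
    """Dice coefficient between the n-gram sets of two words."""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x or not y:
        # A word shorter than n has no n-grams, so treat similarity as zero.
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def stem(word, base_words, n=2, threshold=0.5):
    """Return the most similar base word if its score exceeds the threshold.

    Falling back to the input word when nothing exceeds the threshold is an
    assumption of this sketch, not a behaviour described in the abstract.
    """
    best, best_score = None, 0.0
    for base in base_words:
        score = dice(word, base, n)
        if score > best_score:
            best, best_score = base, score
    return best if best_score > threshold else word


# Hypothetical lexicon of base words (Indonesian, for illustration only).
lexicon = ["makan", "minum", "tulis"]

print(dice("dimakan", "makan", n=2))   # bigram similarity
print(dice("dimakan", "makan", n=3))   # trigram similarity
print(stem("dimakan", lexicon))        # best base word above the 0.5 threshold
```

In this sketch the prefixed form shares all of the base word's bigrams, so the score is well above 0.5 and the base word is returned. A lower threshold would accept more distant matches, while a higher one would leave more affixed words unstemmed.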
Copyrights © 2026