Regional languages are vital for communication and for preserving cultural identity and local heritage. However, globalization and modernization endanger them, as national or global languages increasingly take their place. Although dialect recognition research has progressed, particularly for certain languages, further work is needed to improve model performance and to cover less-represented dialects, including those of Indonesia. This study enriches a custom-built dataset for dialect recognition with three data augmentation techniques: noise addition, time stretching, and pitch shifting. Using Mel-frequency cepstral coefficients (MFCCs) for feature extraction, it compares a convolutional neural network (CNN) and a multilayer perceptron (MLP) in classifying six Indonesian dialects. The CNN outperformed the MLP, achieving 97.92% accuracy, 97.90% recall, 97.97% precision, a 97.92% F1-score, and a kappa score of 97.49% when the augmentation techniques were combined, providing a foundation for further research.
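For readers less familiar with the kappa score reported alongside accuracy, the following is a minimal sketch of how the two metrics can be computed from true and predicted labels. It is not the study's evaluation code: the function name `accuracy_and_kappa` and the dialect labels are hypothetical, and it assumes that the reported kappa is Cohen's kappa.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(y_true)

    # Observed agreement: share of samples whose prediction matches the label.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, (share among true labels) times
    # (share among predictions), summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Degenerate case (a single class everywhere); kappa is undefined,
        # and returning 1.0 is a convention chosen for this sketch only.
        return p_o, 1.0

    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


if __name__ == "__main__":
    # Placeholder labels; the six dialects are not named in this abstract.
    y_true = ["dialect_A", "dialect_A", "dialect_B", "dialect_B"]
    y_pred = ["dialect_A", "dialect_A", "dialect_B", "dialect_A"]
    acc, kappa = accuracy_and_kappa(y_true, y_pred)
    print(f"accuracy={acc:.2f}, kappa={kappa:.2f}")
```

Kappa discounts the agreement expected by chance, so it is typically somewhat lower than raw accuracy, as with the 97.49% kappa reported next to 97.92% accuracy.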
Copyrights © 2025