Javanese, a regional language of Indonesia spoken by over 100 million people, is classified as a low-resource language: the scarcity of linguistic resources and speech data makes it difficult to build effective speech recognition systems. Noise is a further factor that degrades recognition performance. This study develops a syllable-based speech recognition model for Javanese, using Mel Frequency Cepstral Coefficients (MFCC) for audio feature extraction and a Convolutional Neural Network (CNN) for classification. It also analyzes how three types of colored noise (white Gaussian, pink, and brown), added to the audio, affect the model's accuracy. The proposed method reached a peak accuracy of 81% on the original audio (audio without any synthetic noise added). On noisy audio, accuracy improves as the noise level decreases, that is, as the signal-to-noise ratio (SNR) rises. Interestingly, with brown noise at an SNR of 20 dB, accuracy rises slightly to 83%, a relative improvement of 2.47% (2 percentage points) over the original audio. These results indicate that the proposed syllable-based method is a promising approach for real-world Javanese speech recognition, and the slight accuracy gain under noisy conditions suggests a possible regularization effect.
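The noise conditions described above can be illustrated with a short sketch. The abstract does not say how the colored noise was generated or mixed, so everything below is an assumption: the noise is produced by shaping the spectrum of white Gaussian noise to a 1/f^exponent power law (exponent 0 for white, 1 for pink, 2 for brown), and the hypothetical helpers `colored_noise` and `add_noise_at_snr` scale it so that the mixture reaches a requested SNR in dB. A synthetic tone stands in for a speech recording.

```python
import numpy as np


def colored_noise(n_samples, exponent, rng):
    """Return unit-variance noise whose power spectrum falls off as 1/f**exponent.

    exponent = 0 gives white, 1 gives pink, 2 gives brown noise.
    Spectral shaping is an assumption; the study may have used another method.
    """
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)
    # Amplitude scales as f**(-exponent / 2) so that power scales as f**(-exponent).
    scale = np.zeros_like(freqs)          # DC bin stays 0, giving zero-mean noise
    scale[1:] = freqs[1:] ** (-exponent / 2.0)
    noise = np.fft.irfft(spectrum * scale, n=n_samples)
    return noise / np.std(noise)


def add_noise_at_snr(signal, noise, snr_db):
    """Mix noise into signal so that the result has the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR_dB = 10 * log10(signal_power / target_noise_power)
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return signal + noise * np.sqrt(target_noise_power / noise_power)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_rate = 16000                   # assumed sampling rate, not from the study
    t = np.arange(sample_rate) / sample_rate
    clean = 0.5 * np.sin(2 * np.pi * 220.0 * t)   # stand-in for a speech clip

    for name, exponent in [("white", 0), ("pink", 1), ("brown", 2)]:
        noisy = add_noise_at_snr(clean, colored_noise(clean.size, exponent, rng), 20.0)
        measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
        print(f"{name}: measured SNR = {measured:.2f} dB")
```

Because the scaling is computed from the measured powers of the two signals, the same helper covers any SNR level and any of the three noise colors; MFCC extraction and the CNN classifier would then operate on the returned noisy waveform.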
Copyrights © 2025