One task in forensic linguistics, and in forensic phonetics in particular, is evaluating the speech sounds in recordings. The evaluation aims to identify and verify speakers, that is, to predict whether the sounds were spoken by the suspect. A common problem in this task is determining which acoustic features of speech sounds are reliable for speaker identification and verification. The purpose of this research is to study formant frequencies for predicting the high vowels /i/ and /u/ using an artificial neural network (ANN). Using three normalization methods (softmax, z-score, and sigmoid), we trained a multilayer perceptron with backpropagation in three architectures (input-hidden-output nodes): 4-5-2, 4-10-2, and 4-20-2. The results show that z-score normalization yields higher accuracy than the other two methods in all three architectures, and that the 4-10-2 architecture achieves the highest accuracy (92.26%).
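The setup described above can be illustrated with a minimal sketch. It is not the study's implementation and rests on several assumptions: the four inputs are taken to be the formants F1 to F4, the two outputs to be the classes /i/ and /u/, the sigmoid and softmax normalizations are applied to z-scored values (one possible reading, since the exact definitions are not given here), and the formant values are rough placeholders rather than measurements from the study. All function names are hypothetical. Only the forward pass is shown; backpropagation training is omitted.

```python
import math
import random


def z_score(column):
    """Standardize values to zero mean and unit (population) variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]


def sigmoid_norm(column):
    """Assumed definition: logistic function applied to the z-scored values."""
    return [1.0 / (1.0 + math.exp(-z)) for z in z_score(column)]


def softmax_norm(column):
    """Assumed definition: softmax over the z-scored values of one feature."""
    z = z_score(column)
    peak = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))


def init_mlp(n_in, n_hidden, n_out, seed=0):
    """Create an untrained one-hidden-layer network with small random weights."""
    rng = random.Random(seed)

    def layer(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": layer(n_hidden, n_in), "b1": [0.0] * n_hidden,
        "w2": layer(n_out, n_hidden), "b2": [0.0] * n_out,
    }


def forward(net, x):
    """Forward pass: input -> hidden (logistic) -> output (logistic)."""
    hidden = [logistic(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = [logistic(sum(w * h for w, h in zip(row, hidden)) + b)
              for row, b in zip(net["w2"], net["b2"])]
    return hidden, output


# Placeholder formant values in Hz (F1, F2, F3, F4); NOT data from the study.
samples = [
    [270.0, 2290.0, 3010.0, 3500.0],  # /i/-like
    [300.0, 870.0, 2240.0, 3300.0],   # /u/-like
    [280.0, 2250.0, 2950.0, 3600.0],  # /i/-like
    [310.0, 900.0, 2300.0, 3350.0],   # /u/-like
]

# Normalize each formant (column) separately, then restore the row layout.
columns = [z_score(list(col)) for col in zip(*samples)]
normalized = [list(row) for row in zip(*columns)]

# Build the three architectures named in the abstract and run one sample.
for n_hidden in (5, 10, 20):
    net = init_mlp(4, n_hidden, 2)
    hidden, output = forward(net, normalized[0])
    print(f"4-{n_hidden}-2: hidden={len(hidden)}, output={len(output)}")
```

Swapping `z_score` for `sigmoid_norm` or `softmax_norm` in the column step changes only the preprocessing, which is the kind of comparison the abstract reports across the three architectures.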
Copyrights © 2024