JISIP: Jurnal Ilmu Sosial dan Pendidikan
Vol 5, No 4 (2021): JISIP (Jurnal Ilmu Sosial dan Pendidikan)

Deep Neural Network for Speaker Identification Using Static and Dynamic Prosodic Feature for Spontaneous and Dictated Data

Rahman, Arifan
Wibowo, Wahyu Catur

Article Info

Publish Date
02 Nov 2021

Abstract

A person can be recognized by voice alone, because in principle each person's voice has a distinct tone (pitch). This study measures the performance of a Deep Neural Network (DNN) for speaker identification using static and dynamic prosodic features. Prosody is information about speech related to the tone, intonation, stress, duration, and rhythm of a person's pronunciation. The data consist of dictated and spontaneous speech taken from YouTube, covering three male voices and one female voice. The recordings are segmented into durations of 3, 5, and 10 seconds. From each segment, static prosodic features with 103 dimensions and dynamic prosodic features with 13 dimensions are extracted. Each feature set, and the combination of both, is used to train and test a DNN with a 90:10 train-to-test ratio. The results show that the 10-second segments yield higher accuracy than the 3-second and 5-second segments, and that static prosodic features are more accurate than dynamic prosodic features. The average accuracy of the DNN is 87.02% for static prosodic features, 72.97% for dynamic prosodic features, and 87.72% for the combined static and dynamic prosodic features.
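
The data preparation steps named in the abstract (fixed-duration segmentation, feature combination, and a 90:10 split) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 16 kHz sample rate, the dropping of a trailing partial segment, the use of plain concatenation to combine the 103-dimensional and 13-dimensional vectors, and the random shuffle before splitting are all assumptions made for illustration. Feature extraction and the DNN itself are omitted.

```python
import random


def segment_signal(samples, sample_rate, seconds):
    """Split a 1-D sample sequence into non-overlapping fixed-length segments.

    Assumption: a trailing remainder shorter than one segment is dropped;
    the abstract does not say how leftovers were handled.
    """
    seg_len = int(sample_rate * seconds)
    if seg_len <= 0:
        raise ValueError("segment length must be positive")
    n_full = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]


def combine_features(static_vec, dynamic_vec):
    """Concatenate static and dynamic feature vectors.

    Assumption: "combined" features are a simple concatenation.
    """
    return list(static_vec) + list(dynamic_vec)


def train_test_split_90_10(items, seed=0):
    """Shuffle items and split them into 90% training and 10% test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * 0.10))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Hypothetical input: 60 seconds of silence at an assumed 16 kHz rate.
    sample_rate = 16000
    signal = [0.0] * (sample_rate * 60)
    for seconds in (3, 5, 10):
        segments = segment_signal(signal, sample_rate, seconds)
        train, test = train_test_split_90_10(segments)
        print(seconds, len(segments), len(train), len(test))

    # Placeholder vectors with the dimensions stated in the abstract.
    combined = combine_features([0.0] * 103, [0.0] * 13)
    print(len(combined))
```

Longer segments produce fewer examples from the same recording, so any comparison across the three durations works with different training-set sizes.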

Copyrights © 2021

Journal Info

Abbrev

JISIP

Publisher

Subject

Arts; Humanities; Education; Law, Crime, Criminology & Criminal Justice; Social Sciences

Description

Jurnal Ilmiah Ilmu Sosial dan Pendidikan is a collection of scientific articles in the social sciences and education, based on research results and literature reviews. The journal is written in Indonesian. It is published 3 times a year, in March, July, and ...