Heriyanto Heriyanto
Prodi Teknik Informatika Universitas Pembangunan Nasional "Veteran" Yogyakarta

Published: 23 Documents

Implementation of Mel-Frequency Cepstral Coefficient As Feature Extraction Method On Speech Audio Data
Marbun, Andre Julio; Heriyanto; Kodong, Frans Richard
Telematika Vol 21 No 3 (2024): Edisi Oktober 2024
Publisher : Jurusan Informatika

DOI: 10.31315/telematika.v21i3.12339

Abstract

Machines cannot process sound directly; a feature extraction step must come first. Many feature extraction methods are available, so choosing the right one is not easy. One method often used on sound signals is the Mel-Frequency Cepstral Coefficient (MFCC). Its working principle resembles the human hearing system, which is why it is widely used in recognition tasks based on sound signals. This research uses MFCC to extract features from voice signals and a Support Vector Machine (SVM) to classify emotions on the RAVDESS dataset. MFCC consists of several stages: Pre-Emphasis, Frame Blocking, Windowing, Fast Fourier Transform, Mel-Scaled Filterbank, Discrete Cosine Transform, and Cepstral Liftering. The test design is parameter tuning, which aims to find the parameters that give the best accuracy in the machine learning model. The tuned parameters are the α value in Pre-Emphasis, the frame length and overlap length in Frame Blocking, the number of mel filters in the Mel-Scaled Filterbank, the number of cepstral coefficients in the Discrete Cosine Transform, and the C value in SVM. For male voices, the best accuracy of 85.71% was obtained with a pre-emphasis filter coefficient of 0.95, a frame length of 0.023 ms, a 40% overlap between adjacent frames, 24 mel filters, 24 cepstral coefficients, and C = 0.01. For female voices, the best accuracy of 92.21% was obtained with a pre-emphasis filter coefficient of 0.95, a frame length of 0.023 ms, a 40% overlap between adjacent frames, 24 mel filters, 13 cepstral coefficients, and C = 0.01. The two tuning results share the same value for every parameter except the number of cepstral coefficients: 24 for men and 13 for women. The study concludes that the combination of MFCC and SVM can be used for emotion classification from voice intonation, with an accuracy of 85.71% for men and 92.21% for women. The difference in accuracy between the male and female models is due to the different data used: male models are trained on male voice data and female models on female voice data, because men and women have different voice frequency ranges.
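The seven MFCC stages named in this abstract can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `mfcc`, the Hamming window, the lifter value of 22, the orthonormal DCT-II, and the 48 kHz synthetic test tone are all assumptions. The reported frame length of "0.023 ms" is read here as 0.023 s, which is also an assumption. The SVM classification step is not shown.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mfcc(signal, sample_rate, alpha=0.95, frame_sec=0.023, overlap=0.40,
         n_mels=24, n_ceps=13, lifter=22):
    """Hypothetical helper: returns a (frames, n_ceps) MFCC matrix."""
    signal = np.asarray(signal, dtype=float)
    # 1. Pre-emphasis: y[n] = x[n] - alpha * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # 2. Frame blocking with the given overlap between adjacent frames
    frame_len = int(round(frame_sec * sample_rate))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    if len(emphasized) < frame_len:  # zero-pad very short signals
        emphasized = np.pad(emphasized, (0, frame_len - len(emphasized)))
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # 3. Windowing (Hamming window is an assumption)
    frames = frames * np.hamming(frame_len)
    # 4. Fast Fourier Transform -> power spectrum
    n_fft = 1 << (frame_len - 1).bit_length()  # next power of two
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    # 5. Mel-scaled triangular filterbank
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / (right - center)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # 6. Discrete Cosine Transform (orthonormal DCT-II), keep n_ceps coefficients
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * np.arange(n_mels) + 1)
                   / (2.0 * n_mels))
    basis *= np.sqrt(2.0 / n_mels)
    basis[0] /= np.sqrt(2.0)
    ceps = log_energy @ basis.T
    # 7. Cepstral liftering (sinusoidal lifter)
    lift = 1.0 + (lifter / 2.0) * np.sin(np.pi * np.arange(n_ceps) / lifter)
    return ceps * lift

# Synthetic one-second 220 Hz tone standing in for a speech clip (assumption).
sr = 48000
tone = np.sin(2 * np.pi * 220.0 * np.arange(sr) / sr)
features = mfcc(tone, sr, n_ceps=13)       # female-model setting from the abstract
features_24 = mfcc(tone, sr, n_ceps=24)    # male-model setting from the abstract
print(features.shape, features_24.shape)
```

The defaults mirror the best parameters reported above (α = 0.95, 40% overlap, 24 mel filters); only the number of cepstral coefficients differs between the two calls.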
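PENDAMPINGAN UMKM KWT SUKA MAJU UNTUK MENINGKATKAN PRODUKSI DAN PEREKONOMIAN MASYARAKAT DUSUN PALIHAN [Mentoring the KWT Suka Maju MSME to Increase Production and the Economy of the Palihan Hamlet Community]
Heriyanto, Heriyanto; Fauziah, Yuli; Irawati, Dyah Ayu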
Dharma: Jurnal Pengabdian Masyarakat Vol. 1 No. 2 (2020): November
Publisher : Universitas Pembangunan Nasional "Veteran" Yogyakarta

DOI: 10.31315/dlppm.v1i2.4043

Abstract

The SUKAMAJU Women Farmers Group (KWT) is a group of women artisans who process banana trees into food products. During the Covid-19 pandemic, sales and marketing of processed banana foods were very limited, so online marketing was urgently needed and required support. The community service program from UPN Veteran Yogyakarta was designed to help solve these problems. Marketing through the internet and social media was much needed, but the members' ability to use social media and the internet was very limited. The service team went into the field to provide full mentoring as well as production equipment, so that the food-processing artisans could maintain production at KWT. The team hopes that full assistance with marketing media and additional production equipment will increase sales, with an average increase of 8-9 pieces per day.
Implementation of Mel-Frequency Cepstral Coefficient as Feature Extraction using K-Nearest Neighbor for Emotion Detection Based on Voice Intonation
Nawasta, Revanto Alif; Cahyana, Nur Heri; Heriyanto, Heriyanto
Telematika Vol 20 No 1 (2023): Edisi Februari 2023
Publisher : Jurusan Informatika

DOI: 10.31315/telematika.v20i1.9518

Abstract

Purpose: To determine emotions from voice intonation by implementing MFCC as the feature extraction method and KNN as the emotion detection method.
Design/methodology/approach: The data used in this study was downloaded from several video podcasts on YouTube. The methods used are pitch shifting for data augmentation, MFCC for feature extraction on the audio data, basic statistics (mean, median, min, max, and standard deviation of each coefficient), a min-max scaler for normalization, and KNN for classification.
Findings/result: Because testing is carried out separately for each gender, there are two classification models. The male model reached a highest accuracy of 88.8% and is considered a good fit. The female model reached a highest accuracy of 92.5% but was unable to classify emotions correctly on new data, a condition known as overfitting. Testing showed the cause: pitch-shifting augmentation by one tone could not compensate for female training data that was too small and did not contain enough samples to represent all possible input values accurately.
Originality/value/state of the art: The research data has not been used in previous studies, because it was obtained by downloading from YouTube and then processed until it was ready to be used for research.
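The steps between MFCC extraction and classification described in this abstract (per-coefficient statistics, min-max scaling, KNN) can be sketched as below. This is a rough illustration under stated assumptions, not the authors' code: the helper names `summarize`, `min_max_fit`, and `knn_predict`, the Euclidean distance, k = 3, the emotion labels, and the random matrices standing in for real MFCC frames are all hypothetical. Pitch-shifting augmentation is not shown.

```python
import numpy as np

def summarize(mfcc_frames):
    """Collapse a (frames, coefficients) matrix into one fixed-length vector
    of mean, median, min, max, and standard deviation per coefficient."""
    m = np.asarray(mfcc_frames, dtype=float)
    stats = [m.mean(axis=0), np.median(m, axis=0),
             m.min(axis=0), m.max(axis=0), m.std(axis=0)]
    return np.concatenate(stats)

def min_max_fit(train):
    """Return per-feature minimum and range learned from the training set."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span[span == 0.0] = 1.0  # avoid division by zero for constant features
    return lo, span

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    dist = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dist)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data: random matrices stand in for the MFCC frames of six clips (assumption).
rng = np.random.default_rng(0)
clips = [rng.normal(loc=c, scale=0.3, size=(40, 13))
         for c in (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)]
labels = np.array(["calm", "calm", "calm", "angry", "angry", "angry"])

train_x = np.stack([summarize(c) for c in clips])   # 13 coefficients x 5 statistics
lo, span = min_max_fit(train_x)
train_scaled = (train_x - lo) / span

# Scale the new clip with the statistics learned from the training set only.
query = summarize(rng.normal(loc=2.0, scale=0.3, size=(40, 13)))
pred = knn_predict(train_scaled, labels, (query - lo) / span, k=3)
print(pred)
```

Fitting the scaler on training data only, and reusing it for new clips, keeps the evaluation on unseen data honest, which matters for diagnosing the overfitting the abstract reports.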