Yusnita Mohd Ali
Universiti Teknologi MARA, Cawangan Pulau Pinang

Published: 2 Documents

Fuzzy-based voiced-unvoiced segmentation for emotion recognition using spectral feature fusions
Yusnita Mohd Ali; Alhan Farhanah Abd Rahim; Emilia Noorsal; Zuhaila Mat Yassin; Nor Fadzilah Mokhtar; Mohamad Helmy Ramlan
Indonesian Journal of Electrical Engineering and Computer Science Vol 19, No 1: July 2020
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v19.i1.pp196-206

Abstract

Despite abundant growth in studies of automatic emotion recognition systems (ERS) using various feature-extraction techniques and classifiers, few sources have sought to improve such systems via pre-processing. This paper proposes a smart pre-processing stage that uses a fuzzy logic inference system (FIS) based on the Mamdani engine and simple time-based features, i.e. zero-crossing rate (ZCR) and short-time energy (STE), to first label each frame as voiced (V) or unvoiced (UV). Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC) were tested with K-nearest neighbours (KNN) classifiers to evaluate the proposed FIS V-UV segmentation. We also introduced two feature fusions of MFCC and LPC with formants to obtain better performance. Experimentally, the proposed system surpassed the conventional ERS, with accuracy gains ranging from 3.7% to 9.0%. The fusion of LPC and formants, named SFF LPC-fmnt, showed promising results, with accuracy rates 1.3% to 5.1% higher than its baseline features in classifying neutral, angry, happy, and sad emotions. The best accuracy rates were 79.1% for male speakers and 79.9% for female speakers, using the SFF MFCC-fmnt fusion technique.
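As a rough illustration of the pre-processing stage described above, the sketch below computes per-frame ZCR and STE and feeds them to a Mamdani-type FIS built with scikit-fuzzy. The frame sizes, membership functions, rule set, and 0.5 decision threshold are illustrative assumptions, not the parameters reported in the paper.

```python
import numpy as np
import skfuzzy as fuzz
from skfuzzy import control as ctrl

def frame_features(signal, frame_len=400, hop=160):
    """Per-frame zero-crossing rate and short-time energy, both scaled to [0, 1]."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    zcr = np.empty(n_frames)
    ste = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len]
        zcr[i] = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
        ste[i] = np.sum(frame.astype(float) ** 2)
    ste /= ste.max() + 1e-12  # normalise energy so both inputs share [0, 1]
    return zcr, ste

# Mamdani FIS: voiced frames tend to show high energy and low ZCR,
# unvoiced frames the opposite. Membership shapes and rules are assumptions.
zcr_in = ctrl.Antecedent(np.linspace(0, 1, 101), 'zcr')
ste_in = ctrl.Antecedent(np.linspace(0, 1, 101), 'ste')
voicing = ctrl.Consequent(np.linspace(0, 1, 101), 'voicing')

zcr_in['low'] = fuzz.trimf(zcr_in.universe, [0.0, 0.0, 0.4])
zcr_in['high'] = fuzz.trimf(zcr_in.universe, [0.3, 1.0, 1.0])
ste_in['low'] = fuzz.trimf(ste_in.universe, [0.0, 0.0, 0.3])
ste_in['high'] = fuzz.trimf(ste_in.universe, [0.2, 1.0, 1.0])
voicing['unvoiced'] = fuzz.trimf(voicing.universe, [0.0, 0.0, 0.5])
voicing['voiced'] = fuzz.trimf(voicing.universe, [0.5, 1.0, 1.0])

fis = ctrl.ControlSystemSimulation(ctrl.ControlSystem([
    ctrl.Rule(ste_in['high'] & zcr_in['low'], voicing['voiced']),
    ctrl.Rule(ste_in['low'] | zcr_in['high'], voicing['unvoiced']),
]))

def is_voiced(zcr_val, ste_val, threshold=0.5):
    """Defuzzified voicing score above the threshold labels the frame voiced (V)."""
    fis.input['zcr'] = float(zcr_val)
    fis.input['ste'] = float(ste_val)
    fis.compute()
    return fis.output['voicing'] > threshold
```

Frames labelled voiced by `is_voiced` would then proceed to MFCC/LPC extraction (optionally fused with formants) and KNN classification, per the abstract.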
Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients
Yusnita Mohd Ali; Emilia Noorsal; Nor Fadzilah Mokhtar; Siti Zubaidah Md Saad; Mohd Hanapiah Abdullah; Lim Chee Chin
Indonesian Journal of Electrical Engineering and Computer Science Vol 28, No 2: November 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v28.i2.pp753-761

Abstract

Gender discrimination and awareness are practiced in social, educational, workplace, and economic sectors across the globe. A person manifests this attribute naturally in gait, body gestures, and facial features, as well as in speech. For that reason, automatic gender recognition (AGR) has become an interesting sub-topic of speech recognition found in many speech technology applications. However, retrieving salient gender-related information from a speech signal is challenging because speech carries abundant information beyond gender. This paper compares the performance of a human vocal-tract-based model, i.e. linear prediction coefficients (LPC), and a human auditory-based model, i.e. Mel-frequency cepstral coefficients (MFCC), both popular in other speech recognition tasks, by experimenting with the feature parameters and classifier parameters. The audio data used in this study were obtained from 93 speakers uttering selected words with different vowels. The two feature vectors were tested with two classification algorithms, namely discriminant analysis (DA) and artificial neural network (ANN). Although both feature sets produced promising results, the best overall accuracy rate of 97.07% was recorded using the MFCC-ANN combination, with nearly equal performance for the male and female classes.
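The comparison described in the abstract can be sketched roughly as follows, using librosa for the two feature front-ends and scikit-learn for the two classifier families. The coefficient orders, network size, and train/test split here are illustrative assumptions, not the optimal parameters found in the study.

```python
import numpy as np
import librosa
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def mfcc_vector(y, sr, n_mfcc=13):
    """Utterance-level MFCC feature: mean of the per-frame coefficients."""
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

def lpc_vector(y, order=12):
    """LPC coefficients a_1..a_p of the utterance (dropping the leading 1)."""
    return librosa.lpc(y, order=order)[1:]

def compare_classifiers(X, labels):
    """Fit DA and a small ANN on the same split and print test accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.3, stratify=labels, random_state=0)
    models = [
        ('DA', LinearDiscriminantAnalysis()),
        ('ANN', MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                              random_state=0)),
    ]
    for name, clf in models:
        clf.fit(X_tr, y_tr)
        print(name, accuracy_score(y_te, clf.predict(X_te)))

# Usage: build one feature matrix per front-end from (waveform, sr, gender)
# triples, then run the same head-to-head comparison on each.
# X_mfcc = np.vstack([mfcc_vector(y, sr) for y, sr, _ in dataset])
# X_lpc  = np.vstack([lpc_vector(y) for y, _, _ in dataset])
# genders = [g for _, _, g in dataset]
# compare_classifiers(X_mfcc, genders); compare_classifiers(X_lpc, genders)
```

In the paper, MFCC with ANN gave the best overall accuracy (97.07%); a harness like this simply reproduces that kind of feature-versus-classifier comparison on any labelled set of utterances.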