Overlapped music segmentation using a new effective feature and random forests
Duraid Y. Mohammed; Khamis A. Al-Karawi; Philip Duncan; Francis F. Li
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 8, No 2: June 2019
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v8.i2.pp181-189

Abstract

In the field of audio classification, audio signals may be broadly divided into three classes: speech, music and events. Most studies, however, neglect that real audio soundtracks can contain any combination of these classes simultaneously. This can result in information loss, compromising knowledge discovery. In this study, a novel feature, “Entrocy”, is proposed for the detection of music both in pure form and overlapping with the other audio classes. Entrocy is defined as the variation of the information (or entropy) in an audio segment over time. Segments that contain music were found to have lower Entrocy, since they have fewer abrupt changes over time. We have also compared Entrocy with existing music detection features, and Entrocy showed good performance.
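To make the definition above concrete, the following is a minimal sketch of one way to measure how entropy varies over a segment. It is not the authors' implementation: the abstract does not say how entropy is computed or how its variation is summarised, so the use of spectral entropy per frame, the standard deviation as the variation measure, and the names `frame_entropy` and `entropy_variation` (with their frame and hop sizes) are all assumptions made for illustration.

```python
import numpy as np


def frame_entropy(signal, frame_len=512, hop=256):
    """Shannon entropy (in bits) of each frame's normalised power spectrum.

    Assumption: spectral entropy stands in for the "information" of a frame.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    entropies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Normalise the power spectrum so it behaves like a probability distribution.
        p = power / (power.sum() + 1e-12)
        entropies[i] = -np.sum(p * np.log2(p + 1e-12))
    return entropies


def entropy_variation(signal, frame_len=512, hop=256):
    """Assumed variation measure: standard deviation of the frame entropies."""
    return float(np.std(frame_entropy(signal, frame_len, hop)))


# Synthetic comparison: a steady tone versus a tone interrupted by noise bursts.
sr = 16000
t = np.arange(sr) / sr
steady = np.sin(2 * np.pi * 440 * t)

rng = np.random.default_rng(0)
abrupt = steady.copy()
block = sr // 10
for start in range(0, sr, 2 * block):
    abrupt[start:start + block] = rng.standard_normal(block)

steady_var = entropy_variation(steady)
abrupt_var = entropy_variation(abrupt)
```

On these synthetic signals the steady tone yields the smaller variation, which matches the abstract's observation that segments with fewer abrupt changes score lower.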
Robust speaker verification by combining MFCC and entrocy in noisy conditions
Duraid Y. Mohammed; Khamis Al-Karawi; Ahmed Aljuboori
Bulletin of Electrical Engineering and Informatics Vol 10, No 4: August 2021
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v10i4.2957

Abstract

Automatic speaker recognition can achieve remarkable performance when training and test conditions match. Conversely, results drop significantly under mismatched noisy conditions. Furthermore, feature extraction significantly affects performance. Mel-frequency cepstral coefficients (MFCCs) are the most commonly used features in this field of study. The literature has reported that the conditions for training and testing are highly correlated. Taken together, these facts support strong recommendations for using MFCC features in similar environmental conditions (train/test) for speaker recognition. However, when noise and reverberation are present, MFCC performance is not reliable. To address this, we propose a new feature, 'entrocy', for accurate and robust speaker recognition, which we mainly employ to support MFCCs in noisy environments. Entrocy is the Fourier transform of the entropy, a measure of the fluctuation of the information in sound segments over time. Entrocy features are combined with MFCCs to generate a composite feature set, which is tested using the Gaussian mixture model (GMM) speaker recognition method. The proposed method shows improved recognition accuracy over a range of signal-to-noise ratios.
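The feature construction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the published method: the abstract does not specify how many Fourier coefficients are kept, whether magnitudes are used, or how the two feature sets are joined. Here the names `entropy_spectrum` and `composite_features`, the mean removal, the choice of eight magnitude coefficients, and the frame-wise concatenation are all hypothetical. The MFCC matrix and the per-frame entropy trajectory are taken as given inputs, and the GMM scoring stage is left out.

```python
import numpy as np


def entropy_spectrum(entropies, n_coeffs=8):
    """Magnitudes of the first n_coeffs Fourier coefficients of an entropy trajectory.

    Assumption: the mean is removed first so a constant trajectory maps to zeros.
    """
    x = np.asarray(entropies, dtype=float)
    x = x - x.mean()
    spectrum = np.abs(np.fft.rfft(x))
    out = np.zeros(n_coeffs)
    k = min(n_coeffs, spectrum.size)
    out[:k] = spectrum[:k]  # zero-pad if the trajectory is very short
    return out


def composite_features(mfcc, entropies, n_coeffs=8):
    """Append the segment-level entropy spectrum to every MFCC frame.

    mfcc: array of shape (n_frames, n_mfcc), computed elsewhere.
    Assumption: simple column-wise concatenation forms the composite set.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    extra = np.tile(entropy_spectrum(entropies, n_coeffs), (mfcc.shape[0], 1))
    return np.hstack([mfcc, extra])


# Example with placeholder inputs: 10 frames of 13 MFCCs and a 32-point
# entropy trajectory that oscillates four times over the segment.
placeholder_mfcc = np.zeros((10, 13))
trajectory = np.sin(2 * np.pi * 4 * np.arange(32) / 32)
features = composite_features(placeholder_mfcc, trajectory)
```

The resulting matrix has one row per frame and could be passed to any GMM implementation for speaker modelling; how the original work weights or normalises the two feature groups is not stated in the abstract.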