Armein Z.R. Langi
Research Center on Information and Communication Technology, Information Technology RG, School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Jalan Ganeca 10, Bandung, 40116, Indonesia

Published: 2 Documents

Application of Wavelet LPC Excitation Model for Speech Compression Armein Z.R. Langi
Journal of Engineering and Technological Sciences Vol. 40 No. 1 (2008)
Publisher : Institute for Research and Community Services, Institut Teknologi Bandung

DOI: 10.5614/itbj.eng.sci.2008.40.1.1

Abstract

This paper presents an application of linear predictive coding (LPC) excitation wavelet models for low bit-rate, high-quality speech compression. The compression scheme exploits the model properties, especially magnitude-dependent sensitivity, scale-dependent sensitivity, and limited frame length. We use the wavelet model in an open-loop, dither-based codebook scheme. With this approach, the compression yields a signal-to-noise ratio of at least 11 dB at rates of 5 kbit/s and.
An LPC Excitation Model using Wavelets Armein Z.R. Langi
Journal of Engineering and Technological Sciences Vol. 40 No. 2 (2008)
Publisher : Institute for Research and Community Services, Institut Teknologi Bandung

DOI: 10.5614/itbj.eng.sci.2008.40.2.1

Abstract

This paper presents a new model of linear predictive coding (LPC) excitation using wavelets for speech signals. The LPC excitation becomes a linear combination of a set of self-similar, orthonormal, band-pass signals with time localization and constant bandwidth on a logarithmic scale. Thus, the set of coefficients in the linear combination represents the LPC excitation. The discrete wavelet transform (DWT) obtains the coefficients, which have several asymmetrical and non-uniform distribution properties that are attractive for speech processing and compression. The properties include magnitude-dependent sensitivity, scale-dependent sensitivity, and limited frame length, which can be exploited for low bit-rate speech coding. We show that eliminating the 8.97% highest-magnitude coefficients degrades speech quality down to 1.49 dB SNR, while eliminating the 27.51% lowest-magnitude coefficients maintains speech quality at a level of 27.42 dB SNR. Furthermore, eliminating the 6.25% of coefficients located at a scale associated with the 175-630 Hz band severely degrades speech quality down to 4.20 dB SNR. Finally, our results show that the optimal frame length for telephony applications is one of 32, 64, or 128 samples.
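
The magnitude-dependent sensitivity described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: an orthonormal Haar wavelet stands in for whichever wavelet the author used, a synthetic Laplacian noise frame stands in for a real LPC excitation frame, and all function names (`haar_dwt`, `haar_idwt`, `drop_by_magnitude`, `snr_db`) are hypothetical. The SNR values it produces will therefore not match the figures quoted above; only the qualitative contrast is meant to carry over.

```python
import numpy as np


def haar_dwt(x, levels):
    """Orthonormal Haar DWT. Returns [approx, coarsest detail, ..., finest detail]."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # band-pass (detail) part
        approx = (even + odd) / np.sqrt(2.0)         # low-pass (approximation) part
    return [approx] + details[::-1]


def haar_idwt(coeffs):
    """Inverse of haar_dwt: rebuilds the frame from coarsest to finest scale."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx


def drop_by_magnitude(coeffs, fraction, largest):
    """Zero the given fraction of coefficients, largest or smallest in magnitude."""
    flat = np.concatenate(coeffs)
    n_drop = int(round(fraction * flat.size))
    kept = flat.copy()
    if n_drop > 0:
        order = np.argsort(np.abs(flat))  # indices in ascending magnitude
        idx = order[-n_drop:] if largest else order[:n_drop]
        kept[idx] = 0.0
    sizes = [c.size for c in coeffs]
    return np.split(kept, np.cumsum(sizes)[:-1])


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB of estimate relative to reference."""
    noise = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10(np.sum(reference ** 2) / noise)


# Assumption: a 64-sample synthetic frame, not real speech excitation.
rng = np.random.default_rng(0)
frame = rng.laplace(size=64)
coeffs = haar_dwt(frame, levels=3)

# Fractions borrowed from the abstract purely for illustration.
rec_large_dropped = haar_idwt(drop_by_magnitude(coeffs, 0.0897, largest=True))
rec_small_dropped = haar_idwt(drop_by_magnitude(coeffs, 0.2751, largest=False))

snr_large_dropped = snr_db(frame, rec_large_dropped)
snr_small_dropped = snr_db(frame, rec_small_dropped)
print(f"drop largest 8.97%:   {snr_large_dropped:.2f} dB")
print(f"drop smallest 27.51%: {snr_small_dropped:.2f} dB")
```

Because the transform is orthonormal, the reconstruction error energy equals the energy of the zeroed coefficients, which is why discarding a few large coefficients costs far more SNR than discarding many small ones.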