Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Electrical and Computer Engineering

Marquette University

Electrical and Computer Engineering Faculty Research and Publications

Speech enhancement

Articles 1 - 5 of 5

Full-Text Articles in Engineering

Speech Enhancement Using Bayesian Estimators Of The Perceptually-Motivated Short-Time Spectral Amplitude (STSA) With Chi Speech Priors, Marek B. Trawicki, Michael T. Johnson Feb 2014

In this paper, the authors propose new perceptually-motivated Weighted Euclidean (WE) and Weighted Cosh (WCOSH) estimators that utilize more appropriate Chi statistical models for the speech prior with Gaussian statistical models for the noise likelihood. Whereas the perceptually-motivated WE and WCOSH cost functions emphasize spectral valleys rather than spectral peaks (formants) and indirectly account for auditory masking effects, incorporating the Chi statistical models yields a distinct improvement over the Rayleigh statistical models for the speech prior. The estimators incorporate weighting law parameters on the cost functions and shape parameters on the distributions. Performance is evaluated in terms of the …


Distributed Multichannel Speech Enhancement Based On Perceptually-Motivated Bayesian Estimators Of The Spectral Amplitude, Marek B. Trawicki, Michael T. Johnson Jun 2013

In this study, the authors propose multichannel weighted Euclidean (WE) and weighted cosh (WCOSH) cost function estimators for speech enhancement in the distributed microphone scenario. The goal of the work is to illustrate the advantages of utilising additional microphones and modified cost functions for improving signal-to-noise ratio (SNR) and segmental SNR (SSNR) along with log-likelihood ratio (LLR) and perceptual evaluation of speech quality (PESQ) objective metrics over the corresponding single-channel baseline estimators. As with their single-channel counterparts, the perceptually-motivated multichannel WE and WCOSH estimators are functions of a weighting law parameter, which influences attention of the noisy spectral amplitude through …


Distributed Multichannel Speech Enhancement With Minimum Mean-Square Error Short-Time Spectral Amplitude, Log-Spectral Amplitude, And Spectral Phase Estimation, Marek B. Trawicki, Michael T. Johnson Feb 2012

In this paper, the authors present optimal multichannel frequency domain estimators for minimum mean-square error (MMSE) short-time spectral amplitude (STSA), log-spectral amplitude (LSA), and spectral phase estimation in a widely distributed microphone configuration. The estimators utilize Rayleigh and Gaussian statistical models for the speech prior and noise likelihood with a diffuse noise field for the surrounding environment. Based on the Signal-to-Noise Ratio (SNR) and Segmental Signal-to-Noise Ratio (SSNR) along with the Log-Likelihood Ratio (LLR) and Perceptual Evaluation of Speech Quality (PESQ) as objective metrics, the multichannel LSA estimator decreases background noise and speech distortion and increases speech quality compared to …
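
As background on two of the objective metrics named in the abstract above, SNR and segmental SNR can be computed roughly as follows. This is a minimal sketch, not the authors' evaluation code: the function names, the 256-sample frame length, and the [-10, 35] dB clamping range are assumptions chosen for illustration.

```python
import numpy as np


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    clean = np.asarray(clean, dtype=float)
    error = clean - np.asarray(enhanced, dtype=float)
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(error ** 2))


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    frame_len, lo, and hi are illustrative assumptions, not values
    taken from the paper.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = len(clean) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    eps = 1e-12  # guards against log(0) and division by zero in silent frames
    frame_snrs = []
    for i in range(n_frames):
        start, stop = i * frame_len, (i + 1) * frame_len
        s = clean[start:stop]
        e = s - enhanced[start:stop]
        ratio = (np.sum(s ** 2) + eps) / (np.sum(e ** 2) + eps)
        frame_snrs.append(np.clip(10.0 * np.log10(ratio), lo, hi))
    return float(np.mean(frame_snrs))
```

Averaging per-frame values in dB keeps a few loud frames from dominating the score, which is the usual reason segmental SNR is reported alongside global SNR.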


Minimum Mean-Squared Error Estimation Of Mel-Frequency Cepstral Coefficients Using A Novel Distortion Model, Kevin M. Indrebo, Richard J. Povinelli, Michael T. Johnson Oct 2008

In this paper, a new method for statistical estimation of Mel-frequency cepstral coefficients (MFCCs) in noisy speech signals is proposed. Previous research has shown that model-based feature domain enhancement of speech signals for use in robust speech recognition can improve recognition accuracy significantly. These methods, which typically work in the log spectral or cepstral domain, must face the high complexity of distortion models caused by the nonlinear interaction of speech and noise in these domains. Here, an additive cepstral distortion model (ACDM) is developed and used with a minimum mean-squared error (MMSE) estimator for recovery of MFCC features …


Speech Signal Enhancement Through Adaptive Wavelet Thresholding, Michael T. Johnson, Xiaolong Yuan, Yao Ren Feb 2007

This paper demonstrates the application of the Bionic Wavelet Transform (BWT), an adaptive wavelet transform derived from a non-linear auditory model of the cochlea, to the task of speech signal enhancement. Results, measured objectively by Signal-to-Noise Ratio (SNR) and Segmental SNR (SSNR) and subjectively by Mean Opinion Score (MOS), are given for additive white Gaussian noise as well as four different types of realistic noise environments. Enhancement is accomplished through the use of thresholding on the adapted BWT coefficients, and the results are compared to a variety of speech enhancement techniques, including Ephraim-Malah filtering, iterative Wiener filtering, and spectral …
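
To illustrate the general idea of thresholding transform coefficients, here is a minimal sketch of soft thresholding. It is a generic technique swapped in for illustration: the abstract does not say which thresholding rule the paper uses, the BWT itself is not implemented here, and the function name and threshold value are assumptions.

```python
import numpy as np


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`.

    Coefficients whose magnitude is below `threshold` become zero.
    This is generic soft thresholding, not the paper's specific rule.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)
```

For example, `soft_threshold([3.0, -0.5, -2.0], 1.0)` shrinks 3.0 to 2.0, sets -0.5 to zero, and shrinks -2.0 to -1.0. Small coefficients, which are more likely to be noise, are suppressed, while large ones are kept with reduced magnitude.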