Articles 1 - 9 of 9
Full-Text Articles in Engineering
Optimal Distributed Microphone Phase Estimation, Marek B. Trawicki, Michael T. Johnson
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
This paper presents a minimum mean-square error spectral phase estimator for speech enhancement in the distributed multiple microphone scenario. The estimator uses Gaussian models for both the speech and noise priors under the assumption of a diffuse incoherent noise field representing ambient noise in a widely dispersed microphone configuration. Experiments demonstrate significant benefits of using the optimal multichannel phase estimator as compared to the noisy phase of a reference channel.
Auditory Coding Based Speech Enhancement, Yao Ren, Michael T. Johnson
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
This paper demonstrates a speech enhancement system based on an efficient auditory coding approach: coding of time-relative structure using spikes. The spike coding method can represent the non-stationary characteristics of speech signals more compactly than the Fourier transform or wavelet transform. Enhancement is accomplished through MMSE thresholding on the spike code. Experimental results show that, compared with the spectral-domain logSTSA filter, the proposed approach is better at suppressing noise in high-noise situations by both subjective spectrogram evaluation and objective SSNR improvement, with fewer musical artifacts.
Perceptually Motivated Wavelet Packet Transform For Bioacoustic Signal Enhancement, Yao Ren, Michael T. Johnson, Jidong Tao
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique which can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used for estimation of clean signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical …
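The species-specific Greenwood scale mentioned above maps normalized cochlear position to frequency via f(x) = A(10^{ax} − k). A minimal sketch of how such a perceptual scale could generate band edges for a perceptually scaled decomposition; the constants shown are the commonly cited human values, and a species-specific version would substitute that species' fitted constants:

```python
import numpy as np

def greenwood(x, A=165.4, a=2.1, k=0.88):
    """Greenwood frequency-position function: f(x) = A * (10**(a*x) - k).

    x is normalized cochlear position in [0, 1]. The default constants
    are the widely cited human values; other species have their own
    published fits, which is what makes the resulting scale
    species-specific.
    """
    return A * (10.0 ** (a * np.asarray(x, dtype=float)) - k)

# Perceptually spaced band edges, e.g. for an 8-band wavelet packet tree:
edges = greenwood(np.linspace(0.0, 1.0, 9))
```

Each adjacent pair of edges then delimits one perceptual band, in place of the uniform or dyadic bands a standard wavelet packet decomposition would give.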
Acoustic Model Adaptation For Ortolan Bunting (Emberiza hortulana L.) Song-Type Classification, Jidong Tao, Michael T. Johnson, Tomasz S. Osiejuk
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
Automatic systems for vocalization classification often require fairly large amounts of data on which to train models. However, collecting and transcribing animal vocalization data is a difficult and time-consuming task, so creating large data sets is expensive. One natural solution to this problem is the use of acoustic adaptation methods. Such methods, common in human speech recognition systems, create initial models trained on speaker-independent data, then use small amounts of adaptation data to build individual-specific models. Since, as in human speech, individual vocal variability is a significant source of variation in bioacoustic data, acoustic model adaptation …
An Improved SNR Estimator For Speech Enhancement, Yao Ren, Michael T. Johnson
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
In this paper, we propose an MMSE a priori SNR estimator for speech enhancement. This estimator has similar benefits to the well-known decision-directed approach, but does not require an ad-hoc weighting factor to balance the past a priori SNR and current ML SNR estimate with smoothing across frames. Performance is evaluated in terms of estimation error and segmental SNR using the standard logSTSA speech enhancement method. Experimental results show that, in contrast with the decision-directed estimator and ML estimator, the proposed SNR estimator can help enhancement algorithms preserve more weak speech information and efficiently suppress musical noise.
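For context, the decision-directed estimator that the paper contrasts against can be sketched as follows. The fixed weighting factor `alpha` is exactly the ad-hoc term the proposed MMSE estimator is designed to avoid; variable names here are illustrative, not the paper's:

```python
import numpy as np

def decision_directed_snr(noisy_power, noise_power, prev_clean_power,
                          alpha=0.98):
    """Classic decision-directed a priori SNR estimate (Ephraim-Malah).

    xi_hat = alpha * A_prev**2 / lambda_d + (1 - alpha) * max(gamma - 1, 0)

    `alpha` balances the previous frame's clean-speech estimate against
    the current maximum-likelihood SNR term -- the ad-hoc weighting the
    abstract refers to.
    """
    gamma = noisy_power / noise_power              # a posteriori SNR
    ml_term = np.maximum(gamma - 1.0, 0.0)         # current ML estimate
    return alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term
```

A large `alpha` (near 1) smooths heavily across frames, which suppresses musical noise but also smears weak speech onsets; the paper's estimator aims to remove that tuning trade-off.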
Stress And Emotion Classification Using Jitter And Shimmer Features, Xi Li, Jidong Tao, Michael T. Johnson, Joseph Soltis, Anne Savage, Kirsten Leong, John D. Newman
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
In this paper, we evaluate the use of appended jitter and shimmer speech features for the classification of human speaking styles and of animal vocalization arousal levels. Jitter and shimmer features are extracted from the fundamental frequency contour and added to baseline spectral features, specifically Mel-frequency cepstral coefficients (MFCCs) for human speech and Greenwood function cepstral coefficients (GFCCs) for animal vocalizations. Hidden Markov models (HMMs) with Gaussian mixture model (GMM) state distributions are used for classification. The appended jitter and shimmer features result in an increase in classification accuracy for several illustrative datasets, including the SUSAS dataset for human speaking …
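Jitter and shimmer admit simple "local" definitions over per-cycle pitch periods and peak amplitudes. A minimal sketch under those common definitions (the paper's exact variants may differ):

```python
import numpy as np

def jitter_shimmer(periods, amplitudes):
    """Relative local jitter and shimmer from per-cycle measurements.

    Jitter: mean absolute difference between consecutive pitch periods,
    normalized by the mean period. Shimmer: the same statistic computed
    over per-cycle peak amplitudes. Both are dimensionless ratios.
    """
    periods = np.asarray(periods, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=float)
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)
    shimmer = np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)
    return jitter, shimmer
```

A perfectly periodic, constant-amplitude voicing segment yields zero for both measures; cycle-to-cycle irregularity, which tends to rise under stress and arousal, pushes them up.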
Generalized Perceptual Linear Prediction (gPLP) Features For Animal Vocalization Analysis, Patrick J. Clemins, Michael T. Johnson
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
A new feature extraction model, generalized perceptual linear prediction (gPLP), is developed to calculate a set of perceptually relevant features for digital signal analysis of animal vocalizations. The gPLP model is a generalized adaptation of the perceptual linear prediction model, popular in human speech processing, which incorporates perceptual information such as frequency warping and equal loudness normalization into the feature extraction process. Since such perceptual information is available for a number of animal species, this new approach integrates that information into a generalized model to extract perceptually relevant features for a particular species. To illustrate, qualitative and quantitative comparisons are made …
Automatic Classification Of African Elephant (Loxodonta africana) Follicular And Luteal Rumbles, Michael T. Johnson, Patrick J. Clemins
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
Recent research in African elephant vocalizations has shown that there is evidence for acoustic differences in the rumbles of females based on the phase of their estrous cycle (1). One reason for these differences might be to attract a male for reproductive purposes. Since rumbles have a fundamental frequency near 10 Hz, they attenuate slowly and can be heard over a distance of several kilometers. This research exploits differences in the rumbles to create an automatic classification system that can determine whether a female rumble was made during the luteal or follicular phase of the ovulatory cycle. This system could be …
Application Of Speech Recognition To African Elephant (Loxodonta africana) Vocalizations, Patrick J. Clemins, Michael T. Johnson
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
This paper presents a novel application of speech processing research, classification of African elephant vocalizations. Speaker identification and call classification experiments are performed on data collected from captive African elephants in a naturalistic environment. The features used for classification are 12 mel-frequency cepstral coefficients plus log energy computed using a shifted filter bank to emphasize the infrasound range of the frequency spectrum used by African elephants. Initial classification accuracies of 83.8% for call classification and 88.1% for speaker identification were obtained. The long-term goal of this research is to develop a universal analysis framework and robust feature set for animal …
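Shifting a mel-style filter bank toward the infrasound range simply means choosing low band edges before spacing the filters on the mel scale. A sketch of mel-spaced center frequencies; the band-edge values and filter count here are illustrative, not the paper's actual parameters:

```python
import numpy as np

def mel_centers(f_low, f_high, n_filters):
    """Center frequencies of n_filters points spaced evenly on the
    mel scale between f_low and f_high (in Hz).

    Standard MFCC filter banks start near 0 Hz and span the audible
    range; lowering f_high (and keeping f_low near 0) concentrates the
    filters in the infrasound region where elephant rumbles, with
    fundamentals near 10 Hz, carry their energy.
    """
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    return mel_to_hz(np.linspace(hz_to_mel(f_low), hz_to_mel(f_high),
                                 n_filters))
```

Triangular filters centered at these frequencies, followed by a log and a DCT, would then yield cepstral coefficients weighted toward the low end of the spectrum.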