Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 7 of 7

Full-Text Articles in Engineering

Parallel Reference Speaker Weighting For Kinematic-Independent Acoustic-To-Articulatory Inversion, An Ji, Michael T. Johnson, Jeffrey J. Berry Oct 2016

Speech Pathology and Audiology Faculty Research and Publications

Acoustic-to-articulatory inversion, the estimation of articulatory kinematics from an acoustic waveform, is a challenging but important problem. Accurate estimation of articulatory movements has the potential for significant impact on our understanding of speech production, on our capacity to assess and treat pathologies in a clinical setting, and on speech technologies such as computer-aided pronunciation assessment and audio-video synthesis. However, because of the complex and speaker-specific relationship between articulation and acoustics, existing approaches for inversion do not generalize well across speakers. As acquiring speaker-specific kinematic data for training is not feasible in many practical applications, this remains an important and …


Evaluation Of The Importance Of Time-Frequency Contributions To Speech Intelligibility In Noise, Chengzhu Yu, Kamil K. Wójcicki, Philipos C. Loizou, John H. L. Hansen, Michael T. Johnson May 2014

Electrical and Computer Engineering Faculty Research and Publications

Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two …
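
The binary masking that this abstract refers to can be illustrated with a minimal sketch. It assumes the common "ideal binary mask" formulation, in which a T-F unit is kept when its local target-to-masker ratio exceeds a threshold. The function name, the 0 dB threshold, and the toy power values are assumptions made for illustration; this is not the experimental procedure of the study.

```python
import math

def ideal_binary_mask(target_power, masker_power, lc_db=0.0):
    """Return a 0/1 mask over a T-F grid (rows = time frames, columns = bands).

    A unit is kept (1) when its local SNR in dB exceeds lc_db.
    lc_db is a hypothetical local criterion chosen for this sketch.
    """
    mask = []
    for t_row, m_row in zip(target_power, masker_power):
        row = []
        for t, m in zip(t_row, m_row):
            if t <= 0.0:
                keep = False          # speech-absent unit: no target energy
            elif m <= 0.0:
                keep = True           # target present, no masker energy
            else:
                keep = 10.0 * math.log10(t / m) > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask

# Toy 2x2 grid of target and masker powers (made-up values).
target = [[4.0, 0.0], [1.0, 0.5]]
masker = [[1.0, 2.0], [1.0, 2.0]]
mask = ideal_binary_mask(target, masker)
```

In this sketch every retained unit counts equally, which is exactly the equal-contribution assumption the abstract questions.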


Tracking Articulator Movements Using Orientation Measurements, An Ji, Michael T. Johnson, Jeffrey J. Berry Jan 2012

Speech Pathology and Audiology Faculty Research and Publications

This paper introduces a new method to track articulator movements, specifically jaw position and angle, using 5-degree-of-freedom (5 DOF) orientation data. The approach uses a quaternion rotation method to accomplish this jaw tracking during speech using a single sensor on the mandibular incisor. Data were collected using the NDI Wave Speech Research System for one pilot subject with various speech tasks. The degree of jaw rotation from the proposed approach is compared with traditional geometric calculation. Results show that the quaternion-based method is able to describe jaw angle trajectory and gives more accurate and smooth estimation …
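
As a rough illustration of how a rotation angle can be read from orientation quaternions, the sketch below computes the angle between a reference orientation and a current orientation. It assumes unit quaternions in (w, x, y, z) order and a full orientation estimate; a 5 DOF sensor does not observe roll about its own axis, and how the paper handles that is not stated in the abstract. All function names are hypothetical.

```python
import math

def q_conj(q):
    """Conjugate of a quaternion (equals the inverse for unit quaternions)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def q_mul(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def axis_angle_quat(axis, angle_deg):
    """Unit quaternion for a rotation of angle_deg about the given axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half) / norm
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)

def rotation_angle_deg(q_ref, q_cur):
    """Angle in degrees of the rotation taking q_ref to q_cur."""
    w, x, y, z = q_mul(q_cur, q_conj(q_ref))
    vec_norm = math.sqrt(x * x + y * y + z * z)
    # abs(w) picks the shorter of the two equivalent rotations (q and -q).
    return math.degrees(2.0 * math.atan2(vec_norm, abs(w)))

# Hypothetical example: jaw at rest vs. jaw opened, both rotations about x.
rest = axis_angle_quat((1.0, 0.0, 0.0), 10.0)
opened = axis_angle_quat((1.0, 0.0, 0.0), 25.0)
jaw_angle = rotation_angle_deg(rest, opened)
```

Using atan2 on the vector-part norm rather than acos on w keeps the result numerically stable for small angles.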


Perceptually Motivated Wavelet Packet Transform For Bioacoustic Signal Enhancement, Yao Ren, Michael T. Johnson, Jidong Tao Jul 2008

Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations

A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique which can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used for estimation of clean signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical …
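
The species-specific frequency scaling mentioned above can be sketched with the Greenwood function, f = A(10^(ax) - k), where x is the normalized position along the basilar membrane (0 to 1). The sketch assumes k = 0.88 and solves for A and a so that x = 0 and x = 1 map to the ends of a given hearing range. The 20 Hz to 20 kHz range and all function names are assumptions, and the wavelet packet decomposition itself is not shown.

```python
import math

def greenwood_constants(f_min, f_max, k=0.88):
    """Solve f = A * (10**(a*x) - k) for A and a, given f(0)=f_min, f(1)=f_max."""
    A = f_min / (1.0 - k)
    a = math.log10(f_max / A + k)
    return A, a

def hz_to_position(f, A, a, k=0.88):
    """Map frequency in Hz to normalized cochlear position x in [0, 1]."""
    return math.log10(f / A + k) / a

def position_to_hz(x, A, a, k=0.88):
    """Map normalized cochlear position x back to frequency in Hz."""
    return A * (10.0 ** (a * x) - k)

# Assumed hearing range for illustration only.
A, a = greenwood_constants(20.0, 20000.0)

# Five band edges equally spaced on the perceptual (position) axis.
edges_hz = [position_to_hz(i / 4, A, a) for i in range(5)]
```

Bands that are equally spaced in x become progressively wider in Hz, which is the property a perceptually scaled decomposition relies on.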


Acoustic Model Adaptation For Ortolan Bunting (Emberiza hortulana L.) Song-Type Classification, Jidong Tao, Michael T. Johnson, Tomasz S. Osiejuk Mar 2008

Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations

Automatic systems for vocalization classification often require fairly large amounts of data on which to train models. However, animal vocalization data collection and transcription is a difficult and time-consuming task, so that it is expensive to create large data sets. One natural solution to this problem is the use of acoustic adaptation methods. Such methods, common in human speech recognition systems, create initial models trained on speaker independent data, then use small amounts of adaptation data to build individual-specific models. Since, as in human speech, individual vocal variability is a significant source of variation in bioacoustic data, acoustic model adaptation …


Generalized Perceptual Linear Prediction (gPLP) Features For Animal Vocalization Analysis, Patrick J. Clemins, Michael T. Johnson Jul 2006

Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations

A new feature extraction model, generalized perceptual linear prediction (gPLP), is developed to calculate a set of perceptually relevant features for digital signal analysis of animal vocalizations. The gPLP model is a generalized adaptation of the perceptual linear prediction model, popular in human speech processing, which incorporates perceptual information such as frequency warping and equal loudness normalization into the feature extraction process. Since such perceptual information is available for a number of animal species, this new approach integrates that information into a generalized model to extract perceptually relevant features for a particular species. To illustrate, qualitative and quantitative comparisons are made …


Automatic Classification And Speaker Identification Of African Elephant (Loxodonta africana) Vocalizations, Patrick J. Clemins, Michael T. Johnson, Kirsten Leong, Anne Savage Feb 2005

Electrical and Computer Engineering Faculty Research and Publications

A hidden Markov model (HMM) system is presented for automatically classifying African elephant vocalizations. The development of the system is motivated by successful models from human speech analysis and recognition. Classification features include frequency-shifted Mel-frequency cepstral coefficients (MFCCs) and log energy, spectrally motivated features which are commonly used in human speech processing. Experiments, including vocalization type classification and speaker identification, are performed on vocalizations collected from captive elephants in a naturalistic environment. The system classified vocalizations with accuracies of 94.3% and 82.5% for the type classification and speaker identification experiments, respectively. Classification accuracy, statistical significance tests on the model parameters, …
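
HMM-based classification of the kind described here is commonly done by scoring an observation sequence against one model per class and picking the most likely class. The sketch below shows that idea with the forward algorithm in the log domain. It is a simplification: it uses discrete observation symbols instead of continuous MFCC and log-energy vectors, and the class labels and all probabilities are made up for illustration, not taken from the paper.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, start, trans, emit):
    """log P(obs | model) for a discrete HMM via the forward algorithm."""
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def classify(obs, models):
    """Return the label of the model that gives obs the highest likelihood."""
    return max(models, key=lambda k: forward_log_likelihood(obs, *models[k]))

# Two hypothetical 2-state left-to-right models over symbols {0, 1}:
# each tuple is (start probabilities, transition matrix, emission matrix).
models = {
    "type_a": ([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]], [[0.9, 0.1], [0.8, 0.2]]),
    "type_b": ([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]], [[0.1, 0.9], [0.2, 0.8]]),
}

label = classify([0, 0, 0, 1], models)
```

The same one-model-per-class scheme can serve either task in principle: train one model per vocalization type for type classification, or one per individual for speaker identification.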