Signal Processing | Open Access Articles | Digital Commons Network™

Speaker Recognition In Adverse Conditions, Ananth N. Iyer, Uchechukwu O. Ofoegbu, Robert E. Yantorno, Stanley J. Wenndt Mar 2007

Ananth N Iyer

Recognizing speakers from their voices is a challenging area of research with several practical applications. Presently, speaker verification (SV) systems achieve a high level of accuracy under ideal conditions, such as when there is ample data to build speaker models and when verification is performed in the presence of little or no interference. In general, these systems assume that the features extracted from the data follow a particular parametric probability density function (pdf), i.e., a Gaussian or a mixture of Gaussians, where a form of the pdf is imposed on the speech data rather than determining the underlying structure of …
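
To make the parametric assumption mentioned in this abstract concrete, here is a minimal sketch of how a feature vector can be scored under a diagonal-covariance Gaussian mixture. It is an illustration only, not the authors' system: the function names, the diagonal-covariance restriction, and the idea of averaging the per-frame log-likelihood over an utterance are assumptions made for the example.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, weights, means, variances):
    """Log density of a Gaussian mixture at x, using log-sum-exp for stability."""
    terms = [
        math.log(w) + gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def average_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood of an utterance under one model
    (hypothetical scoring rule chosen for this sketch)."""
    return sum(gmm_loglik(f, weights, means, variances) for f in frames) / len(frames)
```

A verification decision would then compare such a score against a threshold or a background model; that step is left out because the truncated abstract does not describe it.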

Speaker Identification Using Usable Speech Concept, Ananth N. Iyer, Brett Y. Smolenski, Robert E. Yantorno, Jashmin K. Shah, Edward J. Cupples, Stanley J. Wenndt Sep 2004

Ananth N Iyer

Most signal processing involves processing a signal without concern for the quality or information content of that signal. In speech processing, speech is processed on a frame-by-frame basis, usually with concern only for whether the frame is speech or silence. However, knowing how reliable the information in a frame of speech is can be very important and useful. This is where usable speech detection and extraction can play a very important role. Usable speech frames can be defined as frames of speech that contain higher information content than unusable frames with reference to a particular application. We have …
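
The frame-by-frame speech/silence decision that this abstract takes as its starting point can be sketched as follows. This is a generic energy-based labeling, not the usable speech measure of the paper; the frame length, hop size, and energy threshold are hypothetical values chosen for the example.

```python
def split_frames(signal, frame_len, hop):
    """Cut a sample sequence into overlapping frames; an incomplete tail is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def label_frames(signal, frame_len=4, hop=2, threshold=0.01):
    """Label each frame 'speech' or 'silence' by mean energy.
    The threshold is a hypothetical value, not taken from the source."""
    return ["speech" if frame_energy(f) > threshold else "silence"
            for f in split_frames(signal, frame_len, hop)]
```

A usable speech detector would replace the binary energy test with a measure of how reliable each frame is for the target application.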

Robust Speaker Verification With Principal Pitch Components, Robert M. Nickel, Sachin P. Oswal, Ananth N. Iyer Sep 2004

Ananth N Iyer

We present a new method that improves the accuracy of text-dependent speaker identification systems. The new method exploits a set of novel speech features derived from a principal component analysis (PC) of voiced speech segments. The new PC features are only weakly correlated with the corresponding cepstral features. A distance measure that combines both cepstral and PC pitch features provides a discriminative power that cannot be achieved with cepstral features alone. It is well known that the discriminative power of cepstral features declines if the dimensionality of the feature space is increased beyond its optimal value. …
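
One simple way to combine two feature streams into a single distance is a weighted sum, sketched below. The abstract does not state how the paper's distance measure is formed, so the Euclidean distances and the weight `alpha` are assumptions made purely for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def combined_distance(cep_a, cep_b, pc_a, pc_b, alpha=0.5):
    """Weighted sum of the cepstral distance and the PC-feature distance.
    alpha is a hypothetical mixing weight, not a value from the source."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * euclidean(cep_a, cep_b) + (1.0 - alpha) * euclidean(pc_a, pc_b)
```

Because the two feature sets are described as only weakly correlated, a combination of this kind can draw on information that either distance alone would miss.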

Sequential k-NN Pattern Recognition For Usable Speech Classification, Jashmin K. Shah, Brett Y. Smolenski, Robert E. Yantorno, Ananth N. Iyer Sep 2004

Ananth N Iyer

The accuracy of speech processing techniques degrades when operating in a co-channel environment. Co-channel speech occurs when more than one person is talking at the same time. The idea of usable speech segmentation is to identify and extract those portions of co-channel speech that are minimally degraded and still useful for speech processing applications such as speaker identification. Usable speech measures are features that are extracted from the co-channel signal to distinguish between usable and unusable speech. In this paper, a new usable speech extraction technique is presented. The new method extracts features recursively and variable-length segmentation is performed …
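
As background for the classifier named in the title, the sketch below shows plain k-nearest-neighbor voting over labeled feature vectors. It is not the sequential variant with recursive feature extraction that the paper proposes; the labels, the two-dimensional features, and the choice of k are hypothetical.

```python
import math
from collections import Counter

def knn_classify(query, training, k=3):
    """Majority vote among the k labeled feature vectors nearest to query.
    training is a list of (feature_vector, label) pairs."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical usable speech measures for five labeled frames.
training = [
    ([0.90, 0.80], "usable"),
    ([0.80, 0.90], "usable"),
    ([0.85, 0.95], "usable"),
    ([0.10, 0.20], "unusable"),
    ([0.20, 0.10], "unusable"),
]
```

With two classes, an odd k avoids tied votes.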

Usable Speech Detection Using A Context Dependent Gaussian Mixture Model Classifier, Robert E. Yantorno, Brett Y. Smolenski, Ananth N. Iyer, Jashmin K. Shah May 2004

Ananth N Iyer

Speech that is corrupted by nonstationary interference, but contains segments that are still usable for applications such as speaker identification or speech recognition, is referred to as "usable" speech. A common example of nonstationary interference occurs when more than one person is talking at the same time, which is known as co-channel speech. In general, the above speech processing applications do not work in co-channel environments; however, they can work on the extracted usable segments. Unfortunately, currently available usable speech measures only detect about 75% of the total available usable speech. The first reason for this high error stems …

Structural Usable Speech Measure Using LPC Residual, Ananth N. Iyer, Melinda Gleiter, Brett Y. Smolenski, Robert E. Yantorno Dec 2003

Ananth N Iyer

In an operational environment, speech is degraded by many kinds of interference. The operation of many speech processing techniques is plagued by such interference. Usable speech extraction is a novel concept for processing degraded speech data. The idea of usable speech is to identify and extract portions of degraded speech that are considered useful for various speech processing systems. The performance reduction of speaker identification systems under degraded conditions and the use of the usable speech concept to improve performance have been demonstrated in previous work. A new usable speech measure, based on the structure of the Linear Predictive Coding (LPC) residual …
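
For readers unfamiliar with the LPC residual, the sketch below computes predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion, then forms the prediction error for a frame. It shows only how a residual can be obtained; the structural measure built on it is not described in the truncated abstract and is not attempted here. The function names and the zero-history convention at the frame start are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing applied)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.
    Returns a[1..order] such that x[n] is predicted as sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # signal fully predicted (or silent): stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def lpc_residual(frame, coeffs):
    """Prediction error e[n] = x[n] - sum_k a[k] * x[n-k],
    treating samples before the frame start as zero."""
    residual = []
    for n, sample in enumerate(frame):
        pred = sum(c * frame[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(sample - pred)
    return residual
```

For a decaying exponential such as `[0.5 ** n for n in range(20)]`, a first-order predictor recovers a coefficient very close to 0.5 and the residual is nearly zero after the first sample.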

Usable Speech Detection Using Linear Predictive Analysis, Nitya Sundaram, Robert E. Yantorno, Brett Y. Smolenski, Ananth N. Iyer Nov 2003

Ananth N Iyer

A speech segment is defined as “usable” if the speech, although corrupted by interfering speech, can still be used for applications like speaker identification. In tactical communications, where multiple signals are transmitted over the same channel, such as a telephone or radio link, separation of usable speech from speech corrupted by the voices of other speakers is desired. This separation is important in making automatic speaker and speech recognition systems more robust. A novel approach towards developing a usable speech measure could be model-based. Using this concept of model-based usable speech detection, the use of Linear Prediction is investigated. The method …
