Engineering | Open Access Articles | Digital Commons Network™

Unsupervised Indexing Of Noisy Conversations With Short Speaker Utterances, Uchechukwu O. Ofoegbu, Ananth N. Iyer, Robert E. Yantorno, Stanley J. Wenndt

Ananth N Iyer

Two speaker indexing systems for conversations are presented in this paper. The first method involves indexing two-speaker conversations. In this method, two reference models are judiciously chosen from the conversation such that they represent the two different speakers. Models are then matched to the reference speakers using distance-based comparisons. The second technique is based on first determining the number of participants in the conversation using a speaker count method termed the “Residual Ratio Algorithm” (RRA), and then indexing based on this count. The RRA involves an elimination process in which speech segments matching a chosen set of reference models are …
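
The two-speaker indexing idea in this abstract (pick two reference models, then assign every other segment to the nearer one) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the references are chosen or which distance is used, so the sketch assumes each segment is summarized by the mean of its feature vectors, takes the two most distant segments as references, and compares models by Euclidean distance. The names `mean_vector` and `index_two_speakers` are hypothetical.

```python
import math
from itertools import combinations


def mean_vector(frames):
    """Summarize a segment (list of feature vectors) by its mean vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def index_two_speakers(segments):
    """Label each segment 0 or 1 by its nearer reference model.

    Assumption: the two references are the pair of segment models that
    are farthest apart; the paper's actual selection rule is not given
    in the abstract.
    """
    models = [mean_vector(s) for s in segments]
    # Choose the most distant pair of segment models as the references.
    ref_a, ref_b = max(
        combinations(range(len(models)), 2),
        key=lambda pair: math.dist(models[pair[0]], models[pair[1]]),
    )
    labels = []
    for m in models:
        d_a = math.dist(m, models[ref_a])
        d_b = math.dist(m, models[ref_b])
        labels.append(0 if d_a <= d_b else 1)
    return labels, (ref_a, ref_b)


if __name__ == "__main__":
    # Toy 2-D "features"; real systems would use spectral features.
    toy = [
        [[0.0, 0.0], [0.2, 0.1]],
        [[5.0, 5.0], [5.1, 4.9]],
        [[0.1, 0.0], [0.0, 0.2]],
        [[4.8, 5.2], [5.0, 5.0]],
    ]
    print(index_two_speakers(toy))
```

Taking the farthest pair is only one plausible reading of "judiciously chosen"; any selection rule that yields one reference per speaker would fit the same matching step.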

Blind Speaker Clustering, Ananth N. Iyer, Uchechukwu O. Ofoegbu, Robert E. Yantorno, Brett Y. Smolenski

Ananth N Iyer

A novel approach to performing speaker clustering in telephone conversations is presented in this paper. The method is based on a simple observation that the distance between populations of feature vectors extracted from different speakers is greater than a preset threshold. This observation is incorporated into the clustering problem by the formulation of a constrained optimization problem. A modified c-means algorithm is designed to solve the optimization problem. Another key aspect in speaker clustering is to determine the number of clusters, which is either assumed or expected as an input in traditional methods. The proposed method does not require such …
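
The observation behind this abstract (models from different speakers lie farther apart than a preset threshold) can be illustrated without fixing the number of clusters in advance. The sketch below is a simple sequential, threshold-based clustering, swapped in for illustration only; it is not the constrained optimization or the modified c-means algorithm of the paper. The function name `threshold_cluster`, the Euclidean distance, and the threshold value are assumptions.

```python
import math


def threshold_cluster(models, threshold):
    """Assign each model to the nearest centroid, or open a new cluster.

    A new cluster is opened whenever the nearest existing centroid is
    farther away than `threshold`, so the cluster count is an output,
    not an input. (Illustrative stand-in, not the paper's method.)
    """
    centroids, counts, labels = [], [], []
    for m in models:
        nearest, best = None, math.inf
        for k, c in enumerate(centroids):
            d = math.dist(m, c)
            if d < best:
                nearest, best = k, d
        if nearest is None or best > threshold:
            # Too far from every known speaker: treat as a new one.
            centroids.append(list(m))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean of the matched cluster.
            counts[nearest] += 1
            centroids[nearest] = [
                c + (x - c) / counts[nearest]
                for c, x in zip(centroids[nearest], m)
            ]
            labels.append(nearest)
    return labels, len(centroids)


if __name__ == "__main__":
    toy = [(0, 0), (0.2, 0.1), (5, 5), (5.1, 4.9), (0.1, 0.2), (10, 0)]
    print(threshold_cluster(toy, threshold=2.0))
```

A sequential pass like this depends on the order of the segments, which is one reason an optimization-based formulation such as the one the abstract describes may be preferable in practice.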

Generic Modeling Applied To Speaker Count, Ananth N. Iyer, Uchechukwu O. Ofoegbu, Robert E. Yantorno, Brett Y. Smolenski

Ananth N Iyer

The problem of determining the number of speakers participating in a conversation and building their models in short conversations, within an unknown group of speakers, is addressed in this paper. The lack of information about the number of speakers and the unavailability of sufficient data present a challenging task of efficiently estimating the speaker model parameters. The proposed method uses a novel generic speaker identification (GSID) system as a guide in the model building process. The GSID system is designed to perform speaker identification where the speaker associated with the test data may not be enrolled. The models in the GSID …

Detection Of A Third Speaker In Telephone Conversations, Uchechukwu O. Ofoegbu, Ananth N. Iyer, Robert E. Yantorno, Stanley J. Wenndt

Ananth N Iyer

Differentiating speakers participating in telephone conversations is a challenging task in speech processing because only short consecutive utterances can be examined for each speaker. Research has shown that, given only brief utterances (1 second or less), humans can recognize speakers with an accuracy of about 54% on average. The task becomes even more challenging when no information about the speakers is known a priori. In this paper, a technique for determining whether there are two or three speakers participating in a telephone conversation is presented. This approach assumes no knowledge or information about any of the participating speakers. The technique …
