Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 12 of 12
Full-Text Articles in Physical Sciences and Mathematics
Spoken Language Processing And Modeling For Aviation Communications, Aaron Van De Brook
Doctoral Dissertations and Master's Theses
With recent advances in machine learning and deep learning technologies and the creation of larger aviation-specific corpora, applying natural language processing technologies, especially those based on transformer neural networks, to aviation communications is becoming increasingly feasible. Previous work has focused on machine learning applications to natural language processing, such as N-grams and word lattices. This thesis experiments with a process for pretraining transformer-based language models on aviation English corpora and compares the effectiveness and performance of language models transfer-learned from pretrained checkpoints with those trained from their base weight initializations (trained from scratch). The results suggest that transformer language …
Interactive Emirate Sign Language E-Dictionary Based On Deep Learning Recognition Models, Ahmed Abdelhadi Abdelhadi
Theses
According to the Ministry of Community Development database in the United Arab Emirates (UAE), about 3065 people with disabilities are hearing disabled (Emirates News Agency - Ministry of Community Development). Hearing-impaired people find it difficult to communicate with the rest of society. They usually need Sign Language (SL) interpreters, but even as the number of hearing-impaired individuals grows, Sign Language interpreters can be almost non-existent. In addition, specialized schools lack a unified Sign Language (SL) dictionary, which can be linked to the diglossic nature of the Arabic language, in which many dialects co-exist. Moreover, there …
Data-Centric Machine Learning For Speech And Audio, Ali Raza Syed
Dissertations, Theses, and Capstone Projects
There is growing recognition of the importance of data-centric methods for building machine learning systems. Data-centric methods assume a fixed model and iterate over the data to improve system performance. This is in contrast to traditional model-centric approaches, which assume a fixed dataset and iterate over models for the same ends. Data-centric machine learning is driven by the observation that, beyond the size of the training data, model performance depends on factors such as the quality of the annotations, and whether the data are representative of conditions in which models will be deployed. This is particularly of interest in the …
Adversarial Attacks On Speech Separation Systems, Kendrick Trinh
Master's Projects
Speech separation is a special form of blind source separation in which the objective is to decouple two or more sources such that they are distinct. The need for such an ability grows as the use of speech-activated devices increases in our everyday lives. These systems, however, are susceptible to malicious actors. In this work, we repurpose proven adversarial attacks and leverage them against a combined speech separation and speech recognition system. The attack adds adversarial noise to a mixture of two voices such that the two outputs of the speech separation system are similarly transcribed by the speech recognition …
Understanding Model Reasoning In Automated Speech Systems: Implementing A Prototype Explanation System Using The Lime Method, Vadim Kudlay
Honors Theses
The field of voice processing has seen great advancements thanks in part to the rise of deep learning. However, the application of these deep learning techniques with an audio input space leads to an interesting result not commonly found when dealing with other input domains. Namely, common techniques for generating auditory adversarial samples using gradient-based optimization have been observed to have extremely low transferability among even the same model structure. This implies an inherent difference in the latent representations of audio samples that may be worth investigating in the pursuit of a more resilient and interpretable voice processing framework. Our …
Human-Ai Teaming For Dynamic Interpersonal Skill Training, Xavian Alexander Ogletree
Browse all Theses and Dissertations
In almost every field, there is a need for strong interpersonal skills. This is especially true in fields such as medicine, psychology, and education. For instance, healthcare providers need to show understanding and compassion for LGBTQ+ and BIPOC (Black, Indigenous, and People of Color) individuals, as well as individuals with unique developmental or mental health needs. Improving interpersonal skills often requires first-person experience with expert evaluation and guidance to achieve proficiency. However, due to the limited availability of assessment capabilities, professional standardized patients, and instructional experts, students and professionals currently have inadequate opportunities for expert-guided training sessions. Therefore, this research aims to demonstrate leveraging …
Speaker Recognition By Hidden Markov Models And Neural Networks, Erik J. Zeek
Theses and Dissertations
As humans, we develop the ability to identify people by their voice at an early age. Getting computers to perform the same task has proven to be an interesting problem. Speaker recognition involves two applications, speaker identification and speaker verification. Both applications are examined in this effort. Two methods are employed to perform speaker recognition. The first is an enhancement of hidden Markov models. Rather than alter some part of the model itself, a single-layer perceptron is added to perform neural post-processing. The second solution is the novel application of an enhanced Feature Space Trajectory Neural Network to speaker recognition. …
Generalized Hidden Filter Markov Models Applied To Speaker Recognition, John M. Colombi
Theses and Dissertations
Classification of time series has wide Air Force, DoD and commercial interest, from automatic target recognition systems on munitions to recognition of speakers in diverse environments. The ability to effectively model the temporal information contained in a sequence is of paramount importance. Toward this goal, this research develops theoretical extensions to a class of stochastic models and demonstrates their effectiveness on the problem of text-independent (language constrained) speaker recognition. Specifically, within the hidden Markov model architecture, additional constraints are implemented that better incorporate observation correlations and context, where standard approaches fail. Two methods of modeling correlations are developed, and their …
A Comparative Evaluation Of Voice Versus Keypad Input For Manipulating Electronic Technical Data For Flight Line Maintenance Technicians, David A. Chapman, James R. Simmons
Theses and Dissertations
Interactive Electronic Technical Manuals will soon become a requirement for aircraft maintenance technicians. An important aspect in their development is the selection of an input device that will enhance, rather than impede, technician performance. The purpose of this thesis was to evaluate two types of input devices that can be used: a voice recognition input and a keypad input. Studies to date have evaluated the superiority of digital data over paper data, and the advantages of using a Head Mounted Display Device over a flat screen laptop computer. No research has evaluated the input device. An experiment was conducted to determine …
Dyadic Wavelet Features For Isolated Word Speaker Dependent Speech Recognition, Stephen Ainge
Theses and Dissertations
This research examines the use of dyadic wavelet features for the recognition of speaker dependent isolated word speech. The features were generated using three different wavelet filters: Daubechies 4-coefficient (Db4), Daubechies 20-coefficient (Db20), and a 31-coefficient cubic spline; and three different window lengths: 15 ms, 8 ms, and 4 ms. The accuracy of the standard and over-sampled dyadic wavelet methods was compared. The over-sampled dyadic wavelet method using the Db4 scaling function, with a maximum accuracy of 65.5, was found to be the most accurate of the wavelet methods tested. The accuracy of this over-sampled dyadic Db4 wavelet method was compared to …
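As a rough illustration of what a dyadic wavelet feature can look like, the sketch below runs a standard (decimated) discrete wavelet transform with the four-coefficient Daubechies filter and reports the detail energy at each scale. It is not the thesis's method: the periodic boundary handling, the choice of per-scale energy as the feature, and the function names `dwt_level` and `dyadic_energies` are all assumptions made for this example, and the over-sampled variant mentioned in the abstract is not shown.

```python
import math

# Four-coefficient Daubechies low-pass filter (orthonormal scaling).
_S3 = math.sqrt(3.0)
H = [c / (4.0 * math.sqrt(2.0)) for c in (1 + _S3, 3 + _S3, 3 - _S3, 1 - _S3)]
# Matching high-pass filter obtained by the alternating-flip rule.
G = [(-1) ** k * H[3 - k] for k in range(4)]


def dwt_level(signal):
    """One decimated analysis step with periodic wrap-around (assumed).

    The signal length is assumed to be even and at least 4.
    """
    n = len(signal)
    approx = [sum(H[k] * signal[(2 * i + k) % n] for k in range(4))
              for i in range(n // 2)]
    detail = [sum(G[k] * signal[(2 * i + k) % n] for k in range(4))
              for i in range(n // 2)]
    return approx, detail


def dyadic_energies(signal, levels):
    """Hypothetical feature vector: detail energy at each dyadic scale."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = dwt_level(approx)
        energies.append(sum(d * d for d in detail))
    return energies
```

Each level halves the number of samples, so a frame of length 2^k supports at most k - 1 levels here before the wrap-around filter no longer fits.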
Clustering Techniques In Speaker Recognition, Douglas N. Prescott
Theses and Dissertations
This thesis presents a comparison, based on identification rate, of three clustering techniques applied to cepstral features for speaker identification. LBG vector quantization, as developed by Linde, Buzo and Gray, is used to provide benchmark performance for comparison with Fuzzy clustering (based on the unsupervised fuzzy partition-optimal number of classes, UFP-ONC algorithm by Gath and Geva) and an Artificial Neural Network, the Multilayer Perceptron. Cepstral features from the TIMIT, King and AFIT93 corpus speaker databases are used to produce speaker-identification classifiers using each of the clustering algorithms. The experiment reported evaluates the speaker identification performance using the 20-dimensional cepstral features …
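For readers unfamiliar with LBG vector quantization, a minimal sketch of the usual splitting procedure follows: start from the global centroid, split every codeword into a perturbed pair, and refine with nearest-neighbour reassignment. The additive perturbation `eps`, the fixed iteration count, and the names `lbg` and `distortion` are assumptions for this example and are not taken from the thesis; the target codebook size is assumed to be a power of two.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vec):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], vec))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting (size assumed a power of two)."""
    codebook = [mean(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly shifted copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(codebook, v)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


def distortion(codebook, vectors):
    """Average squared distance from each vector to its nearest codeword."""
    return sum(sq_dist(codebook[nearest(codebook, v)], v)
               for v in vectors) / len(vectors)
```

In a speaker-identification setting, one common arrangement is to train one codebook per speaker and assign a test utterance to the speaker whose codebook gives the lowest `distortion`; whether the thesis scores utterances this way is not stated in the abstract.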
Identity Verification Through The Fusion Of Face And Speaker Recognition, John G. Keller
Theses and Dissertations
In this research, face recognition and speaker identification systems are each converted into verification systems. The two verification systems are then fused to form a single identity verification system. Finally, the use of the Karhunen-Loeve Transform (KLT) for dimensional reduction is examined for suitability in the verification task. The base face recognition system used the KLT for feature reduction and a back-propagation neural net for classification. Verification involved training a net for each individual in the database for two classes of outputs, 'Joe' or 'not Joe.' The base speaker identification system used Cepstral analysis for feature extraction and a distortion …
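The Karhunen-Loeve Transform reduces dimensionality by projecting mean-centred data onto the leading eigenvectors of its covariance matrix. The sketch below recovers only the first such direction, using power iteration so that it needs no external libraries. The function names, the iteration count, and the all-ones starting vector are assumptions for this example, and it presumes the data has non-zero variance and that the starting vector is not orthogonal to the leading direction. It does not reproduce the thesis's face or speaker pipeline.

```python
def klt_first_component(samples, iters=100):
    """Return (mean, unit vector) for the leading KLT direction."""
    n, d = len(samples), len(samples[0])
    mu = [sum(col) / n for col in zip(*samples)]
    centered = [[x - m for x, m in zip(row, mu)] for row in samples]
    # Sample covariance matrix (divided by n for simplicity).
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed starting vector for power iteration
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mu, v


def project(sample, mu, v):
    """One-dimensional KLT coefficient of a sample."""
    return sum((x - m) * c for x, m, c in zip(sample, mu, v))
```

Keeping more than one coefficient would require further eigenvectors, for example by deflating the covariance matrix after each one is found.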