Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 12 of 12

Full-Text Articles in Physical Sciences and Mathematics

Spoken Language Processing And Modeling For Aviation Communications, Aaron Van De Brook Oct 2023

Doctoral Dissertations and Master's Theses

With recent advances in machine learning and deep learning technologies and the creation of larger aviation-specific corpora, applying natural language processing technologies, especially those based on transformer neural networks, to aviation communications is becoming increasingly feasible. Previous work has focused on machine learning applications to natural language processing, such as N-grams and word lattices. This thesis experiments with a process for pretraining transformer-based language models on aviation English corpora and compares the effectiveness and performance of language models transfer learned from pretrained checkpoints and those trained from their base weight initializations (trained from scratch). The results suggest that transformer language …


Interactive Emirate Sign Language E-Dictionary Based On Deep Learning Recognition Models, Ahmed Abdelhadi Abdelhadi Apr 2023

Theses

According to the Ministry of Community Development database in the United Arab Emirates (UAE), about 3065 people with disabilities are hearing disabled (Emirates News Agency - Ministry of Community Development). Hearing-impaired people find it difficult to communicate with the rest of society. They usually need Sign Language (SL) interpreters, but even as the number of hearing-impaired individuals grows, Sign Language interpreters can be almost non-existent. In addition, specialized schools lack a unified Sign Language (SL) dictionary, which can be linked to the diglossic nature of the Arabic language, in which many dialects co-exist. Moreover, there …


Data-Centric Machine Learning For Speech And Audio, Ali Raza Syed Sep 2022

Dissertations, Theses, and Capstone Projects

There is growing recognition of the importance of data-centric methods for building machine learning systems. Data-centric methods assume a fixed model and iterate over the data to improve system performance. This is in contrast to traditional model-centric approaches, which assume a fixed dataset and iterate over models for the same ends. Data-centric machine learning is driven by the observation that, beyond the size of the training data, model performance depends on factors such as the quality of the annotations, and whether the data are representative of conditions in which models will be deployed. This is particularly of interest in the …


Adversarial Attacks On Speech Separation Systems, Kendrick Trinh Jan 2022

Master's Projects

Speech separation is a special form of blind source separation in which the objective is to decouple two or more sources such that they are distinct. The need for such an ability grows as the use of speech-activated devices increases in everyday life. These systems, however, are susceptible to malicious actors. In this work, we repurpose proven adversarial attacks and leverage them against a combined speech separation and speech recognition system. The attack adds adversarial noise to a mixture of two voices such that the two outputs of the speech separation system are similarly transcribed by the speech recognition …


Understanding Model Reasoning In Automated Speech Systems: Implementing A Prototype Explanation System Using The LIME Method, Vadim Kudlay Apr 2021

Honors Theses

The field of voice processing has seen great advancements thanks in part to the rise of deep learning. However, the application of these deep learning techniques with an audio input space leads to an interesting result not commonly found when dealing with other input domains. Namely, common techniques for generating auditory adversarial samples using gradient-based optimization have been observed to have extremely low transferability among even the same model structure. This implies an inherent difference in the latent representations of audio samples that may be worth investigating in the pursuit of a more resilient and interpretable voice processing framework. Our …


Human-AI Teaming For Dynamic Interpersonal Skill Training, Xavian Alexander Ogletree Jan 2021

Browse all Theses and Dissertations

In almost every field, there is a need for strong interpersonal skills. This is especially true in fields such as medicine, psychology, and education. For instance, healthcare providers need to show understanding and compassion for LGBTQ+ and BIPOC (Black, Indigenous, and People of Color) individuals, and for individuals with unique developmental or mental health needs. Improving interpersonal skills often requires first-person experience with expert evaluation and guidance to achieve proficiency. However, due to the limited availability of assessment capabilities, professional standardized patients, and instructional experts, students and professionals currently have inadequate opportunities for expert-guided training sessions. Therefore, this research aims to demonstrate leveraging …


Speaker Recognition By Hidden Markov Models And Neural Networks, Erik J. Zeek Dec 1996

Theses and Dissertations

As humans, we develop the ability to identify people by their voice at an early age. Getting computers to perform the same task has proven to be an interesting problem. Speaker recognition involves two applications, speaker identification and speaker verification. Both applications are examined in this effort. Two methods are employed to perform speaker recognition. The first is an enhancement of hidden Markov models. Rather than alter some part of the model itself, a single-layer perceptron is added to perform neural post-processing. The second solution is the novel application of an enhanced Feature Space Trajectory Neural Network to speaker recognition. …


Generalized Hidden Filter Markov Models Applied To Speaker Recognition, John M. Colombi Mar 1996

Theses and Dissertations

Classification of time series has wide Air Force, DoD and commercial interest, from automatic target recognition systems on munitions to recognition of speakers in diverse environments. The ability to effectively model the temporal information contained in a sequence is of paramount importance. Toward this goal, this research develops theoretical extensions to a class of stochastic models and demonstrates their effectiveness on the problem of text-independent (language constrained) speaker recognition. Specifically within the hidden Markov model architecture, additional constraints are implemented which better incorporate observation correlations and context, where standard approaches fail. Two methods of modeling correlations are developed, and their …


A Comparative Evaluation Of Voice Versus Keypad Input For Manipulating Electronic Technical Data For Flight Line Maintenance Technicians, David A. Chapman, James R. Simmons Sep 1995

Theses and Dissertations

Interactive Electronic Technical Manuals will soon become a requirement for aircraft maintenance technicians. An important aspect in their development is the selection of an input device that will enhance, rather than impede, technician performance. The purpose of this thesis was to evaluate two types of input devices that can be used: a voice recognition input and a keypad input. Studies to date have evaluated the superiority of digital data over paper data, and the advantages of using a Head Mounted Display Device over a flat screen laptop computer. No research has evaluated the input device. An experiment was conducted to determine …


Dyadic Wavelet Features For Isolated Word Speaker Dependent Speech Recognition, Stephen Ainge Mar 1994

Theses and Dissertations

This research examines the use of dyadic wavelet features for the recognition of speaker dependent isolated word speech. The features were generated using three different wavelet filters (Daubechies 4 coefficient (Db4), Daubechies 20 coefficient (Db20), and a 31 coefficient cubic spline) and three different window lengths (15 ms, 8 ms, and 4 ms). The accuracies of the standard and over-sampled dyadic wavelet methods were compared. The over-sampled dyadic wavelet method using the Db4 scaling function, with a maximum accuracy of 65.5, was found to be the most accurate of the wavelet methods tested. The accuracy of this over-sampled dyadic Db4 wavelet method was compared to …


Clustering Techniques In Speaker Recognition, Douglas N. Prescott Mar 1994

Theses and Dissertations

This thesis presents a comparison, based on identification rate, of three clustering techniques applied to cepstral features for speaker identification. LBG vector quantization, as developed by Linde, Buzo and Gray, is used to provide benchmark performance for comparison with Fuzzy clustering (based on the unsupervised fuzzy partition-optimal number of classes, UFP-ONC algorithm by Gath and Geva) and an Artificial Neural Network, the Multilayer Perceptron. Cepstral features from the TIMIT, King and AFIT93 corpus speaker databases are used to produce speaker-identification classifiers using each of the clustering algorithms. The experiment reported evaluates the speaker identification performance using the 20-dimensional cepstral features …
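
As a rough illustration of the LBG vector quantization named in this abstract, the sketch below grows a codebook by splitting each codeword and refining with nearest-neighbour reassignment. It is not the thesis's implementation: the function names (`lbg`, `avg_distortion`), the additive split offset `eps`, the iteration count, and the toy 2-D data are all assumptions made for illustration. The thesis works with 20-dimensional cepstral features.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def lbg(data, size, eps=0.01, iters=20):
    """Hypothetical LBG sketch: `size` is assumed to be a power of two."""
    codebook = [centroid(data)]
    while len(codebook) < size:
        # Split every codeword into a pair offset by +eps and -eps.
        codebook = [[c + s * eps for c in cw]
                    for cw in codebook for s in (1, -1)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for x in data:
                best = min(range(len(codebook)),
                           key=lambda i: sq_dist(codebook[i], x))
                cells[best].append(x)
            # Move each codeword to its cell centroid; keep it if the cell is empty.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

def avg_distortion(codebook, data):
    """Mean squared distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(cw, x) for cw in codebook) for x in data) / len(data)

# Toy data: two well-separated groups standing in for feature vectors.
toy = [[0, 0], [0, 1], [1, 0], [1, 1],
       [10, 10], [10, 11], [11, 10], [11, 11]]
book = lbg(toy, size=2)
```

One common way to use such codebooks for speaker identification, assumed here rather than taken from the abstract, is to train one codebook per speaker and pick the speaker whose codebook gives the lowest `avg_distortion` on the test utterance.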


Identity Verification Through The Fusion Of Face And Speaker Recognition, John G. Keller Dec 1993

Theses and Dissertations

In this research, face recognition and speaker identification systems are each converted into verification systems. The two verification systems are then fused to form a single identity verification system. Finally, the use of the Karhunen-Loeve Transform (KLT) for dimensional reduction is examined for suitability in the verification task. The base face recognition system used the KLT for feature reduction and a back-propagation neural net for classification. Verification involved training a net for each individual in the database for two classes of outputs, 'Joe' or 'not Joe.' The base speaker identification system used Cepstral analysis for feature extraction and a distortion …
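
To make the idea of KLT-based feature reduction concrete, here is a minimal sketch that estimates the leading Karhunen-Loeve basis vector of a small data set and projects a sample onto it. It is an assumption-laden illustration, not the system described in the thesis: the function names, the use of power iteration, the iteration count, and the toy 2-D data are all hypothetical choices. A practical system would keep several components and use a linear algebra library.

```python
def klt_first_component(data, iters=200):
    """Return (mean, v), where v is the leading eigenvector of the sample covariance.

    Uses power iteration from an all-ones start vector; this assumes the
    start vector is not orthogonal to v and that the data are not constant.
    """
    n, d = len(data), len(data[0])
    mean = [sum(col) / n for col in zip(*data)]
    centered = [[x - m for x, m in zip(row, mean)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        # Multiply by the covariance matrix, then renormalise to unit length.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

def project(x, mean, v):
    """Reduce x to one coefficient: its coordinate along v after centering."""
    return sum((a - m) * b for a, m, b in zip(x, mean, v))

# Toy data lying on the line y = 2x, so one component captures all variance.
toy = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, v = klt_first_component(toy)
coeff = project([3.0, 6.0], mean, v)
```

In a verification setting like the one described, the reduced coefficients would then be passed to a classifier; the single-component projection here only shows the reduction step.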