Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 30 of 83

Full-Text Articles in Entire DC Network

Towards A Reliable Machine Learning-Based Model Designed For Translating Sign Language Videos To Text, Maitha Essa Mohammad Ahli May 2023

Theses

Communication serves key roles in building relationships through sharing feelings, passing information, and connecting with others. Communication remains a significant stumbling block for hearing-impaired people in today’s society, since their means of communication demands an interpreter at every moment. Various researchers agree that successful communication calls for the involvement of all individuals in a conversation; thus, deaf and hearing-impaired people require precise and welcoming communication to promote their working and learning relationships. Sign Language Recognition (SLR) is a critical and auspicious approach to promoting communication among hearing-impaired people. Sign languages greatly benefit from Machine Learning based translation techniques since …


Look At The Bigger Picture: Analyzing Eye Tracking Data With Multi-Dimensional Visualization, Anjali Jogeshwar Mar 2023

Theses

Eye tracking has proven to be a valuable tool for a number of disciplines such as imaging science, vision science, psychology, and neurology. Researchers in these fields have used eye tracking to study the human visual system and to understand how humans attend to their environment for different tasks. Analyzing the gaze data obtained from eye trackers begins with visualizing the gaze point. By monitoring gaze we can answer questions including ‘where was the person looking?’ and ‘what image region or object was the person looking at?’ Traditional methods aimed at answering such questions from gaze data required frame-by-frame analysis of videos, manually annotating gaze. …


From Fully-Supervised Single-Task To Semi-Supervised Multi-Task Deep Learning Architectures For Segmentation In Medical Imaging Applications, S M Kamrul Hasan Jan 2023

Theses

Medical imaging is routinely performed in clinics worldwide for the diagnosis and treatment of numerous medical conditions in children and adults. With the advent of these medical imaging modalities, radiologists can visualize both the structure of the body as well as the tissues within the body. However, analyzing these high-dimensional (2D/3D/4D) images demands a significant amount of time and effort from radiologists. Hence, there is an ever-growing need for medical image computing tools to extract relevant information from the image data to help radiologists perform efficiently. Image analysis based on machine learning has pivotal potential to improve the entire medical …


Normalization And Generalization In Deep Learning, Griffin Hurt Jan 2023

Theses

In this thesis, we discuss the importance of data normalization in deep learning and its relationship with generalization. Normalization is a staple of deep learning architectures and has been shown to improve the stability and generalizability of deep learning models, yet the reason why these normalization techniques work is still unknown and is an active area of research. Inspired by this uncertainty, we explore how different normalization techniques perform when employed in different deep learning architectures, while also exploring generalization and metrics associated with generalization in congruence with our investigation into normalization. The goal behind our experiments was to investigate …


InfoMixup: An Intuitive And Information-Driven Approach To Robust Generalization, Andrew H. Meyer Aug 2022

Theses

The discovery of adversarial examples, data points which are easily recognized by humans but which fool artificial classifiers with ease, is relatively new in the world of machine learning. Corruptions imperceptible to the human eye are often sufficient to fool state-of-the-art classifiers. The resolution of this problem has been the subject of a great deal of research in recent years as the prevalence of Deep Neural Networks grows in everyday systems. To this end, we propose InfoMixup, a novel method to improve the robustness of Deep Neural Networks without significantly affecting performance on clean samples. …


Learning Representations In The Hyperspectral Domain In Aerial Imagery, Aneesh Rangnekar Aug 2022

Theses

We establish two new datasets with baselines and network architectures for the task of hyperspectral image analysis. The first dataset, AeroRIT, is a moving camera static scene captured from a flight and contains per-pixel labeling across five categories for the task of semantic segmentation. The second dataset, RooftopHSI, helps design and interpret learnt features on hyperspectral object detection on scenes captured from a university rooftop. This dataset accounts for static camera, moving scene hyperspectral imagery. We further broaden the scope of our understanding of neural networks with the development of two novel algorithms, S4AL and S4AL+. We develop …


DeepRM: Deep Recurrent Matching For 6D Pose Refinement, Alexander Avery May 2022

Theses

Precise 6D pose estimation of rigid objects from RGB images is a critical but challenging task in robotics and augmented reality. To address this problem, we propose DeepRM, a novel recurrent network architecture for 6D pose refinement. DeepRM leverages initial coarse pose estimates to render synthetic images of target objects. The rendered images are then matched with the observed images to predict a rigid transform for updating the previous pose estimate. This process is repeated to incrementally refine the estimate at each iteration. LSTM units are used to propagate information through each refinement step, significantly improving overall performance. In contrast …


A Depth-Based Computer Vision Approach To Unmanned Aircraft System Landing With Optimal Positioning, Nicholas Quattrociocchi Apr 2022

Theses

High traffic congestion in cities can lead to difficulties in delivering appropriate aid to people in need of emergency services. Developing an autonomous aerial medical evacuation system of the required size can mitigate this constraint. The aerial system must be capable of vertical takeoff and landing to reach highly congested areas and areas that traditional aircraft cannot access. In general, the most challenging limitation within any proposed solution is the landing sequence. There have been several techniques developed over the years to land aircraft autonomously; however, very little attention has been scoped …


Deep Feature Learning And Adaptation For Computer Vision, Abu Md Niamul Taufique Apr 2022

Theses

We are living in times when a revolution of deep learning is taking place. In general, deep learning models have a backbone that extracts features from the input data followed by task-specific layers, e.g. for classification. This dissertation proposes various deep feature extraction and adaptation methods to improve task-specific learning, such as visual re-identification, tracking, and domain adaptation. The vehicle re-identification (VRID) task requires identifying a given vehicle among a set of vehicles under variations in viewpoint, illumination, partial occlusion, and background clutter. We propose a novel local graph aggregation module for feature extraction to improve VRID performance. We also …


Learning Multi-Step Robotic Manipulation Tasks Through Visual Planning, Sulabh Kumra Apr 2022

Theses

Multi-step manipulation tasks in unstructured environments are extremely challenging for a robot to learn. Such tasks interlace high-level reasoning that consists of the expected states that can be attained to achieve an overall task and low-level reasoning that decides what actions will yield these states. A model-free deep reinforcement learning method is proposed to learn multi-step manipulation tasks. This work introduces a novel Generative Residual Convolutional Neural Network (GR-ConvNet) model that can generate robust antipodal grasps from n-channel image input at real-time speeds (20 ms). The proposed model architecture achieved state-of-the-art accuracy on three standard grasping datasets. The adaptability of …


Towards Efficient Lifelong Machine Learning In Deep Neural Networks, Tyler L. Hayes Mar 2022

Theses

Humans continually learn and adapt to new knowledge and environments throughout their lifetimes. Rarely does learning new information cause humans to catastrophically forget previous knowledge. While deep neural networks (DNNs) now rival human performance on several supervised machine perception tasks, when updated on changing data distributions, they catastrophically forget previous knowledge. Enabling DNNs to learn new information over time opens the door for new applications such as self-driving cars that adapt to seasonal changes or smartphones that adapt to changing user preferences. In this dissertation, we propose new methods and experimental paradigms for efficiently training continual DNNs without forgetting. We …


Multi-Scale Architectures For Human Pose Estimation, Bruno Artacho Mar 2022

Theses

In this dissertation we present multiple state-of-the-art deep learning methods for computer vision tasks using multi-scale approaches for two main tasks: pose estimation and semantic segmentation. For pose estimation, we introduce a complete framework expanding the fields-of-view of the network through a multi-scale approach, significantly increasing the effectiveness of conventional backbone architectures for several pose estimation tasks without requiring a larger network or postprocessing. Our multi-scale pose estimation framework contributes to research on methods for single-person pose estimation in both 2D and 3D scenarios, pose estimation in videos, and the estimation of multiple people’s pose in a …


Reducing Catastrophic Forgetting In Self-Organizing Maps, Hitesh Ulhas Mangala Vaidya Nov 2021

Theses

An agent that is capable of continual or lifelong learning is able to continuously learn from potentially infinite streams of pattern sensory data. One major historic difficulty in building agents capable of such learning is that neural systems struggle to retain previously-acquired knowledge when learning from new data samples. This problem is known as catastrophic forgetting and remains an unsolved problem in the domain of machine learning to this day. To overcome catastrophic forgetting, different approaches have been proposed. One major line of thought advocates the use of memory buffers to store data where the stored data is then used …


HandyPose And VehiPose: Pose Estimation Of Flexible And Rigid Objects, Divyansh Gupta Nov 2021

Theses

Pose estimation is an important and challenging task in computer vision. Hand pose estimation has drawn increasing attention during the past decade and has been utilized in a wide range of applications including augmented reality, virtual reality, human-computer interaction, and action recognition. Hand pose is more challenging than general human body pose estimation due to the large number of degrees of freedom and the frequent occlusions of joints. To address these challenges, we propose HandyPose, a single-pass, end-to-end trainable architecture for hand pose estimation. Adopting an encoder-decoder framework with multi-level features, our method achieves high accuracy in hand pose while …


GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features With Spatial And Channel Attention, Udit Sharma Nov 2021

Theses

Deep learning and computer vision are extensively used to solve problems in a wide range of domains from automotive and manufacturing to healthcare and surveillance. Research in deep learning for food images is mainly limited to food identification and detection. Food segmentation is an important problem as the first step for nutrition monitoring, food volume and calorie estimation. This research is intended to expand the horizons of deep learning and semantic segmentation by proposing a novel single-pass, end-to-end trainable network for food segmentation. Our novel architecture incorporates both channel attention and spatial attention information in an expanded multi-scale feature representation using …


Semantic Segmentation And Change Detection In Satellite Imagery, Raaga Madappa Oct 2021

Theses

Processing of satellite images using deep learning and computer vision methods is needed for urban planning, crop assessments, disaster management, and rescue and recovery operations. Deep learning methods which are trained on ground-based imagery do not translate well to satellite imagery. In this thesis, we focus on the tasks of semantic segmentation and change detection in satellite imagery. A segmentation framework is presented based on existing waterfall-based modules. The proposed framework, called PyramidWASP, or PyWASP for short, can be used with two modules. PyWASP with the Waterfall Atrous Spatial Pooling (WASP) module investigates the effects of adding a feature pyramid …


Real-Time UAV Pose Estimation And Tracking Using FPGA Accelerated April Tag, Ethan Tola Jul 2021

Theses

April Tags and other passive fiducial markers are widely used to determine localization using a monocular camera. These systems utilize specialized algorithms that detect markers to calculate their orientation and distance in three-dimensional (3-D) space. The video and image processing steps performed to use these fiducial systems dominate the computation time of the algorithms. Low latency is a key component for the real-time application of these fiducial markers. The drawback of performing the video and image processing in software is the difficulty in performing the same operation in parallel effectively. Specialized hardware instantiations with the same algorithms can efficiently parallelize …


Domain Generalization And Adaptation With Generative Modeling And Representation Learning, Jaideep Vitthal Murkute Jul 2021

Theses

Despite the success of deep learning methods on object recognition tasks, one of the challenges deep learning systems face in the real world is the ability to perform well on visually different data samples, i.e., under a distribution shift caused by samples of the same object category but from a significantly different visual domain. Many approaches have been proposed in both of these settings; however, few works focus on generative modeling in this context or on studying the structure of hidden representations learned by deep learning models. We hypothesize that learning the generative …


RM-Net: Rasterizing Markov Signals To Images For Deep Learning, Kajal Gupta May 2021

Theses

Statistical machine learning approaches are widely used for processing Markov signal data. They can model unobserved states and learn certain characteristics particular to a signal with good accuracy. However, with the advent of deep learning, approaches to solving such problems have shifted towards these more sophisticated algorithms, which are more powerful and more accurate. Specifically, Convolutional Neural Nets (CNNs) have shown many promising results on images and videos. Here we illustrate how a CNN can be applied to a 1D numeric signal using a signal rasterization technique. We start by rasterizing a 1D numeric Markov signal into an image …


Robust Multiple Object Tracking Using ReID Features And Graph Convolutional Networks, Christian Lusardi May 2021

Theses

Deep learning allows for great advancements in computer vision research and development. An area that is garnering attention is single object tracking and multi-object tracking. Object tracking continues to progress rapidly in terms of detection and building re-identification features, but more effort needs to be dedicated to data association. In this thesis, the goal is to use a graph neural network to combine the information from both the bounding box interaction and the appearance features in a single association chain. This work is designed to explore the usage of graph neural networks and their message passing abilities …


Development Of A Low-Cost Eye Screening Tool For Early Detection Of Diabetic Retinopathy Using Deep Neural Network, David Mwanguo Mwasikira Apr 2021

Theses

It has been said that technology used in the lab does not directly transfer to what is done in healthcare. Research on the use of Artificial Intelligence (AI) in the diagnosis of Diabetic Retinopathy (DR) has seen tremendous growth over the last couple of years, but it is also true that not much of that knowledge has been transferred into practice to benefit patients in need. One reason is that it’s a new frontier with untested technologies and one that is evolving too fast. Also, the real healthcare situation can be very complicated, presenting numerous challenges starting with strict …


Automatic Pain Assessment Through Facial Expressions, Ilham Seladji Dec 2020

Theses

Pain is a strong symptom of diseases. Being an involuntary unpleasant feeling, it can be considered as a reliable indicator of health issues. Pain has always been expressed verbally, but in some cases, traditional patient self-reporting is not efficient. On one hand, there are patients who have neurological disorders and cannot express themselves accurately, as well as patients who suddenly lose consciousness due to an abrupt faintness. On the other hand, medical staff working in crowded hospitals need to focus on emergencies and would opt for the automation of the task of looking after hospitalized patients during their entire stay, in …


Open Set Classification For Deep Learning In Large-Scale And Continual Learning Models, Ryne Roady Aug 2020

Theses

Supervised classification methods often assume the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers require the ability to recognize inputs from outside the training set as unknowns and update representations in near real-time to account for novel concepts unknown during offline training. This problem has been studied under multiple paradigms including out-of-distribution detection and open set recognition; however, for convolutional neural networks, there have been two major approaches: 1) inference methods to separate known inputs from unknown inputs and 2) feature space regularization …


Point Completion Networks And Segmentation Of 3D Mesh, Naga Durga Harish Kanamarlapudi Aug 2020

Theses

Deep learning has made many advancements in fields such as computer vision, natural language processing and speech processing. In autonomous driving, deep learning has made great improvements pertaining to the tasks of lane detection, steering estimation, throttle control, depth estimation, 2D and 3D object detection, object segmentation and object tracking. Understanding the 3D world is necessary for safe end-to-end self-driving. 3D point clouds provide rich 3D information, but processing point clouds is difficult since point clouds are irregular and unordered. Neural point processing methods like GraphCNN and PointNet operate on individual points for accurate classification and segmentation results. Occlusion of …


Context Sensitive Image Denoising And Enhancement Using U-Nets, Sahaj Tushar Gandhi Aug 2020

Theses

Noise in images gets introduced at almost every stage of the camera image signal processing pipeline (ISP). Camera companies provide software that cleans most of the noise added at each stage. Even after noise removal is done by the camera software, different noise patterns with different intensities remain in the image. With advances in deep learning, the algorithms are architected end-to-end. In the present time, machine learning and deep learning models work as end-to-end systems with a special-purpose feature extraction phase. This thesis focuses on the removal of any residual noise in images as performed during the feature extraction …


Gaze Estimation Based On Multi-View Geometric Neural Networks, Devarth Parikh Jul 2020

Theses

Gaze and head pose estimation can play essential roles in various applications, such as human attention recognition and behavior analysis. Most deep neural network-based gaze estimation techniques use supervised regression, in which neural networks extract features from eye images and regress 3D gaze vectors. I plan to apply the geometric features of the eyes to determine the gaze vectors of observers relying on the concepts of 3D multiple view geometry. We develop an end-to-end CNN framework for gaze estimation using 3D geometric constraints under semi-supervised and unsupervised settings and compare the results. We explore the …


Self-Supervision Initialization For Semantic Segmentation Networks, Kenneth Alexopoulos Jun 2020

Theses

Convolutional neural networks excel at extracting features from signals. These features can be utilized for many downstream tasks, including object recognition, object detection, depth estimation, pixel-level semantic segmentation, and more. These tasks can be used for applications such as autonomous driving where images captured by a camera can be used to give a detailed understanding of the scene. While these models are impressive, they can fail to generalize to new environments. This forces the cumbersome process of collecting images from multifarious environments and annotating them by hand. Annotating thousands or millions of images is both …


Clearing The Clouds: Extracting 3D Information From Amongst The Noise, Alexander Fafard May 2020

Theses

Advancements permitting the rapid extraction of 3D point clouds from a variety of imaging modalities across the global landscape have provided a vast collection of high fidelity digital surface models. This has created a situation with unprecedented overabundance of 3D observations which greatly outstrips our current capacity to manage and infer actionable information. While years of research have removed some of the manual analysis burden for many tasks, human analysis is still a cornerstone of 3D scene exploitation. This is especially true for complex tasks which necessitate comprehension of scale, texture and contextual learning. In order to ameliorate the interpretation …


AR Comic Chat, Dylan Bowald May 2020

Theses

Live speech transcription and captioning are important for accessibility for deaf and hard-of-hearing individuals, especially in situations with no visible ASL translators. If live captioning is available at all, it is typically rendered in the style of closed captions on a display such as a phone screen or TV and away from the real conversation. This can potentially divide the focus of the viewer and detract from the experience. This paper proposes an investigation into an alternative, Augmented Reality-driven approach to the display of these captions, using deep neural networks to compute, track and associate deep …


Improving Omnidirectional Camera-Based Robot Localization Through Self-Supervised Learning, Robert Relyea May 2020

Theses

Autonomous agents in any environment require accurate and reliable position and motion estimation to complete their required tasks. Many different sensor modalities have been utilized for this task such as GPS, ultra-wide band, visual simultaneous localization and mapping (SLAM), and light detection and ranging (LiDAR) SLAM. Many of the traditional positioning systems do not take advantage of the recent advances in the machine learning field. In this work, an omnidirectional camera position estimation system relying primarily on a learned model is presented. The positioning system benefits from the wide field of view provided by an omnidirectional camera. Recent developments in …