Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Artificial Intelligence and Robotics

PDF

Theses/Dissertations

Computer Vision

Articles 1 - 30 of 43

Full-Text Articles in Entire DC Network

Context In Computer Vision: A Taxonomy, Multi-Stage Integration, And A General Framework, Xuan Wang Jun 2024

Dissertations, Theses, and Capstone Projects

Contextual information has been widely used in many computer vision tasks, such as object detection, video action detection, and image classification. Recognizing a single object or action out of context can sometimes be very challenging, and contextual information may greatly improve the understanding of a scene or an event. However, existing approaches design specific contextual information mechanisms for different detection tasks.

In this research, we first present a comprehensive survey of context understanding in computer vision, with a taxonomy to describe context in different types and levels. Then we propose MultiCLU, a new multi-stage context learning and utilization framework, …


Deep Learning-Based Human Action Understanding In Videos, Elahe Vahdani Feb 2024

Dissertations, Theses, and Capstone Projects

The understanding of human actions in videos holds immense potential for technological advancement and societal betterment. This thesis explores fundamental aspects of this field, including action recognition in trimmed clips and action localization in untrimmed videos. Trimmed videos contain only one action instance, with moments before or after the action excluded from the video. However, the majority of videos captured in unconstrained environments, often referred to as untrimmed videos, are naturally unsegmented. Untrimmed videos are typically lengthy and may encompass multiple action instances, along with the moments preceding or following each action, as well as transitions between actions. In the …


Using Pose Estimation Software To Predict Actions In Sabre Fencing, Micah Edwin Peters II Jan 2024

Honors College Theses

Fencing is a combat sport that uses three different swords: epee, foil, and sabre. Due to its fast-paced nature and employment of right of way, sabre fencing is often considered the most difficult of the three to learn. Computer vision and pose estimation software can be used to lower the barrier of entry to sabre fencing by identifying the different actions in sabre fencing. This project focuses on using open-source software to design a program that can identify the sabre parries as well as the main sabre movements. This program could be used to help newer fencers and spectators better …


Generalized Differentiable Neural Architecture Search With Performance And Stability Improvements, Emily J. Herron Dec 2023

Doctoral Dissertations

This work introduces improvements to the stability and generalizability of Cyclic DARTS (CDARTS). CDARTS is a Differentiable Architecture Search (DARTS)-based approach to neural architecture search (NAS) that uses a cyclic feedback mechanism to train search and evaluation networks concurrently, thereby optimizing the search process by enforcing that the networks produce similar outputs. However, the dissimilarity between the loss functions used by the evaluation networks during the search and retraining phases makes the search-phase evaluation network a sub-optimal proxy for the final evaluation network utilized during retraining. ICDARTS, a revised algorithm that reformulates the search phase loss functions to ensure …


Deep Learning For Photovoltaic Characterization, Adrian Manuel De Luis Garcia Dec 2023

Graduate Theses and Dissertations

This thesis introduces a novel approach to Photovoltaic (PV) installation segmentation by proposing a new architecture to understand and identify PV modules from overhead imagery. Pivotal to this concept is the creation of a new Transformer-based network, S3Former, which focuses on small object characterization and modelling intra- and inter-object differentiation inside an image. Accurate mapping of PV installations is pivotal for understanding their adoption and guiding energy policy decisions. Drawing insights from current Deep Learning methodologies for image segmentation and building upon State-of-the-Art (SOTA) techniques in solar cell mapping, this work puts forth S3Former with the following enhancements: 1. …


Smartphone Based Object Detection For Shark Spotting, Darrick W. Oliver Nov 2023

Master's Theses

Given concern over shark attacks in coastal regions, the recent use of unmanned aerial vehicles (UAVs), or drones, has increased to ensure the safety of beachgoers. However, much of city officials' process remains manual, with drone operation and review of footage still playing a significant role. In pursuit of a more automated solution, researchers have turned to the usage of neural networks to perform detection of sharks and other marine life. For on-device solutions, this has historically required assembling individual hardware components to form an embedded system to utilize the machine learning model. This means that the camera, neural processing …


AI Applications On Planetary Rovers, Alexis David Pascual Mar 2023

Electronic Thesis and Dissertation Repository

The rise in the number of robotic missions to space is paving the way for the use of artificial intelligence and machine learning in the autonomy and augmentation of rover operations. For one, more rovers mean more images, and more images mean more data bandwidth required for downlinking as well as more mental bandwidth for analyzing the images. On the other hand, light-weight, low-powered microrover platforms are being developed to accommodate the drive for planetary exploration. As a result of the mass and power constraints, these microrover platforms will not carry typical navigational instruments like a stereocamera or a laser …


Reducing Negative Transfer Of Random Data In Source-Free Unsupervised Domain Adaptation, Anthony Wong Mar 2023

Electronic Thesis and Dissertation Repository

In domain adaptation, a model trained on one dataset (source domain) is applied to a different but related dataset (target domain). The most cutting-edge method is unsupervised source-free domain adaptation (SFDA), in which source data, source labels, and target labels are not available during adaptation. This thesis explores a realistic scenario where the target dataset includes some images that are unrelated to the adaptation process. This scenario can occur from errors in data collection or processing. We provide experiments and analysis to show that current state-of-the-art (SOTA) SFDA methods suffer significant performance drops under a specific domain adaptation setup when …


Region Detection & Segmentation Of Nissl-Stained Rat Brain Tissue, Alexandro Arnal Dec 2022

Open Access Theses & Dissertations

People who analyze images of biological tissue rely on the segmentation of structures as a preliminary step. In particular, laboratories studying the rat brain delineate brain regions to position scientific findings on a brain atlas to propose hypotheses about the rat brain and, ultimately, the human brain. Our work intersects with the preliminary step of delineating regions in images of brain tissue via computational methods.

We investigate pixel-wise classification or segmentation of brain regions using ten histological images of brain tissue sections stained for Nissl substance. We present a deep learning approach that uses the fully convolutional neural network, U-Net, …


Performance Enhancement Of Hyperspectral Semantic Segmentation Leveraging Ensemble Networks, Nicholas Soucy Dec 2022

Electronic Theses and Dissertations

Hyperspectral image (HSI) semantic segmentation is a growing field within computer vision, machine learning, and forestry. Due to the separate nature of these communities, research applying deep learning techniques to ground-type semantic segmentation needs improvement, along with working to bring the research and expectations of these three communities together. Semantic segmentation consists of classifying individual pixels within the image based on the features present. Many issues need to be resolved in HSI semantic segmentation including data preprocessing, feature reduction, semantic segmentation techniques, and adversarial training. In this thesis, we tackle these challenges by employing ensemble methods for HSI semantic segmentation. …
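
The per-pixel classification described in this abstract can be illustrated with a minimal sketch. It assumes some model has already produced one score per class for every pixel; the function name `segment_pixels`, the nested-list layout, and the toy scores are hypothetical and are not taken from the thesis, which uses ensemble networks.

```python
def segment_pixels(scores):
    """Assign each pixel the index of its highest-scoring class.

    `scores` is a nested list shaped [height][width][num_classes];
    this layout is an assumption made for illustration only.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Toy 2x2 image with three hypothetical ground-type classes.
toy_scores = [
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
]
label_map = segment_pixels(toy_scores)
```

Here `label_map` holds one class index per pixel, which is the output format a semantic segmentation model is expected to produce.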


Impact Of Movements On Facial Expression Recognition, Zhebin Yin Jun 2022

Honors Theses

The ability to recognize human emotions can be a useful skill for robots. Emotion recognition can help robots understand our responses to robot movements and actions. Human emotions can be recognized through facial expressions. Facial Expression Recognition (FER) is a well-established research area; however, the majority of prior research is based on static datasets of images. With robots, often the subject is moving, the robot is moving, or both. The purpose of this research is to determine the impact of movement on facial expression recognition. We apply a pre-existing model for FER, which performs around 70.86% on a given …


Gesture Recognition Using Neural Networks, Ashwini Kurady Jan 2022

Master's Projects

The advances in technology have brought in a lot of changes in the way humans go about their lives. This has enhanced the significance of Artificial Neural Networks and Computer Vision-based interactions with the world. Gesture Recognition is one of the major focus areas in Computer Vision. This involves Human Computer Interfaces (HCI) that would capture and understand human actions. In this project, we will explore how Neural Network concepts can be applied in this challenging field of Computer Vision. By leveraging the latest research for Gesture Recognition, we researched how to capture the movement across different frames …


License Plate Image Quality Enhancement Utilizing Super Resolution Generative Adversarial Networks, Mark Moelter Jan 2022

Electronic Theses and Dissertations

This thesis focuses primarily on enhancing the image quality of blurred license plates through the use of Super-Resolution Generative Adversarial Networks (SRGANs) [1]. We propose a synthetic dataset with an SRGAN model to promote blurred image quality enhancement and to allow for model evaluation on a multitude of image input and output size combinations. SRGAN is mainly used for low-resolution image enhancement, but by heavily blurring the input images, the model is tested on its ability to blindly deblur and upsample images to the desired super-resolution (SR) size. The model enhances the image quality to nearly that of the reference images. The …


Task Classification During Visual Search Using Classic Machine Learning And Deep Learning, Devangi Vilas Chinchankar Dec 2021

Master's Projects

In an average human life, the eyes not only passively scan visual scenes, but most times end up actively performing tasks including, but not limited to, searching, comparing, and counting. As a result of the advances in technology, we are observing a boost in the average screen time. Humans are now looking at an increasing number of screens and in turn images and videos. Understanding what scene a user is looking at and what type of visual task is being performed can be useful in developing intelligent user interfaces, and in virtual reality and augmented reality devices. In this research, …


Analysis Of Camera Trap Footage Through Subject Recognition, Nirnayak Bhardwaj Dec 2021

Master's Projects

Motion-sensitive cameras, otherwise known as camera traps, have become increasingly popular amongst ecologists for studying wildlife. These cameras allow scientists to remotely observe animals through an inexpensive and non-invasive approach. Due to the lenient nature of motion cameras, studies involving them often generate excessive amounts of footage with many photographs not containing any animal subjects. Thus, there is a need for a system that is capable of analyzing camera trap footage to determine if a picture holds value for researchers. While research into automated image recognition is well documented, it has had limited applications in the field of ecology. This …


Multi-Modal Data Fusion, Image Segmentation, And Object Identification Using Unsupervised Machine Learning: Conception, Validation, Applications, And A Basis For Multi-Modal Object Detection And Tracking, Nicholas Lahaye Aug 2021

Computational and Data Sciences (PhD) Dissertations

Remote sensing and instrumentation are constantly improving and increasing in capability. Included within this is the increase in the number of different instrument types, with various combinations of spatial and spectral resolutions, pointing angles, and various other instrument-specific qualities. While the increase in instruments, and therefore datasets, is a boon for those aiming to study the complexities of the various Earth systems, it can also present a large number of new challenges. With this information in mind, our group has set our aims on combining datasets with different spatial and spectral resolutions in an effective and as-general-as-possible way, with as little …


Forecasting Pedestrian Trajectory Using Deep Learning, Arsal Syed Aug 2021

UNLV Theses, Dissertations, Professional Papers, and Capstones

In this dissertation we develop different methods for forecasting pedestrian trajectories. Complete understanding of pedestrian motion is essential for autonomous agents and social robots to make realistic and safe decisions. Current trajectory prediction methods rely on incorporating historic motion, scene features and social interaction to model pedestrian behaviors. Our focus is to accurately understand scene semantics to better forecast trajectories. In order to do so, we leverage semantic segmentation to encode static scene features such as walkable paths, entry/exits, static obstacles etc. We further evaluate the effectiveness of using semantic maps on different datasets and compare its performance with already …


Take The Lead: Toward A Virtual Video Dance Partner, Ty Farris Aug 2021

Master's Theses

My work focuses on taking a single person as input and predicting the intentional movement of one dance partner based on the other dance partner's movement. Human pose estimation has been applied to dance and computer vision, but many existing applications focus on a single individual or multiple individuals performing. Currently there are very few works that focus specifically on dance couples combined with pose prediction. This thesis is applicable to the entertainment and gaming industry by training people to dance with a virtual dance partner.

Many existing interactive or virtual dance partners require a motion capture system, multiple cameras …


Signal Processing And Data Analysis For Real-Time Intermodal Freight Classification Through A Multimodal Sensor System., Enrique J. Sanchez Headley Jul 2021

Graduate Theses and Dissertations

Identifying freight patterns in transit is a common need among commercial and municipal entities. For example, the allocation of resources among Departments of Transportation is often predicated on an understanding of freight patterns along major highways. There exist multiple sensor systems to detect and count vehicles at areas of interest. Many of these sensors are limited in their ability to detect more specific features of vehicles in traffic or are unable to perform well in adverse weather conditions. Despite this limitation, to date there is little comparative analysis among Laser Imaging, Detection, and Ranging (LIDAR) sensors for freight detection …


Perceptually Improved Medical Image Translations Using Conditional Generative Adversarial Networks, Anurag Vaidya Jan 2021

Honors Theses

Magnetic resonance imaging (MRI) can help visualize various brain regions. Typical MRI sequences consist of T1-weighted sequence (favorable for observing large brain structures), T2-weighted sequence (useful for pathology), and T2-FLAIR scan (useful for pathology with suppression of signal from water). While these different scans provide complementary information, acquiring them leads to acquisition times of ~1 hour and an average cost of $2,600, presenting significant barriers. To reduce these costs associated with brain MRIs, we present pTransGAN, a generative adversarial network capable of translating both healthy and unhealthy T1 scans into T2 scans. We show that the addition of non-adversarial …


Attentional Parsing Networks, Marcus Karr Dec 2020

Master's Theses

Convolutional neural networks (CNNs) have dominated the computer vision field since the early 2010s, when deep learning largely replaced previous approaches like hand-crafted feature engineering and hierarchical image parsing. Meanwhile transformer architectures have attained preeminence in natural language processing, and have even begun to supplant CNNs as the state of the art for some computer vision tasks.

This study proposes a novel transformer-based architecture, the attentional parsing network, that reconciles the deep learning and hierarchical image parsing approaches to computer vision. We recast unsupervised image representation as a sequence-to-sequence translation problem where image patches are mapped to successive layers …


Dataset And Evaluation Of Self-Supervised Learning For Panoramic Depth Estimation, Ryan Nett Dec 2020

Master's Theses

Depth detection is a very common computer vision problem. It shows up primarily in robotics, automation, or 3D visualization domains, as it is essential for converting images to point clouds. One of the poster child applications is self-driving cars. Currently, the best methods for depth detection are either very expensive, like LIDAR, or require precise calibration, like stereo cameras. These costs have given rise to attempts to detect depth from a monocular camera (a single camera). While this is possible, it is harder than LIDAR or stereo methods: since depth can't be measured from monocular images, it has to …


Attacking Computer Vision Models Using Occlusion Analysis To Create Physically Robust Adversarial Images, Jacobsen Loh Jun 2020

Master's Theses

Self-driving cars rely on their sense of sight to function effectively in chaotic and uncontrolled environments. Thanks to recent developments in computer vision, specifically convolutional neural networks, autonomous vehicles have developed the ability to see at or above human-level capabilities, which in turn has allowed for rapid advances in self-driving cars. Unfortunately, much like humans being confused by simple optical illusions, convolutional neural networks are susceptible to simple adversarial inputs. As there is no overlap between the optical illusions that fool humans and the adversarial examples that threaten convolutional neural networks, little is understood as to why these adversarial examples …


Detection Of Mild Cognitive Impairment Using Diffusion Compartment Imaging, Matthew Jones May 2020

Master's Projects

The result of applying the Neurite Orientation Dispersion and Density Imaging (NODDI) algorithm to improve the prediction accuracy for patients diagnosed with MCI is reported. Calculations were carried out using a collection of 68 patients (34 control and 34 with MCI) gathered from the Alzheimer’s Disease Neuroimaging Initiative database (ADNI). Patient data includes the use of high-resolution Magnetic Resonance Images as well as Diffusion Tensor Imaging. A Linear Regression accuracy of 83% was observed using the added NODDI summary statistic: Orientation Dispersion Index (ODI). A statistically significant difference in groups was found between control patients and patients with MCI with …


Estimating Free-Flow Speed With Lidar And Overhead Imagery, Armin Hadzic Jan 2020

Theses and Dissertations--Computer Science

Understanding free-flow speed is fundamental to transportation engineering in order to improve traffic flow, control, and planning. The free-flow speed of a road segment is the average speed of automobiles unaffected by traffic congestion or delay. Collecting speed data across a state is both expensive and time consuming. Some approaches have been presented to estimate speed using geometric road features for certain types of roads in limited environments. However, estimating speed at state scale for varying landscapes, environments, and road qualities has been relegated to manual engineering and expensive sensor networks. This thesis proposes an automated approach for estimating free-flow …
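
The definition of free-flow speed given in this abstract can be written out as a short sketch. This only illustrates the definition from direct speed observations; it is not the thesis's estimation method, which works from LiDAR and overhead imagery. The function name `free_flow_speed`, the `(speed_kmh, is_congested)` record layout, and the sample values are all hypothetical.

```python
from statistics import mean


def free_flow_speed(observations):
    """Average the speeds of vehicles recorded without congestion or delay.

    Each observation is a (speed_kmh, is_congested) pair; both the record
    layout and the congestion flag are assumptions made for illustration.
    """
    free_speeds = [speed for speed, congested in observations if not congested]
    if not free_speeds:
        raise ValueError("no uncongested observations available")
    return mean(free_speeds)


# Hypothetical observations for one road segment.
segment_observations = [(88.0, False), (92.0, False), (35.0, True), (90.0, False)]
estimate = free_flow_speed(segment_observations)
```

Collecting such observations for every road segment in a state is the expensive step that the abstract says motivates an automated estimate.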


Representation Learning With Adversarial Latent Autoencoders, Stanislav Pidhorskyi M.S. Jan 2020

Graduate Theses, Dissertations, and Problem Reports

A large number of deep learning methods applied to computer vision problems require encoder-decoder maps. These methods include, but are not limited to, self-representation learning, generalization, few-shot learning, and novelty detection. Encoder-decoder maps are also useful for photo manipulation, photo editing, superresolution, etc. Encoder-decoder maps are typically learned using autoencoder networks.
Traditionally, autoencoder reciprocity is achieved in the image-space using pixel-wise similarity loss, which has a widely known flaw of producing non-realistic reconstructions. This flaw is typical for the Variational Autoencoder (VAE) family and is not only limited to pixel-wise similarity losses, but is common to all methods relying upon …
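
As a rough illustration of what a pixel-wise similarity loss measures, the sketch below computes the mean squared error between an image and its reconstruction. Mean squared error is one common choice and is assumed here; the abstract does not say which pixel-wise loss it has in mind. The function name `pixelwise_mse` and the toy images are hypothetical.

```python
def pixelwise_mse(image, reconstruction):
    """Mean squared error over corresponding pixels of two equal-sized images.

    Images are nested lists of grayscale intensities ([height][width]);
    this representation is an assumption made for illustration.
    """
    if len(image) != len(reconstruction):
        raise ValueError("images must have the same height")
    total = 0.0
    count = 0
    for row_a, row_b in zip(image, reconstruction):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same width")
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


# A sharp checkerboard and a uniformly gray "reconstruction" of it.
original = [[0.0, 1.0], [1.0, 0.0]]
blurry = [[0.5, 0.5], [0.5, 0.5]]
loss = pixelwise_mse(original, blurry)
```

The gray output is the per-pixel average of the checkerboard and its inverse, so it minimizes this loss when both are equally likely targets, even though it looks like neither. That averaging effect is one way to picture the non-realistic reconstructions the abstract mentions.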


A Study Of Face Embedding In Face Recognition, Khanh Duc Le Mar 2019

Master's Theses

Face Recognition has been a long-standing topic in the computer vision and pattern recognition field because of its wide and important applications in our daily lives, such as surveillance systems, access control, and so on. The current modern face recognition model, which keeps only a couple of images per person in the database, can now recognize a face with high accuracy. Moreover, the model does not need to be retrained every time a new person is added to the database.

By using the face dataset from Digital Democracy, the thesis will explore the capability of this model by comparing it with …


Exploring Cyber-Physical Systems, Misbah Uddin Mohammed Jan 2019

Graduate Research Theses & Dissertations

The advances in IoT, Computer Vision, AI and Machine Learning have made these technologies ubiquitous in our daily lives. From Smart Phones to Connected Vehicles, Cyber Physical systems have been interspersed into everything we interact with in today’s world. The aim of this thesis was to explore these advances in Cyber Physical Systems and analyze the different sectors they were affecting. We then hand-picked certain domains and explored further by carrying out practical projects using some of the latest software and hardware resources available. Technologies like Amazon Alexa services, NVIDIA Jetson boards, TensorFlow, OpenCV, NodeJS were heavily employed in our various …


Automatic Identification Of Animals In The Wild: A Comparative Study Between C-Capsule Networks And Deep Convolutional Neural Networks., Joel Kamdem Teto, Ying Xie Nov 2018

Master of Science in Computer Science Theses

The evolution of machine learning and computer vision in technology has driven a lot of improvements and innovation into several domains. We see it being applied for credit decisions, insurance quotes, malware detection, fraud detection, email composition, and any other area having enough information to allow the machine to learn patterns. Over the years the number of sensors, cameras, and cognitive pieces of equipment placed in the wilderness has been growing exponentially. However, the resources (human) to leverage these data into something meaningful are not improving at the same rate. For instance, a team of scientist volunteers took 8.4 years, …


Integration Of Robotic Perception, Action, And Memory, Li Yang Ku Oct 2018

Doctoral Dissertations

In the book "On Intelligence", Hawkins states that intelligence should be measured by the capacity to memorize and predict patterns. I further suggest that the ability to predict action consequences based on perception and memory is essential for robots to demonstrate intelligent behaviors in unstructured environments. However, traditional approaches generally represent action and perception separately---as computer vision modules that recognize objects and as planners that execute actions based on labels and poses. I propose here a more integrated approach where action and perception are combined in a memory model, in which a sequence of actions can be planned based on …