Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Computer Vision

Articles 1 - 30 of 133

Full-Text Articles in Engineering

High-Fidelity 3d Reconstruction Of Space Bodies Using Machine Learning And Neural Radiance Fields, Timothy Jacob Huber May 2024

Theses and Dissertations

In the era of burgeoning space exploration, the growing population of spacecraft heightens the inevitability of collisions. While traditional imagery remains effective for damage assessment, its lack of 3D representation of the object necessitates more advanced approaches. This research delves into cutting-edge methodologies, with a primary focus on leveraging machine learning and computer vision technologies to enhance the precision and efficiency of damage assessment by taking imagery of a space body and creating a 3D model. The study extensively investigates TransMVSNet, a neural network code employing classical computer vision techniques such as multi-view stereo (MVS) and depth maps. This approach …


Insights Into Cellular Evolution: Temporal Deep Learning Models And Analysis For Cell Image Classification, Xinran Zhao Mar 2024

Master's Theses

Understanding the temporal evolution of cells poses a significant challenge in developmental biology. This study embarks on a comparative analysis of various machine-learning techniques to classify cell colony images across different timestamps, thereby aiming to capture dynamic transitions of cellular states. By performing Transfer Learning with state-of-the-art classification networks, we achieve high accuracy in categorizing single-timestamp images. Furthermore, this research introduces the integration of temporal models, notably LSTM (Long Short Term Memory Network), R-Transformer (Recurrent Neural Network enhanced Transformer) and ViViT (Video Vision Transformer), to undertake this classification task to verify the effectiveness of incorporating temporal features into the classification …


Towards Multi-Modal Interpretable Video Understanding, Quang Sang Truong Dec 2023

Graduate Theses and Dissertations

This thesis introduces an innovative approach to video comprehension, which simulates human perceptual mechanisms and establishes a comprehensible and coherent narrative representation of video content. At the core of this approach lies the creation of a Visual-Linguistic (VL) feature for an interpretable video portrayal and an adaptive attention mechanism (AAM) aimed at concentrating solely on principal actors or pertinent objects while modeling their interconnections. Taking cues from the way humans disassemble scenes into visual and non-visual constituents, the proposed VL feature characterizes a scene via three distinct modalities: (i) a global visual environment, providing a broad contextual comprehension of the …


Smartphone Based Object Detection For Shark Spotting, Darrick W. Oliver Nov 2023

Master's Theses

Given concern over shark attacks in coastal regions, the recent use of unmanned aerial vehicles (UAVs), or drones, has increased to ensure the safety of beachgoers. However, much of city officials' process remains manual, with drone operation and review of footage still playing a significant role. In pursuit of a more automated solution, researchers have turned to the usage of neural networks to perform detection of sharks and other marine life. For on-device solutions, this has historically required assembling individual hardware components to form an embedded system to utilize the machine learning model. This means that the camera, neural processing …


Hardware-In-The-Loop Reaction Wheel Testbed With Camera Vision, Abigail Romero, Harvey Perkins, Stephen Kwok-Choon Oct 2023

College of Engineering Summer Undergraduate Research Program

Reaction wheels are widely used in aerospace systems as a method of attitude control. This research was focused on the design, development, and testing of a hardware-in-the-loop reaction wheel testbed that can be used for research and teaching applications related to satellite navigation and control. This project successfully utilized commercial off-the-shelf components to develop a reaction wheel capable of controlling the orientation of a freely rotating platform, as well as tracking objects using computer vision.


Advanced Traffic Video Analytics For Robust Traffic Accident Detection, Hadi Ghahremannezhad Aug 2023

Dissertations

Automatic traffic accident detection is an important task in traffic video analysis due to its key applications in developing intelligent transportation systems. Reducing the time delay between the occurrence of an accident and the dispatch of the first responders to the scene may help lower the mortality rate and save lives. Since 1980, many approaches have been presented for the automatic detection of incidents in traffic videos. In this dissertation, some challenging problems for accident detection in traffic videos are discussed and a new framework is presented in order to automatically detect single-vehicle and intersection traffic accidents in real-time.

First, …


Enhancing Human Key Point Identification: A Comparative Study Of High-Resolution Vicon Dataset And Coco Dataset Using Bpnet, Bibash Lama Aug 2023

Masters Theses

Accurately identifying human key points is crucial for various applications, including activity recognition, pose estimation, and gait analysis. This study presents a high-resolution dataset created using the VICON motion capture system and three differently oriented 2D cameras, that can be used to train different neural networks for estimating the 2D key joint positions of the person from the 2D images or videos. The participants in the study included 25 healthy adults (17 males and 8 females) performing normal gait movements for about 2 to 3 seconds. The VICON system captured 3D ground truth data, while the three 2D cameras collected …


Insect Classification And Explainability From Image Data Via Deep Learning Techniques, Tanvir Hossain Bhuiyan Jun 2023

USF Tampa Graduate Theses and Dissertations

Since the dawn of the Industrial Revolution, humanity has always tried to make labor more efficient and automated, and this trend is only continuing in the modern digital age. With the advent of artificial intelligence (AI) techniques in the latter part of the 20th century, the speed and scale with which AI has been leveraged to automate tasks defy human imagination. Many people deeply entrenched in the technology field are genuinely intrigued and concerned about how AI may change many of the ways in which humans have been living for millennia. Only time will provide the answers. This dissertation is …


Experimental Characterization And Computer Vision-Assisted Detection Of Pitting Corrosion On Stainless Steel Structural Members, Riley J. Muehler Jun 2023

Master's Theses

Pitting corrosion is a prevalent form of corrosive damage that can weaken, damage, and initiate failure in corrosion-resistant metallic materials. For instance, 304 stainless steel is commonly utilized in various structures (e.g., miter gates, heat exchangers, and storage tanks), but is prone to failure through pitting corrosion and stress corrosion cracking under mechanical loading, regardless of its high corrosion resistance. In this study, to better understand the pitting corrosion damage development, controlled corrosion experiments were conducted to generate pits on 304 stainless steel specimens with and without mechanical loading. The pit development over time was characterized using a high-resolution laser …


Biologically Inspired Multi-Robot System Based On Wolf Hunting Behavior, Zachary Hinnen, Chance Hamilton, Alfredo Weitzenfeld May 2023

36th Florida Conference on Recent Advances in Robotics

Studies involving the group predator behavior of wolves have inspired multiple robotic architectures to mimic these biological behaviors in their designs and research. In this work, we aim to use robotic systems to mimic wolf packs' single and group behavior. This work extends the original research by Weitzenfeld et al. [7] and evaluates it under a new multi-robot system architecture. The multiple robot architecture includes a 'Prey' pursued by a wolf pack consisting of an 'Alpha' and 'Beta' robotic group. The 'Alpha Wolf' will be the group leader, searching and tracking the 'Prey.' At the same time, the …


A Human-In-The-Loop Robot Grasping System With Grasp Quality Refinement, Tian Tan Mar 2023

USF Tampa Graduate Theses and Dissertations

The goal of this dissertation is to develop a grasping system for assistive robots that can help people with disabilities and the elderly to perform tasks of daily living. In developing this robot grasping system, we maximize its reliability, accuracy, and autonomy. High reliability and accuracy are required for robots to perform tasks around human users and to safely interact with objects that might be fragile or have contents that could spill. High autonomy is desired as users with disabilities are usually not dexterous enough to directly operate the robot. In this dissertation, a human-in-the-loop (HitL) robot grasping system is …


Ai Applications On Planetary Rovers, Alexis David Pascual Mar 2023

Electronic Thesis and Dissertation Repository

The rise in the number of robotic missions to space is paving the way for the use of artificial intelligence and machine learning in the autonomy and augmentation of rover operations. For one, more rovers mean more images, and more images mean more data bandwidth required for downlinking as well as more mental bandwidth for analyzing the images. On the other hand, light-weight, low-powered microrover platforms are being developed to accommodate the drive for planetary exploration. As a result of the mass and power constraints, these microrover platforms will not carry typical navigational instruments like a stereocamera or a laser …


An Adaptive Multiple-Object Tracking Architecture For Long-Duration Videos With Variable Target Density, Joachim Lohn-Jaramillo Jan 2023

Dartmouth College Ph.D Dissertations

Multiple-Object Tracking (MOT) methods are used to detect targets in individual video frames, e.g., vehicles, people, and other objects, and then record each unique target’s path over time. Current state-of-the-art approaches are extremely complex because most rely on extracting and comparing visual features at every frame to track each object. These approaches are geared toward high-difficulty-tracking scenarios, e.g., crowded airports, and require expensive dedicated hardware, e.g., Graphics Processing Units. In hardware-constrained applications, researchers are turning to older, less complex MOT methods, which reveals a serious scalability issue within the state-of-the-art. Crowded environments are a niche application for MOT, i.e., there …


Hard-Hearted Scrolls: A Noninvasive Method For Reading The Herculaneum Papyri, Stephen Parsons Jan 2023

Theses and Dissertations--Computer Science

The Herculaneum scrolls were buried and carbonized by the eruption of Mount Vesuvius in A.D. 79 and represent the only classical library discovered in situ. Charred by the heat of the eruption, the scrolls are extremely fragile. Since their discovery two centuries ago, some scrolls have been physically opened, leading to some textual recovery but also widespread damage. Many other scrolls remain in rolled form, with unknown contents. More recently, various noninvasive methods have been attempted to reveal the hidden contents of these scrolls using advanced imaging. Unfortunately, their complex internal structure and lack of clear ink contrast have prevented …


Lung Cancer Type Classification, Mohit Ramajibhai Ankoliya Dec 2022

Electronic Theses, Projects, and Dissertations

Lung cancer is the third most common cancer in the U.S. This research focuses on classifying lung cancer cells based on their tumor cell, shape, and biological traits in images automatically obtained by passing through the convolutional layers. Additionally, I classify whether the lung cell is adenocarcinoma, large cell carcinoma, squamous cell carcinoma, or normal cell carcinoma. The benefit of this classification is an accurate prognosis, leading to patients receiving proper therapy. The Lung Cancer CT (Computed Tomography) image dataset from Kaggle has been drawn with 1000 CT images of various types of lung cancer. Two state-of-the-art convolutional neural networks (CNNs) …


Cocm: Co-Occurrence-Based Consistency Matching In Domain-Adaptive Segmentation, Siyu Zhu, Yingjie Tian, Fenfen Zhou, Kunlong Bai, Xiaoyu Song Nov 2022

Electrical and Computer Engineering Faculty Publications and Presentations

This paper focuses on domain adaptation in a semantic segmentation task. Traditional methods regard the source domain and the target domain as a whole, and the image matching is determined by random seeds, leading to a low degree of consistency matching between domains and interfering with the reduction in the domain gap. Therefore, we designed a two-step, three-level cascaded domain consistency matching strategy—co-occurrence-based consistency matching (COCM)—in which the two steps are: Step 1, in which we design a matching strategy from the perspective of category existence and filter the sub-image set with the highest degree of matching from the image …


Generative Spatio-Temporal And Multimodal Analysis Of Neonatal Pain, Md Sirajus Salekin Nov 2022

USF Tampa Graduate Theses and Dissertations

Neonates cannot express their pain as an adult can. Due to a lack of proper muscle growth and an inability to express themselves non-verbally, it is difficult to understand their emotional status. In addition, if the neonates are under any treatment or left monitored after any major surgery (post-operative), it is more difficult to understand their pain due to the side effects of medications and the caring system (e.g., intubated, masked face, covered body with blanket, etc.). In a clinical environment, bedside nurses routinely observe the neonate and measure the pain status following a standard clinical pain scale. But current …


Iris Detection Authenticator, Nathan D. Tang, Bryan K. Chau Nov 2022

Electrical Engineering

The development of an iris-based biometric recognition system is presented. Iris recognition differs from other methods because data acquisition is non-physical and is more accessible. It has been proven that the iris does not change as an individual ages and is well protected from external damage by the eyelid and cornea, which act as a shield to the iris. In addition, the iris is almost impossible to forge due to its complex patterns and the current limitations in technology. Using Canny Edge Detection, Hough Transform, rubber-sheet normalization, Histogram of Gradient feature extraction, and the MultiMedia University iris database as our subjects, …


Foreign Object Debris Detection For Airport Pavement Images Based On Self-Supervised Localization And Vision Transformer, Travis Munyer, Daniel Brinkman, Xin Zhong, Chenyu Huang, Iason Konstantzos Oct 2022

Durham School of Architectural Engineering and Construction: Faculty Publications

Supervised object detection methods provide subpar performance when applied to Foreign Object Debris (FOD) detection because FOD could be arbitrary objects according to the Federal Aviation Administration (FAA) specification. Current supervised object detection algorithms require datasets that contain annotated examples of every to-be-detected object. While a large and expensive dataset could be developed to include common FOD examples, it is infeasible to collect all possible FOD examples in the dataset representation because of the open-ended nature of FOD. Limitations of the dataset could cause FOD detection systems driven by those supervised algorithms to miss certain FOD, which can become dangerous …


Softskip: Empowering Multi-Modal Dynamic Pruning For Single-Stage Referring Comprehension, Dulanga Weerakoon, Vigneshwaran Subbaraju, Tuan Tran, Archan Misra Oct 2022

Research Collection School Of Computing and Information Systems

Supporting real-time referring expression comprehension (REC) on pervasive devices is an important capability for human-AI collaborative tasks. Model pruning techniques, applied to DNN models, can enable real-time execution even on resource-constrained devices. However, existing pruning strategies are designed principally for uni-modal applications, and suffer a significant loss of accuracy when applied to REC tasks that require fusion of textual and visual inputs. We thus present a multi-modal pruning model, LGMDP, which uses language as a pivot to dynamically and judiciously select the relevant computational blocks that need to be executed. LGMDP also introduces a new SoftSkip mechanism, whereby 'skipped' visual …


Machine Learning Applications In Plant Identification, Wireless Channel Estimation, And Gain Estimation For Multi-User Software-Defined Radio, Viraj K. Gajjar Aug 2022

Doctoral Dissertations

This work applies machine learning (ML) techniques to selected computer vision and digital communication problems. Machine learning algorithms can be trained to perform a specific task without explicit programming. This research applies ML to the problems of: plant identification from images of leaves, channel state information (CSI) estimation for wireless multiple-input-multiple-output (MIMO) systems, and gain estimation for a multi-user software-defined radio (SDR) application.

In the first task, two methods for plant species identification from leaf images are developed. One of the methods uses hand-crafted features extracted from leaf images to train a support vector machine classifier. The other method combines …


Detection Of Rotorcraft Landing Sites: An Ai-Based Approach, Abdullah Nasir Jul 2022

Theses and Dissertations

Up-to-date information about the location and type of rotorcraft landing sites is an essential asset for the Federal Aviation Administration (FAA) and the Department of Transportation (DOT). However, acquiring, verifying, and regularly updating information about landing sites is not straightforward. The lack of current and correct information about landing sites is a risk factor in several rotorcraft accidents and incidents. The current FAA database of rotorcraft landing sites contains inaccurate and missing entries due to the manual updating process. There is a need for an accurate and automated validation tool to identify landing sites from satellite imagery. This thesis …


Eagle Medical Tray Denesting & Debris Removal Process, Nicholas Allen Ungefug, Noah Chavez, Susana Shu-Lin Okhuysen, Michael Augustine Pennington Jun 2022

Industrial and Manufacturing Engineering

Eagle Medical Incorporated is a contract medical device packaging and sterilization company. The company purchases thermoformed medical packaging trays, which maintain the sterility of medical devices, from various manufacturers. To ensure packaging quality and to prevent cleanroom contamination, Eagle Medical inspects and sterilizes each blister tray that they order. This process is an essential non-value-added activity that creates a bottleneck. Cleanroom employees must stop packaging medical devices and attend to the processing of blister trays and packaging solutions. The blister trays arrive at Eagle’s facility in nested stacks. Vibration and movement during shipping further compresses the stacks, which makes separation …


Automated Segmentation Of The Inner Ear And Round Window In Computed Tomography Scans Using Convolutional Neural Networks, Kyle A. Rioux Apr 2022

Electronic Thesis and Dissertation Repository

Computed tomography (CT) scans are acquired prior to cochlear implant (CI) surgery. Three-dimensional segmentations of the inner ear (IE) and round window (RW) based on clinical CTs can improve the CI procedure. Software pipelines are presented here which employ convolutional neural networks to automatically segment the IE and RW. The first pipeline produces high resolution segmentations of the IE and RW in tightly cropped CTs. Mean IE Dice score and RW centroid error were 0.88, 0.57mm and 0.93, 0.18mm in implanted and non-implanted samples, respectively. The second pipeline automatically segments the IE in large field of view CTs of any …
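
For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of how the metric is commonly computed for two binary segmentation masks. It is an illustration only and not the thesis's code: the function name `dice_score`, the flat 0/1 mask representation, and the convention of returning 1.0 for two empty masks are all assumptions made here.

```python
def dice_score(pred, target):
    """Dice coefficient 2*|A & B| / (|A| + |B|) for two flattened binary masks.

    `pred` and `target` are hypothetical inputs: equal-length sequences of
    0/1 (or False/True) voxel labels, e.g. a flattened CT segmentation.
    """
    pred = [bool(v) for v in pred]
    target = [bool(v) for v in target]
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labeled foreground in both masks.
    overlap = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(round(dice_score(predicted, reference), 3))
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground voxels, which is why values such as 0.88 and 0.93 indicate close agreement with the reference segmentation.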


A Computer Vision-Based Method For Bolt Loosening Detection, Savannah Burdette Apr 2022

Honors Theses

Routine bolt-loosening inspection plays an essential role in managing and preventing the degradation of our nation’s highway bridges over time. Neglecting to perform these inspections could result in public safety concerns. This thesis develops a cost-effective method of bolt-loosening detection based on computer vision. To this end, two input images of the bolted connections are collected at two different inspection times. The feature points are then identified from the input images, based on which a geometric transformation matrix is applied to correct any perspective differences between the two images. Next, we select the image patches of the …


Conditional Variational Autoencoder (Cvae) For The Augmentation Of Ecl Biosensor Data, Matthew Dulcich Apr 2022

Honors Theses

Machine Learning (ML) is vastly improving the world; from computer vision to fully self-driving cars, we are now able to accomplish objectives that were thought to be only dreams. In order to train ML models accurately, they require mountains of information to work with, but sometimes it becomes impossible to collect the data needed, so we turn to data augmentation. In this project we use a conditional variational autoencoder (cVAE) to supplement the original video electrochemiluminescence biosensor dataset, in order to increase the accuracy of a future classification model. In other words, using a cVAE we will create unique realistic videos …


Learning Domain Invariant Information To Enhance Presentation Attack Detection In Visible Face Recognition Systems, Jennifer Hamblin Apr 2022

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Face signatures, including size, shape, texture, skin tone, eye color, appearance, and scars/marks, are widely used as discriminative, biometric information for access control. Despite recent advancements in facial recognition systems, presentation attacks on facial recognition systems have become increasingly sophisticated. The ability to detect presentation attacks or spoofing attempts is a pressing concern for the integrity, security, and trust of facial recognition systems. Multi-spectral imaging has been previously introduced as a way to improve presentation attack detection by utilizing sensors that are sensitive to different regions of the electromagnetic spectrum (e.g., visible, near infrared, long-wave infrared). Although multi-spectral presentation attack …


Humanoid Robot Motion Control For Ramps And Stairs, Tommy Truong Mar 2022

USF Tampa Graduate Theses and Dissertations

Humanoid robot research and development have been an ongoing effort since the 1900s and can be broken down into two problems: a mechanical problem, getting a humanoid robot to move human-like, and a software problem, getting a humanoid robot to behave human-like. These problems of moving and behaving human-like can often be solved using control theory as research advances. For the premise of this research, we explore how to balance and walk on non-flat terrain for the humanoid robot Darwin-Op. Since the focus was on the control theory, the vision control to detect the non-flat terrain was a side objective. The …


Camera And Lidar Fusion For Point Cloud Semantic Segmentation, Ali Abdelkader Jan 2022

Theses and Dissertations

Perception is a fundamental component of any autonomous driving system. Semantic segmentation is the perception task of assigning semantic class labels to sensor inputs. While autonomous driving systems are currently equipped with a suite of sensors, much focus in the literature has been on semantic segmentation of camera images only. The fusion of different sensor modalities for semantic segmentation has been investigated far less. Deep learning models based on transformer architectures have proven successful in many tasks in computer vision and natural language processing. This work explores the use of deep learning transformers to fuse information from …


License Plate Image Quality Enhancement Utilizing Super Resolution Generative Adversarial Networks, Mark Moelter Jan 2022

Electronic Theses and Dissertations

This thesis focuses primarily on enhancing the image quality of blurred license plates through the use of Super-Resolution Generative Adversarial Networks (SRGANs) [1]. We propose a synthetic dataset with an SRGAN model to promote blurred image quality enhancement and to allow for model evaluation on a multitude of image input and output size combinations. SRGAN is mainly used for low-resolution image enhancement, but by heavily blurring the input images, the model is tested on its ability to blindly deblur and upsample images to the desired super-resolution (SR) size. The model enhances the image quality to nearly that of the reference images. The …