Open Access. Powered by Scholars. Published by Universities.®
- Institution
- California Polytechnic State University, San Luis Obispo (11)
- University of Arkansas, Fayetteville (7)
- University of South Florida (7)
- Western University (7)
- University of Nevada, Las Vegas (6)
- Missouri University of Science and Technology (5)
- San Jose State University (5)
- Singapore Management University (4)
- Technological University Dublin (4)
- University of Texas at El Paso (4)
- Utah State University (4)
- Wright State University (4)
- Zayed University (3)
- Air Force Institute of Technology (2)
- Bard College (2)
- Brigham Young University (2)
- City University of New York (CUNY) (2)
- Clemson University (2)
- Fordham University (2)
- Georgia Southern University (2)
- Loyola University Chicago (2)
- Selected Works (2)
- The University of Maine (2)
- University of Kentucky (2)
- University of Massachusetts Boston (2)
- University of Wisconsin Milwaukee (2)
- Western Kentucky University (2)
- Bucknell University (1)
- CCT College Dublin (1)
- Chapman University (1)
- Publication
- Master's Theses (8)
- Electronic Thesis and Dissertation Repository (7)
- Theses and Dissertations (7)
- USF Tampa Graduate Theses and Dissertations (7)
- Electronic Theses and Dissertations (6)
- Graduate Theses and Dissertations (6)
- UNLV Theses, Dissertations, Professional Papers, and Capstones (6)
- Master's Projects (5)
- All Graduate Theses and Dissertations, Spring 1920 to Summer 2023 (4)
- Browse all Theses and Dissertations (4)
- Doctoral Dissertations (4)
- Open Access Theses & Dissertations (4)
- All Works (3)
- Research Collection School Of Computing and Information Systems (3)
- All Theses (2)
- Andrés Corrada-Emmanuel (2)
- Computer Science: Faculty Publications and Other Works (2)
- Conference papers (2)
- Dissertations, Theses, and Capstone Projects (2)
- Faculty Publications (2)
- Graduate Masters Theses (2)
- Honors Theses (2)
- Theses and Dissertations--Computer Science (2)
- Computational and Data Sciences (PhD) Dissertations (1)
- Computer Engineering (1)
- Computer Science Faculty Research & Creative Works (1)
- Computer Science Faculty publications (1)
- Computer Science and Computer Engineering Undergraduate Honors Theses (1)
- Computer Science and Engineering Dissertations (1)
- Computer Science and Software Engineering (1)
Articles 1 - 30 of 125
Full-Text Articles in Entire DC Network
Context In Computer Vision: A Taxonomy, Multi-Stage Integration, And A General Framework, Xuan Wang
Dissertations, Theses, and Capstone Projects
Contextual information has been widely used in many computer vision tasks, such as object detection, video action detection, and image classification. Recognizing a single object or action out of context can sometimes be very challenging, and contextual information may greatly improve the understanding of a scene or an event. However, existing approaches design specific contextual information mechanisms for different detection tasks.
In this research, we first present a comprehensive survey of context understanding in computer vision, with a taxonomy to describe context in different types and levels. We then propose MultiCLU, a new multi-stage context learning and utilization framework, …
Detection And Classification Of Diabetic Retinopathy Using Deep Learning Models, Aishat Olatunji
Electronic Theses and Dissertations
Healthcare analytics leverages extensive patient data for data-driven decision-making, enhancing patient care and results. Diabetic Retinopathy (DR), a complication of diabetes, stems from damage to the retina’s blood vessels. It can affect both type 1 and type 2 diabetes patients. Ophthalmologists employ retinal images for accurate DR diagnosis and severity assessment. Early detection is crucial for preserving vision and minimizing risks. In this context, we utilized a Kaggle dataset containing patient retinal images, employing Python’s versatile tools. Our research focuses on DR detection using deep learning techniques. We used a publicly available dataset to apply our proposed neural network and …
Automatic Image-Based Nutritional Calculator App, Kejvi Cupa
USF Tampa Graduate Theses and Dissertations
Nutrition plays a pivotal role in shaping an individual’s health and quality of life, making the evaluation of dietary intake crucial for promoting healthier lifestyle choices. Various solutions, particularly mobile apps, have been developed to facilitate the process of dietary estimation. Accurate nutritional intake assessment relies on two key components: ingredient recognition and food portion estimation. For a mobile app to offer a comprehensive solution for automatic nutritional assessment, it must address both components.
In this work, we focus on a mobile app pipeline: the semi-automatic pipeline which focuses on automatic food ingredient recognition. This pipeline integrates state-of-the-art models for …
Deep Learning-Based Human Action Understanding In Videos, Elahe Vahdani
Dissertations, Theses, and Capstone Projects
The understanding of human actions in videos holds immense potential for technological advancement and societal betterment. This thesis explores fundamental aspects of this field, including action recognition in trimmed clips and action localization in untrimmed videos. Trimmed videos contain only one action instance, with moments before or after the action excluded from the video. However, the majority of videos captured in unconstrained environments, often referred to as untrimmed videos, are naturally unsegmented. Untrimmed videos are typically lengthy and may encompass multiple action instances, along with the moments preceding or following each action, as well as transitions between actions. In the …
Using Pose Estimation Software To Predict Actions In Sabre Fencing, Micah Edwin Peters II
Honors College Theses
Fencing is a combat sport that uses three different swords: epee, foil, and sabre. Due to its fast-paced nature and employment of right of way, sabre fencing is often considered the most difficult of the three to learn. Computer vision and pose estimation software can be used to lower the barrier of entry to sabre fencing by identifying the different actions in sabre fencing. This project focuses on using open-source software to design a program that can identify the sabre parries as well as the main sabre movements. This program could be used to help newer fencers and spectators better …
Generalized Differentiable Neural Architecture Search With Performance And Stability Improvements, Emily J. Herron
Doctoral Dissertations
This work introduces improvements to the stability and generalizability of Cyclic DARTS (CDARTS). CDARTS is a Differentiable Architecture Search (DARTS)-based approach to neural architecture search (NAS) that uses a cyclic feedback mechanism to train search and evaluation networks concurrently, thereby optimizing the search process by enforcing that the networks produce similar outputs. However, the dissimilarity between the loss functions used by the evaluation networks during the search and retraining phases makes the search-phase evaluation network a sub-optimal proxy for the final evaluation network utilized during retraining. ICDARTS, a revised algorithm that reformulates the search phase loss functions to ensure …
Deep Learning For Photovoltaic Characterization, Adrian Manuel De Luis Garcia
Graduate Theses and Dissertations
This thesis introduces a novel approach to Photovoltaic (PV) installation segmentation by proposing a new architecture to understand and identify PV modules from overhead imagery. Pivotal to this concept is the creation of a new Transformer-based network, S3Former, which focuses on small object characterization and modelling intra- and inter-object differentiation inside an image. Accurate mapping of PV installations is pivotal for understanding their adoption and guiding energy policy decisions. Drawing insights from current Deep Learning methodologies for image segmentation and building upon State-of-the-Art (SOTA) techniques in solar cell mapping, this work puts forth S3Former with the following enhancements: 1. …
Developing Detection And Mapping Of Roads Within Various Forms Of Media Using OpenCV, Jordan C. Lyle
Computer Science and Computer Engineering Undergraduate Honors Theses
OpenCV, and Computer Vision in general, has been a Computer Science topic that has interested me for a long time while completing my Bachelor’s degree at the University of Arkansas. As a result of this, I ended up choosing to utilize OpenCV in order to complete the task of detecting road-lines and mapping roads when given a wide variety of images. The purpose of my Honors research and this thesis is to detail the process of creating an algorithm to detect the road-lines such that the results are effective and instantaneous, as well as detail how Computer Vision can be …
Weakly-Supervised Semantic Segmentation, Zhaozheng Chen
Dissertations and Theses Collection (Open Access)
Semantic segmentation is a fundamental task in computer vision that assigns a label to every pixel in an image based on the semantic meaning of the objects present. It demands a large amount of pixel-level labeled images for training deep models. Weakly-supervised semantic segmentation (WSSS) is a more feasible approach that uses only weak annotations to learn the segmentation task. Image-level label based WSSS is the most challenging and popular, where only the class label for the entire image is provided as supervision. To address this challenge, Class Activation Map (CAM) has emerged as a powerful technique in WSSS. CAM …
Smartphone Based Object Detection For Shark Spotting, Darrick W. Oliver
Master's Theses
Given concern over shark attacks in coastal regions, the recent use of unmanned aerial vehicles (UAVs), or drones, has increased to ensure the safety of beachgoers. However, much of city officials' process remains manual, with drone operation and review of footage still playing a significant role. In pursuit of a more automated solution, researchers have turned to the usage of neural networks to perform detection of sharks and other marine life. For on-device solutions, this has historically required assembling individual hardware components to form an embedded system to utilize the machine learning model. This means that the camera, neural processing …
Emotion-Aware Music Recommendation, Hieu Tran, Tuan Le, Anh Do, Tram Vu, Steven Bogaerts, Brian T. Howard
Computer Science Faculty publications
It is common to listen to songs that match one's mood. Thus, an AI music recommendation system that is aware of the user's emotions is likely to provide a superior user experience to one that is unaware. In this paper, we present an emotion-aware music recommendation system. Multiple models are discussed and evaluated for affect identification from a live image of the user. We propose two models: DRViT, which applies dynamic routing to vision transformers, and InvNet50, which uses involution. All considered models are trained and evaluated on the AffectNet dataset. Each model outputs the user's estimated valence and arousal …
Towards Multi-Modal Explainable Video Understanding, Kashu Yamazaki
Graduate Theses and Dissertations
This thesis presents a novel approach to video understanding by emulating human perceptual processes and creating an explainable and coherent storytelling representation of video content. Central to this approach is the development of a Visual-Linguistic (VL) feature for an interpretable video representation and the creation of a Transformer-in-Transformer (TinT) decoder for modeling intra- and inter-event coherence in a video. Drawing inspiration from the way humans comprehend scenes by breaking them down into visual and non-visual components, the proposed VL feature models a scene through three distinct modalities. These include: (i) a global visual environment, providing a broad contextual understanding of …
Decoding The Underlying Meaning Of Multimodal Hateful Memes, Ming Shan Hee, Wen Haw Chong, Roy Ka-Wei Lee
Research Collection School Of Computing and Information Systems
Recent studies have proposed models that yielded promising performance for the hateful meme classification task. Nevertheless, these proposed models do not generate interpretable explanations that uncover the underlying meaning and support the classification output. A major reason for the lack of explainable hateful meme methods is the absence of a hateful meme dataset that contains ground truth explanations for benchmarking or training. Intuitively, having such explanations can educate and assist content moderators in interpreting and removing flagged hateful memes. This paper addresses this research gap by introducing Hateful meme with Reasons Dataset (HatReD), which is a new multimodal hateful meme …
Insect Classification And Explainability From Image Data Via Deep Learning Techniques, Tanvir Hossain Bhuiyan
USF Tampa Graduate Theses and Dissertations
Since the dawn of the Industrial Revolution, humanity has always tried to make labor more efficient and automated, and this trend is only continuing in the modern digital age. With the advent of artificial intelligence (AI) techniques in the latter part of the 20th century, the speed and scale with which AI has been leveraged to automate tasks defy human imagination. Many people deeply entrenched in the technology field are genuinely intrigued and concerned about how AI may change many of the ways in which humans have been living for millennia. Only time will provide the answers. This dissertation is …
Automatic Identification Of Jetting Behavior In 3D Printing With Binary Classification And Anomaly Detection, Alexander Chandy
Honors Scholar Theses
Consistently jetting different materials from the print head of a 3D printer is a key, yet challenging task in manufacturing processes. By using active machine learning, we can efficiently predict complex diagrams that illustrate the region of printing conditions under which “desirable jetting”, “jetting”, and “no jetting” of ink occurs for different substances. However, labeling the images of printed ink droplets that are fed to the active learning model can be time intensive. Therefore, it is ideal to use computer vision to automate the classification of this image data. This classification can be broken down into two steps. In the …
Vision-Based Object Manipulation For Activities Of Daily Living (ADL) Assistance Using Assistive Robot, Md Tanzil Shahria
Theses and Dissertations
Upper and lower extremity (ULE) functional deficiencies, which limit a person's ability to perform everyday tasks, have increased at an alarming rate over the past few decades. It is essential for individuals with impairments to take care of themselves without requiring a significant amount of support from other individuals. Few assistive devices are available in the market to make their life comfortable, yet controlling them sometimes becomes challenging for this group of people. Robotic devices are emerging as assistive devices to assist individuals with limited ULE functionalities in activities of daily living (ADL). As most of these devices only allow …
Prototyping A 3D Reconstruction Of Fly Photoreceptors: From Traditional Computer Vision Techniques To 3D Visualization, Kunal Jain
Graduate Masters Theses
The Drosophila fruit fly is a well-established model organism for studying vision and neural perception. In this research, we focused on segmenting and analyzing photoreceptor cells in the Drosophila retina, specifically rhabdomeres. To facilitate our study, we acquired a comprehensive 3D dataset of Drosophila retina EM images and developed an automated segmentation pipeline using computer vision techniques. We evaluated the performance of our pipeline using automated metrics on Wild Type CS 1 and Nina D1 Mutant Drosophila datasets and prototyped a 3D model of segmented cells. The generated segmentation masks and 3D models of photoreceptor cells contribute to a better …
AI Applications On Planetary Rovers, Alexis David Pascual
Electronic Thesis and Dissertation Repository
The rise in the number of robotic missions to space is paving the way for the use of artificial intelligence and machine learning in the autonomy and augmentation of rover operations. For one, more rovers mean more images, and more images mean more data bandwidth required for downlinking as well as more mental bandwidth for analyzing the images. On the other hand, light-weight, low-powered microrover platforms are being developed to accommodate the drive for planetary exploration. As a result of the mass and power constraints, these microrover platforms will not carry typical navigational instruments like a stereocamera or a laser …
Reducing Negative Transfer Of Random Data In Source-Free Unsupervised Domain Adaptation, Anthony Wong
Electronic Thesis and Dissertation Repository
In domain adaptation, a model trained on one dataset (source domain) is applied to a different but related dataset (target domain). The most cutting-edge method is unsupervised source-free domain adaptation (SFDA), in which source data, source labels, and target labels are not available during adaptation. This thesis explores a realistic scenario where the target dataset includes some images that are unrelated to the adaptation process. This scenario can occur from errors in data collection or processing. We provide experiments and analysis to show that current state-of-the-art (SOTA) SFDA methods suffer significant performance drops under a specific domain adaptation setup when …
Bias Detector Tool For Face Datasets Using Image Recognition, Jatin Vamshi Battu
Master's Projects
Computer Vision has been quickly transforming the way we live and work. One of its sub-domains, Facial Recognition, has also been advancing at a rapid pace. However, the development of machine learning models that power these systems has been marred by social biases, which open the door to various societal issues. The objective of this project is to address these issues and ensure that computer vision systems are unbiased and fair to all individuals. To achieve this, we have created a web tool that uses three image classifiers (implemented using CNNs) to classify images into categories based on …
Computer Vision In Adverse Conditions: Small Objects, Low-Resolution Images, And Edge Deployment, Raja Sunkara
Masters Theses
Computer vision based on deep learning is an essential field that plays a significant role in object detection, image classification, semantic segmentation, instance segmentation, and other applications. However, these models face significant challenges in adverse conditions, such as small objects, low-resolution images, and edge deployment. These challenges limit the accuracy and efficiency of computer vision algorithms, making it difficult to obtain reliable results.
The primary objective of this thesis is to assess the performance of deep learning-based computer vision models in challenging conditions and provide viable solutions to overcome the obstacles. The study will specifically address three key challenges, …
Region Detection & Segmentation Of Nissl-Stained Rat Brain Tissue, Alexandro Arnal
Open Access Theses & Dissertations
People who analyze images of biological tissue rely on the segmentation of structures as a preliminary step. In particular, laboratories studying the rat brain delineate brain regions to position scientific findings on a brain atlas to propose hypotheses about the rat brain and, ultimately, the human brain. Our work intersects with the preliminary step of delineating regions in images of brain tissue via computational methods.
We investigate pixel-wise classification or segmentation of brain regions using ten histological images of brain tissue sections stained for Nissl substance. We present a deep learning approach that uses the fully convolutional neural network, U-Net, …
Performance Enhancement Of Hyperspectral Semantic Segmentation Leveraging Ensemble Networks, Nicholas Soucy
Electronic Theses and Dissertations
Hyperspectral image (HSI) semantic segmentation is a growing field within computer vision, machine learning, and forestry. Due to the separate nature of these communities, research applying deep learning techniques to ground-type semantic segmentation needs improvement, along with working to bring the research and expectations of these three communities together. Semantic segmentation consists of classifying individual pixels within the image based on the features present. Many issues need to be resolved in HSI semantic segmentation including data preprocessing, feature reduction, semantic segmentation techniques, and adversarial training. In this thesis, we tackle these challenges by employing ensemble methods for HSI semantic segmentation. …
Real–Time Semantic Segmentation For Railway Anomalies Analysis, Paul Stanik III
UNLV Theses, Dissertations, Professional Papers, and Capstones
In the past few years, computer vision has made huge jumps due to deep learning which leverages increased computational power and access to data. The computer vision community has also embraced transparency to accelerate research progress by sharing open datasets and open source code. Access to large scale datasets and benchmark challenges propelled and opened the field. The autonomous vehicle community is a prime example. While there has been significant growth in the automotive vision community, not much has been done in the rail domain. Traditional rail inspection methods require special trains that are run during down time, have sensitive …
Generative Spatio-Temporal And Multimodal Analysis Of Neonatal Pain, Md Sirajus Salekin
USF Tampa Graduate Theses and Dissertations
Neonates cannot express their pain as adults do. Due to their lack of proper muscle growth and their inability to express themselves non-verbally, it is difficult to understand their emotional status. In addition, if neonates are undergoing treatment or being monitored after major surgery (post-operative), it is even more difficult to understand their pain because of the side effects of medications and the care setup (e.g., intubation, a masked face, a body covered with a blanket). In a clinical environment, bedside nurses routinely observe the neonate and measure the pain status using a standard clinical pain scale. But current …
SoftSkip: Empowering Multi-Modal Dynamic Pruning For Single-Stage Referring Comprehension, Dulanga Weerakoon, Vigneshwaran Subbaraju, Tuan Tran, Archan Misra
Research Collection School Of Computing and Information Systems
Supporting real-time referring expression comprehension (REC) on pervasive devices is an important capability for human-AI collaborative tasks. Model pruning techniques, applied to DNN models, can enable real-time execution even on resource-constrained devices. However, existing pruning strategies are designed principally for uni-modal applications, and suffer a significant loss of accuracy when applied to REC tasks that require fusion of textual and visual inputs. We thus present a multi-modal pruning model, LGMDP, which uses language as a pivot to dynamically and judiciously select the relevant computational blocks that need to be executed. LGMDP also introduces a new SoftSkip mechanism, whereby 'skipped' visual …
Semantic-Aligned Matching For Enhanced DETR Convergence And Multi-Scale Feature Fusion, Gongjie Zhang, Zhipeng Luo, Yingchen Yu, Jiaxing Huang, Kaiwen Cui, Shijian Lu, Eric Xing
Machine Learning Faculty Publications
The recently proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection. However, DETR suffers from slow training convergence, which hinders its applicability to various detection tasks. We observe that DETR's slow convergence is largely attributed to the difficulty in matching object queries to relevant regions due to the unaligned semantics between object queries and encoded image features. With this observation, we design Semantic-Aligned-Matching DETR++ (SAM-DETR++) to accelerate DETR's convergence and improve detection performance. The core of SAM-DETR++ is a plug-and-play module that projects object queries and encoded image features into the same feature embedding space, where …
Impact Of Movements On Facial Expression Recognition, Zhebin Yin
Honors Theses
The ability to recognize human emotions can be a useful skill for robots. Emotion recognition can help robots understand our responses to robot movements and actions. Human emotions can be recognized through facial expressions. Facial Expression Recognition (FER) is a well-established research area; however, the majority of prior research is based on static datasets of images. With robots, often the subject is moving, the robot is moving, or both. The purpose of this research is to determine the impact of movement on facial expression recognition. We apply a pre-existing model for FER, which performs around 70.86% on a given …
Gesture Recognition Using Neural Networks, Ashwini Kurady
Master's Projects
Advances in technology have brought many changes to the way humans go about their lives. This has enhanced the significance of Artificial Neural Networks and Computer Vision-based interactions with the world. Gesture Recognition is one of the major focus areas in Computer Vision. This involves Human Computer Interfaces (HCI) that would capture and understand human actions. In this project, we will explore how Neural Network concepts can be applied in this challenging field of Computer Vision. By leveraging the latest research for Gesture Recognition, we researched how to capture the movement across different frames …
License Plate Image Quality Enhancement Utilizing Super Resolution Generative Adversarial Networks, Mark Moelter
Electronic Theses and Dissertations
This thesis focuses primarily on enhancing the image quality of blurred license plates through the use of Super-Resolution Generative Adversarial Networks (SRGANs) [1]. We propose a synthetic dataset with an SRGAN model to promote blurred image quality enhancement and to allow for model evaluation on a multitude of image input and output size combinations. SRGAN is mainly used for low-resolution image enhancement, but by heavily blurring the input images, the model is tested on its ability to blindly deblur and upsample images to the desired super-resolution (SR) size. The model enhances the image quality to nearly that of the reference images. The …