Open Access. Powered by Scholars. Published by Universities.®

Computer Vision

Articles 1 - 30 of 97

Full-Text Articles in Physical Sciences and Mathematics

Star-Based Reachability Analysis Of Binary Neural Networks On Continuous Input, Mykhailo Ivashchenko May 2024

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Deep Neural Networks (DNNs) have become a popular instrument for solving various real-world problems. DNNs’ sophisticated structure allows them to learn complex representations and features. However, their architectural complexity and reliance on floating-point arithmetic increase computational cost. For this reason, a more lightweight type of neural network, the Binary Neural Network (BNN), is widely used on edge devices such as microcomputers and microcontrollers. Like other DNNs, BNNs are vulnerable to adversarial attacks; even a small perturbation of the input set may lead to an errant output. Unfortunately, only a few approaches have been proposed for verifying …
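
The verification approach described here is built on star sets; as a rough, simplified illustration of reachability-style reasoning over a continuous input set, the sketch below propagates interval bounds (a looser abstraction than star sets) through a single binarized layer. The weights, bias, and input box are arbitrary example values, not anything from the thesis.

```python
import numpy as np

# Simplified sketch: interval bound propagation through one binarized linear
# layer followed by a sign activation. The thesis uses star sets, which are
# tighter than intervals; this only illustrates the idea of bounding every
# output reachable from a continuous input set.

rng = np.random.default_rng(0)
W = rng.choice([-1.0, 1.0], size=(4, 3))   # binarized weights (+1/-1), example only
b = rng.normal(size=4)

# Continuous input set: a box [lo, hi] around a nominal input (assumed values).
lo = np.array([0.1, -0.2, 0.3]) - 0.05
hi = np.array([0.1, -0.2, 0.3]) + 0.05

# Interval arithmetic for y = Wx + b: split W into positive and negative parts.
W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
y_lo = W_pos @ lo + W_neg @ hi + b
y_hi = W_pos @ hi + W_neg @ lo + b

# Sign activation applied to the bounds; an interval straddling zero means
# both -1 and +1 remain reachable outputs for that neuron.
out_lo, out_hi = np.sign(y_lo), np.sign(y_hi)
print("pre-activation bounds:", list(zip(y_lo.round(3), y_hi.round(3))))
print("reachable sign outputs:", list(zip(out_lo, out_hi)))
```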


Rescape: Transforming Coral-Reefscape Images For Quantitative Analysis, Zachary Ferris, Eraldo Ribeiro, Tomofumi Nagata, Robert Van Woesik Apr 2024

Ocean Engineering and Marine Sciences Faculty Publications

Ever since the first image of a coral reef was captured in 1885, people worldwide have been accumulating images of coral reefscapes that document the historic conditions of reefs. However, these innumerable reefscape images suffer from perspective distortion, which reduces the apparent size of distant taxa, rendering the images unusable for quantitative analysis of reef conditions. Here we solve this century-long distortion problem by developing a novel computer-vision algorithm, ReScape, which removes the perspective distortion from reefscape images by transforming them into top-down views, making them usable for quantitative analysis of reef conditions. In doing so, we demonstrate the …
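
ReScape's full pipeline is not reproduced here; the snippet below is only a minimal sketch of the underlying idea of removing perspective distortion, warping an image to a top-down view with a homography computed from four manually chosen reference points. The file name and pixel coordinates are hypothetical.

```python
import cv2
import numpy as np

# Minimal sketch of perspective correction: map four points assumed to form a
# square on the reef plane (hypothetical pixel coordinates) onto a square in
# the output, then warp the whole image. ReScape automates this step; here the
# correspondences are hand-picked.

image = cv2.imread("reefscape.jpg")          # hypothetical input image

src = np.float32([[420, 610], [880, 600],    # hand-picked points in the photo
                  [980, 940], [330, 950]])
side = 500                                    # output square size in pixels
dst = np.float32([[0, 0], [side, 0], [side, side], [0, side]])

H = cv2.getPerspectiveTransform(src, dst)     # 3x3 homography
top_down = cv2.warpPerspective(image, H, (side, side))
cv2.imwrite("reefscape_topdown.jpg", top_down)
```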


A Computer Vision Solution To Cross-Cultural Food Image Classification And Nutrition Logging​, Rohan Sethi, George K. Thiruvathukal Apr 2024

Computer Science: Faculty Publications and Other Works

The US is a culturally and ethnically diverse country, and with this diversity comes a myriad of cuisines and eating habits that extend well beyond those of Western culture. Each of these meals has its own benefits and drawbacks in terms of nutritional value and potential impact on human health. Thus, there is a growing need for people to access the nutritional profile of their diverse daily meals and better manage their health. A revolutionary solution for democratizing food image classification and nutritional logging is using deep learning to extract that information from …


Relative Vectoring Using Dual Object Detection For Autonomous Aerial Refueling, Derek B. Worth, Jeffrey L. Choate, James Lynch, Scott L. Nykl, Clark N. Taylor Mar 2024

Faculty Publications

Once realized, autonomous aerial refueling will revolutionize unmanned aviation by removing current range and endurance limitations. Previous attempts at establishing vision-based solutions have come close but rely heavily on near-perfect extrinsic camera calibrations that often change midflight. In this paper, we propose dual object detection, a technique that overcomes this requirement by transforming aerial refueling imagery directly into probe-to-drogue vectors in the receiver aircraft's reference frame, regardless of camera position and orientation. These vectors are precisely what autonomous agents need to successfully maneuver the tanker and receiver aircraft in synchronous flight during refueling operations. Our method follows a common 4-stage process …
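
The dual-object-detection pipeline itself is not shown here; the sketch below only illustrates the final geometric step implied by the abstract, turning two image-space detections plus depth estimates into a probe-to-drogue vector via the pinhole camera model. The intrinsics, pixel centers, and depths are assumed example values, not data from the paper.

```python
import numpy as np

# Sketch: back-project the detected probe and drogue centers into 3D camera
# coordinates with the pinhole model, then form the probe-to-drogue vector.
# All numeric values below are assumptions for illustration; the actual
# system derives its quantities from the dual object detections.

fx, fy, cx, cy = 1400.0, 1400.0, 960.0, 540.0   # assumed camera intrinsics

def backproject(u, v, depth):
    """Pixel (u, v) at a given depth (meters) -> 3D point in the camera frame."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.array([x, y, depth])

probe_px, probe_depth = (1010.0, 620.0), 4.2     # example detection + depth
drogue_px, drogue_depth = (930.0, 480.0), 6.8

probe_xyz = backproject(*probe_px, probe_depth)
drogue_xyz = backproject(*drogue_px, drogue_depth)

probe_to_drogue = drogue_xyz - probe_xyz         # relative vector in meters
print("probe-to-drogue vector:", probe_to_drogue.round(2))
print("separation distance:", np.linalg.norm(probe_to_drogue).round(2), "m")
```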


Automatic Classification Of Activities In Classroom Videos, Jonathan K. Foster, Matthew Korban, Peter Youngs, Ginger S. Watson, Scott T. Acton Jan 2024

VMASC Publications

Classroom videos are a common source of data for educational researchers studying classroom interactions as well as a resource for teacher education and professional development. Over the last several decades, emerging technologies have been applied to classroom videos to record, transcribe, and analyze classroom interactions. With the rise of machine learning, we report on the development and validation of neural networks to classify instructional activities using video signals, without analyzing speech or audio features, from a large corpus of nearly 250 hours of classroom videos from elementary mathematics and English language arts instruction. Results indicated that the neural networks performed …


A Survey On Few-Shot Class-Incremental Learning, Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Xin Ning, Prayag Tiwari Jan 2024

Computer Science Faculty Publications

Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily lead to catastrophic forgetting and overfitting problems, severely affecting model performance. Studying FSCIL helps overcome deep learning models' limitations on data volume and acquisition time, while improving the practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental …


Enhanced Privacy-Enabled Face Recognition Using Κ-Identity Optimization, Ryan Karl Dec 2023

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Facial recognition is becoming increasingly prevalent in everyday life. Law enforcement uses facial recognition to find and track suspects. The newest smartphones can be unlocked with the user's face. Some door locks use facial recognition to allow authorized users to enter restricted spaces. The list of applications that use facial recognition will only grow as hardware becomes more cost-effective and more computationally powerful. As this technology becomes more prevalent in our lives, it is important to understand and protect the data provided to these companies. Any data transmitted should be encrypted …


Object Recognition With Deep Neural Networks In Low-End Systems, Lillian Davis Oct 2023

Mahurin Honors College Capstone Experience/Thesis Projects

Object recognition is an important area of computer vision. It has been advanced significantly by deep learning, which unifies feature extraction and classification. In general, deep neural networks, such as Convolutional Neural Networks (CNNs), are trained on high-performance systems. Aiming to extend the reach of deep learning to personal computing, I propose a study of deep learning-based object recognition in low-end systems, such as laptops. This research examines how differing layer configurations and hyperparameter values used in CNNs can either create or resolve the issue of overfitting and affect the final accuracy of object recognition systems. The main contribution …
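
As a minimal illustration of the kind of configuration sweep the thesis describes (not its actual models), the sketch below defines a small PyTorch CNN whose width and dropout rate can be varied to study overfitting on laptop-class hardware; the class count and input size are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a small CNN whose capacity (`width`) and
# regularization (`dropout`) can be swept to observe their effect on
# overfitting and final accuracy. Values here are arbitrary examples.

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10, width=32, dropout=0.3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(width, width * 2, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(dropout),                       # regularization knob
            nn.Linear(width * 2 * 8 * 8, num_classes)  # assumes 32x32 inputs
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SmallCNN()
logits = model(torch.randn(4, 3, 32, 32))   # e.g., a CIFAR-10-sized batch
print(logits.shape)                          # torch.Size([4, 10])
```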


Pymaivar: An Open-Source Python Suit For Audio-Image Representation In Human Action Recognition, Muhammad B. Shaikh, Douglas Chai, Syed M. S. Islam, Naveed Akhtar Sep 2023

Research outputs 2022 to 2026

We present PyMAiVAR, a versatile toolbox that encompasses the generation of image representations for audio data including Wave plots, Spectral Centroids, Spectral Roll Offs, Mel Frequency Cepstral Coefficients (MFCC), MFCC Feature Scaling, and Chromagrams. This wide-ranging toolkit generates rich audio-image representations, playing a pivotal role in reshaping human action recognition. By fully exploiting audio data's latent potential, PyMAiVAR stands as a significant advancement in the field. The package is implemented in Python and can be used across different operating systems.
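
PyMAiVAR's own API is not documented in this abstract, so the sketch below shows how comparable audio-image representations (MFCCs and a chromagram) could be produced with librosa and matplotlib; the audio file name and plotting choices are assumptions, not the package's interface.

```python
import librosa
import librosa.display
import matplotlib.pyplot as plt

# Sketch of audio-to-image representations similar to those PyMAiVAR
# generates (MFCCs and a chromagram), using librosa rather than the
# package's own API. "clip.wav" is a hypothetical audio file.

y, sr = librosa.load("clip.wav", sr=None)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
librosa.display.specshow(mfcc, sr=sr, x_axis="time", ax=axes[0])
axes[0].set_title("MFCC")
librosa.display.specshow(chroma, sr=sr, x_axis="time", y_axis="chroma", ax=axes[1])
axes[1].set_title("Chromagram")
fig.savefig("clip_audio_image.png", dpi=150)   # image usable as CNN input
```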


Multiclass Confidence And Localization Calibration For Object Detection, Bimsara Pathiraja, Malitha Gunawardhana, Muhammad Haris Khan Aug 2023

Computer Vision Faculty Publications

Despite achieving high predictive accuracy across many challenging computer vision problems, recent studies suggest that deep neural networks (DNNs) tend to make over-confident predictions, rendering them poorly calibrated. Most existing attempts to improve DNN calibration are limited to classification tasks and restricted to calibrating in-domain predictions. Surprisingly, very few attempts have been made to study the calibration of object detection methods, which occupy a pivotal place in vision-based security-sensitive and safety-critical applications. In this paper, we propose a new train-time technique for calibrating modern object detection methods. It is capable of jointly calibrating multiclass confidence and …
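
The paper's train-time calibration technique is not reproduced here; the sketch below only shows how miscalibration is commonly quantified, via a binned expected calibration error (ECE) over detection confidences, with made-up scores for illustration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: average |accuracy - confidence| per bin,
    weighted by the fraction of samples falling in that bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# Example: over-confident detections (high scores, lower actual precision).
scores = [0.95, 0.90, 0.92, 0.88, 0.97, 0.91]
hits   = [1,    0,    1,    0,    1,    0   ]   # whether each detection was correct
print(f"ECE = {expected_calibration_error(scores, hits):.3f}")
```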


3d-Aware Multi-Class Image-To-Image Translation With Nerfs, Senmao Li, Joost Van De Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang Aug 2023

Computer Vision Faculty Publications

Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However, no prior work investigates 3D-aware GANs for 3D-consistent multiclass image-to-image (3D-aware I2I) translation. Naively using 2D-I2I translation methods suffers from unrealistic shape/identity changes. To perform 3D-aware multiclass I2I translation, we decouple this learning process into a multiclass 3D-aware GAN step and a 3D-aware I2I translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multiclass 3D-aware GAN architecture, which preserves view-consistency, we …


Discriminative Co-Saliency And Background Mining Transformer For Co-Salient Object Detection, Long Li, Junwei Han, Ni Zhang, Nian Liu, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan Aug 2023

Computer Vision Faculty Publications

Most previous co-salient object detection works mainly focus on extracting co-salient cues via mining the consistency relations across images while ignoring explicit exploration of background regions. In this paper, we propose a Discriminative co-saliency and background Mining Transformer framework (DMT) based on several economical multi-grained correlation modules to explicitly mine both co-saliency and background information and effectively model their discrimination. Specifically, we first propose a region-to-region correlation module for introducing inter-image relations to pixel-wise segmentation features while maintaining computational efficiency. Then, we use two types of pre-defined tokens to mine co-saliency and background information via our proposed contrast-induced pixel-to-token correlation …


Smartbrush: Text And Shape Guided Object Inpainting With Diffusion Model, Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun Zhang Aug 2023

Machine Learning Faculty Publications

Generic image inpainting aims to complete a corrupted image by borrowing surrounding information, which barely generates novel content. By contrast, multi-modal inpainting provides more flexible and useful controls on the inpainted content, e.g., a text prompt can be used to describe an object with richer attributes, and a mask can be used to constrain the shape of the inpainted object rather than being only considered as a missing area. We propose a new diffusion-based model named SmartBrush for completing a missing region with an object using both text and shape guidance. While previous work such as DALLE-2 and Stable Diffusion can …


Accuracy Vs. Energy: An Assessment Of Bee Object Inference In Videos From On-Hive Video Loggers With Yolov3, Yolov4-Tiny, And Yolov7-Tiny, Vladimir A. Kulyukin, Aleksey V. Kulyukin Jul 2023

Computer Science Faculty and Staff Publications

A continuing trend in precision apiculture is to use computer vision methods to quantify characteristics of bee traffic in managed colonies at the hive's entrance. Since traffic at the hive's entrance is a contributing factor to the hive's productivity and health, we assessed the potential of three open-source convolutional network models, YOLOv3, YOLOv4-tiny, and YOLOv7-tiny, to quantify omnidirectional traffic in videos from on-hive video loggers on regular, unmodified one- and two-super Langstroth hives and compared their accuracies, energy efficacies, and operational energy footprints. We trained and tested the models with a 70/30 split on a dataset of 23,173 flying bees …
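
The authors' training data and evaluation pipeline are not reproduced here; the sketch below shows one plausible way to count per-frame bee detections with a YOLOv4-tiny model through OpenCV's DNN module. The configuration, weight, and video file names are hypothetical.

```python
import cv2

# Sketch: count bee detections per frame of a hive-entrance video with a
# YOLOv4-tiny model via OpenCV's DNN module. The .cfg/.weights/.mp4 file
# names are hypothetical placeholders, not the authors' released assets.

model = cv2.dnn_DetectionModel("yolov4-tiny-bees.cfg", "yolov4-tiny-bees.weights")
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

cap = cv2.VideoCapture("hive_entrance.mp4")
counts = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    class_ids, scores, boxes = model.detect(frame, confThreshold=0.4, nmsThreshold=0.4)
    counts.append(len(boxes))          # detected bees in this frame
cap.release()

if counts:
    print(f"frames: {len(counts)}, mean bees/frame: {sum(counts) / len(counts):.2f}")
```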


Fine-Grained Domain Adaptive Crowd Counting Via Point-Derived Segmentation, Yongtuo Liu, Dan Xu, Sucheng Ren, Hanjie Wu, Hongmin Cai, Shengfeng He Jul 2023

Research Collection School Of Computing and Information Systems

Due to domain shift, a large performance drop is usually observed when a trained crowd counting model is deployed in the wild. While existing domain-adaptive crowd counting methods achieve promising results, they typically regard each crowd image as a whole and reduce domain discrepancies in a holistic manner, thus limiting further improvement of domain adaptation performance. To this end, we propose to untangle domain-invariant crowd and domain-specific background from crowd images and design a fine-grained domain adaption method for crowd counting. Specifically, to disentangle crowd from background, we propose to learn crowd segmentation from point-level crowd counting annotations in a …


A Novel Driver Emotion Recognition System Based On Deep Ensemble Classification, Khalid Zaman, Sun Zhaoyun, Babar Shah, Tariq Hussain, Sayyed Mudassar Shah, Farman Ali, Umer Sadiq Khan Jun 2023

All Works

Driver emotion classification is an important topic that can raise awareness of driving habits because many drivers are overconfident and unaware of their bad driving habits. Drivers will acquire insight into their poor driving behaviors and be better able to avoid future accidents if their behavior is automatically identified. In this paper, we use different models such as convolutional neural networks, recurrent neural networks, and multi-layer perceptron classification models to construct an ensemble convolutional neural network-based enhanced driver facial expression recognition model. First, the faces of the drivers are discovered using the faster region-based convolutional neural network (R-CNN) model, which …


Curricular Contrastive Regularization For Physics-Aware Single Image Dehazing, Yu Zheng, Jiahui Zhan, Shengfeng He, Yong Du Jun 2023

Research Collection School Of Computing and Information Systems

Considering the ill-posed nature of single image dehazing, contrastive regularization has been developed, introducing information from negative images as a lower bound. However, the contrastive samples are non-consensual, as the negatives are usually represented far from the clear (i.e., positive) image, leaving the solution space still under-constrained. Moreover, the interpretability of deep dehazing models, with respect to the physics of the hazing process, remains underexplored. In this paper, we propose a novel curricular contrastive regularization targeted at a consensual contrastive space as opposed to a non-consensual one. Our negatives, which provide better lower-bound constraints, can be assembled from 1) the hazy …
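
For context, the "physics of the hazing process" referred to in dehazing work is usually the standard atmospheric scattering model; stating it here is an editorial addition, not a formula taken from this paper.

```latex
% Standard atmospheric scattering model assumed in single image dehazing:
\[
  I(x) = J(x)\,t(x) + A\bigl(1 - t(x)\bigr), \qquad t(x) = e^{-\beta d(x)},
\]
% where I is the observed hazy image, J the clear scene radiance,
% A the global atmospheric light, t the transmission map,
% d the scene depth, and \beta the scattering coefficient.
```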


Tree-Based Unidirectional Neural Networks For Low-Power Computer Vision, Abhinav Goel, Caleb Tung, Nick Eliopoulos, Amy Wang, Jamie C. Davis, George K. Thiruvathukal, Yung-Hsiang Lu Jun 2023

Computer Science: Faculty Publications and Other Works

This article describes the novel Tree-based Unidirectional Neural Network (TRUNK) architecture. This architecture improves computer vision efficiency by using a hierarchy of multiple shallow Convolutional Neural Networks (CNNs), instead of a single very deep CNN. We demonstrate this architecture’s versatility in performing different computer vision tasks efficiently on embedded devices. Across various computer vision tasks, the TRUNK architecture consumes 65% less energy and requires 50% less memory than representative low-power CNN architectures, e.g., MobileNet v2, when deployed on the NVIDIA Jetson Nano.
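
The released TRUNK implementation is not shown here; as a rough, assumption-laden sketch of the tree-of-shallow-CNNs idea, the snippet below routes each input through a small root network to one of two leaf networks, so only one shallow branch runs per image. Group counts, class counts, and layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

# Illustrative sketch of the tree-of-shallow-CNNs idea (not the TRUNK code):
# a root network routes each image to one of two coarse groups, and a small
# leaf network for that group produces the final class scores.

def shallow_cnn(out_dim):
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        nn.Flatten(), nn.Linear(16 * 4 * 4, out_dim),
    )

class TrunkLike(nn.Module):
    def __init__(self, classes_per_group=(5, 5)):
        super().__init__()
        self.root = shallow_cnn(len(classes_per_group))            # picks a group
        self.leaves = nn.ModuleList(shallow_cnn(c) for c in classes_per_group)

    def forward(self, x):
        group = self.root(x).argmax(dim=1)                          # route per image
        # Only the chosen shallow leaf runs for each image, which is where the
        # energy/memory savings over a single very deep CNN come from.
        return [self.leaves[int(g)](xi.unsqueeze(0)) for g, xi in zip(group, x)]

model = TrunkLike()
outputs = model(torch.randn(2, 3, 32, 32))
print([o.shape for o in outputs])    # e.g., [torch.Size([1, 5]), torch.Size([1, 5])]
```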


Where Is My Spot? Few-Shot Image Generation Via Latent Subspace Optimization, Chenxi Zheng, Bangzhen Liu, Huaidong Zhang, Xuemiao Xu, Shengfeng He Jun 2023

Research Collection School Of Computing and Information Systems

Image generation relies on massive training data and can hardly produce diverse images of an unseen category from only a few examples. In this paper, we address this dilemma by projecting sparse few-shot samples into a continuous latent space that can potentially generate infinite unseen samples. The rationale behind this is that we aim to locate a centroid latent position in a conditional StyleGAN, where the corresponding output image on that centroid can maximize the similarity with the given samples. Although the given samples are unseen for the conditional StyleGAN, we assume the neighboring latent subspace around the centroid belongs to …


Bubbleu: Exploring Augmented Reality Game Design With Uncertain Ai-Based Interaction, Minji Kim, Kyungjin Lee, Rajesh Krishna Balan, Youngki Lee Apr 2023

Research Collection School Of Computing and Information Systems

Object detection, while being an attractive interaction method for Augmented Reality (AR), is fundamentally error-prone due to the probabilistic nature of the underlying AI models, resulting in sub-optimal user experiences. In this paper, we explore the effect of three game design concepts, Ambiguity, Transparency, and Controllability, to provide better gameplay experiences in AR games that use error-prone object detection-based interaction modalities. First, we developed a base AR pet breeding game, called Bubbleu that uses object detection as a key interaction method. We then implemented three different variants, each according to the three concepts, to investigate the impact of each design …


Observing Human Mobility Internationally During Covid-19, Shane Allcroft, Mohammed Metwaly, Zachery Berg, Isha Ghodgaonkar, Fischer Bordwell, Xinxin Zhao, Xinglei Liu, Jiahao Xu, Subhankar Chakraborty, Vishnu Banna, Akhil Chinnakotla, Abhinav Goel, Caleb Tung, Gore Kao, Wei Zakharov, David A. Shoham, George K. Thiruvathukal, Yung-Hsiang Lu Mar 2023

Computer Science: Faculty Publications and Other Works

This article analyzes visual data captured from five countries and three U.S. states to evaluate the effectiveness of lockdown policies for reducing the spread of COVID-19. The main challenge is the scale: nearly six million images are analyzed to observe how people respond to the policy changes.


Pose- And Attribute-Consistent Person Image Synthesis, Cheng Xu, Zejun Chen, Jiajie Mai, Xuemiao Xu, Shengfeng He Feb 2023

Research Collection School Of Computing and Information Systems

Person Image Synthesis aims at transferring the appearance of the source person image into a target pose. Existing methods cannot handle large pose variations and therefore suffer from two critical problems: (1) synthesis distortion due to the entanglement of pose and appearance information among different body components and (2) failure in preserving original semantics (e.g., the same outfit). In this article, we explicitly address these two problems by proposing a Pose- and Attribute-consistent Person Image Synthesis Network (PAC-GAN). To reduce pose and appearance matching ambiguity, we propose a component-wise transferring model consisting of two stages. The former stage focuses only on synthesizing target poses, while the latter renders target appearances by explicitly transferring the appearance information from the source image to the target image in a component-wise manner. In this way, source-target matching ambiguity is eliminated due to the component-wise disentanglement of pose and appearance synthesis. Second, to maintain attribute consistency, we represent the input image as an attribute vector and impose a high-level semantic constraint using this vector to regularize the target synthesis. Extensive experimental results on the DeepFashion dataset demonstrate the superiority of our method over the state of the art, especially for maintaining pose and attribute consistencies under large pose variations.


Towards A Framework For Privacy-Preserving Pedestrian Analysis, Anil Kunchala, Mélanie Bouroche, Bianca Schoen-Phelan Jan 2023

Conference papers

The design of pedestrian-friendly infrastructures plays a crucial role in creating sustainable transportation in urban environments. Analyzing pedestrian behaviour in response to existing infrastructure is pivotal to planning, maintaining, and creating more pedestrian-friendly facilities. Many approaches have been proposed to extract such behaviour by applying deep learning models to video data. Video data, however, includes a broad spectrum of privacy-sensitive information about individuals, such as their location at a given time or who they are with. Most of the existing models use privacy-invasive methodologies to track, detect, and analyse individual or group pedestrian behaviour patterns. As a step towards privacy-preserving …


A Multistage Framework For Detection Of Very Small Objects, Duleep Rathgamage Don, Ramazan Aygun, Mahmut Karakaya Jan 2023

Published and Grey Literature from PhD Candidates

Small object detection is one of the most challenging problems in computer vision. Algorithms based on state-of-the-art object detection methods such as R-CNN, SSD, FPN, and YOLO fail to detect objects of very small sizes. In this study, we propose a novel method to detect very small objects, smaller than 8×8 pixels, that appear in a complex background. The proposed method is a multistage framework consisting of an unsupervised algorithm and three separately trained supervised algorithms. The unsupervised algorithm extracts ROIs from a high-resolution image. Then the ROIs are upsampled using SRGAN, and the enhanced ROIs are detected by our …
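
As a structural sketch only (not the paper's method): the snippet below mimics the staging of the framework, with Otsu thresholding standing in for the unsupervised ROI extractor, bicubic interpolation standing in for SRGAN, and the trained detectors omitted. The input file name and size threshold are assumptions.

```python
import cv2

# Sketch of the staged structure only: (1) an unsupervised step proposes ROIs
# from a high-resolution image, (2) each ROI is upsampled before detection.
# Otsu thresholding stands in for the paper's ROI extraction, bicubic
# interpolation stands in for SRGAN, and the supervised detectors are omitted.
# "scene.png" is a hypothetical input image.

image = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

# Stage 1 (unsupervised ROI proposal): threshold + connected components.
_, mask = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
n, _, stats, _ = cv2.connectedComponentsWithStats(mask)

rois = []
for x, y, w, h, area in stats[1:]:            # skip background label 0
    if area <= 64:                            # keep only very small blobs (~8x8 px)
        rois.append(image[y:y + h, x:x + w])

# Stage 2 (enhancement before detection): upsample each ROI 4x.
enhanced = [cv2.resize(r, None, fx=4, fy=4, interpolation=cv2.INTER_CUBIC)
            for r in rois]

print(f"{len(enhanced)} candidate ROIs ready for the downstream detectors")
```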


Evolution Of Winning Solutions In The 2021 Low-Power Computer Vision Challenge, Xiao Hu, Ziteng Jiao, Ayden Kocher, Zhenyu Wu, Junjie Liu, James C. Davis, George K. Thiruvathukal, Yung-Hsiang Lu Jan 2023

Computer Science: Faculty Publications and Other Works

Mobile and embedded devices are becoming ubiquitous. Applications such as rescue with autonomous robots and event analysis on traffic cameras rely on devices with limited power supply and computational resources. Thus, the demand for efficient computer vision algorithms increases. Since 2015, we have organized the IEEE Low-Power Computer Vision Challenge to advance the state of the art in low-power computer vision. We describe how the competition is organized, including the challenge design, the reference solution, the dataset, the referee system, and the evolution of the solutions from two winning teams. We examine the winning teams' development patterns and design decisions, focusing …


Intellibeehive: An Automated Honey Bee, Pollen, And Varroa Destructor Monitoring System, Christian I. Narcia-Macias, Joselito Guardado, Jocell Rodriguez, Joanne Rampersad, Erik Enriquez, Dong-Chul Kim Jan 2023

Computer Science Faculty Publications and Presentations

Utilizing computer vision and the latest technological advancements, in this study, we developed a honey bee monitoring system that aims to enhance our understanding of Colony Collapse Disorder, honey bee behavior, population decline, and overall hive health. The system is positioned at the hive entrance providing real-time data, enabling beekeepers to closely monitor the hive's activity and health through an account-based website. Using machine learning, our monitoring system can accurately track honey bees, monitor pollen-gathering activity, and detect Varroa mites, all without causing any disruption to the honey bees. Moreover, we have ensured that the development of this monitoring system …


Towards A Machine Learning-Based Digital Twin For Non-Invasive Human Bio-Signal Fusion, Izaldein Al-Zyoud, Fedwa Laamarti, Xiaocong Ma, Diana Tobón, Abdulmotaleb Elsaddik Dec 2022

Computer Vision Faculty Publications

Human bio-signal fusion is considered a critical technological solution that needs to be advanced to enable modern and secure digital health and well-being applications in the metaverse. To support such efforts, we propose a new data-driven digital twin (DT) system to fuse three human physiological bio-signals: heart rate (HR), breathing rate (BR), and blood oxygen saturation level (SpO2). To accomplish this goal, we design a computer vision technology based on the non-invasive photoplethysmography (PPG) technique to extract raw time-series bio-signal data from facial video frames. Then, we implement machine learning (ML) technology to model and measure the bio-signals. We accurately …
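
The digital-twin system and its ML models are not reproduced here; the sketch below only illustrates the non-invasive PPG principle the abstract builds on, recovering a heart-rate estimate from the mean green-channel intensity of a fixed, assumed face region across video frames. The file name, ROI coordinates, and frequency band are assumptions.

```python
import cv2
import numpy as np
from scipy.signal import butter, filtfilt

# Sketch of the PPG principle only: track the mean green-channel intensity of
# a fixed, assumed face region across frames, band-pass filter it to the
# plausible heart-rate range, and read HR from the FFT peak. The paper's
# digital-twin system and its ML models are not reproduced.

cap = cv2.VideoCapture("face_video.mp4")            # hypothetical video
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
x, y, w, h = 200, 150, 120, 120                      # assumed face ROI

signal = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[y:y + h, x:x + w]
    signal.append(roi[:, :, 1].mean())               # mean green-channel value
cap.release()

sig = np.asarray(signal) - np.mean(signal)
b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")  # ~42-240 bpm
filtered = filtfilt(b, a, sig)

freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
spectrum = np.abs(np.fft.rfft(filtered))
hr_bpm = 60.0 * freqs[np.argmax(spectrum)]
print(f"estimated heart rate: {hr_bpm:.1f} bpm")
```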


Transresnet: Integrating The Strengths Of Vits And Cnns For High Resolution Medical Image Segmentation Via Feature Grafting, Muhammad Hamza Sharif, Dmitry Demidov, Asif Hanif, Mohammad Yaqub, Min Xu Nov 2022

Computer Vision Faculty Publications

High-resolution images are preferable in medical imaging domain as they significantly improve the diagnostic capability of the underlying method. In particular, high resolution helps substantially in improving automatic image segmentation. However, most of the existing deep learning-based techniques for medical image segmentation are optimized for input images having small spatial dimensions and perform poorly on high-resolution images. To address this shortcoming, we propose a parallel-in-branch architecture called TransResNet, which incorporates Transformer and CNN in a parallel manner to extract features from multi-resolution images independently. In TransResNet, we introduce Cross Grafting Module (CGM), which generates the grafted features, enriched in both …


A Robust Normalizing Flow Using Bernstein-Type Polynomials, Sameera Ramasinghe, Kasun Fernando, Salman Khan, Nick Barnes Nov 2022

Computer Vision Faculty Publications

Modeling real-world distributions can often be challenging due to sample data that are subjected to perturbations, e.g., instrumentation errors, or added random noise. Since flow models are typically nonlinear algorithms, they amplify these initial errors, leading to poor generalizations. This paper proposes a framework to construct Normalizing Flows (NFs) which demonstrate higher robustness against such initial errors. To this end, we utilize Bernstein-type polynomials inspired by the optimal stability of the Bernstein basis. Further, compared to the existing NF frameworks, our method provides compelling advantages like theoretical upper bounds for the approximation error, better suitability for compactly supported densities, and …
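
For reference, the Bernstein basis polynomials underlying Bernstein-type constructions are defined as follows (a textbook definition added for context, not a result from this paper).

```latex
% Bernstein basis polynomials of degree n on [0, 1]:
\[
  b_{k,n}(x) = \binom{n}{k} x^{k} (1 - x)^{\,n-k}, \qquad k = 0, 1, \dots, n,
\]
% and the associated Bernstein polynomial approximation of a function f:
\[
  B_n(f)(x) = \sum_{k=0}^{n} f\!\left(\tfrac{k}{n}\right) b_{k,n}(x).
\]
```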


Face Pyramid Vision Transformer, Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood Nov 2022

Computer Vision Faculty Publications

A novel Face Pyramid Vision Transformer (FPVT) is proposed to learn discriminative multi-scale facial representations for face recognition and verification. In FPVT, Face Spatial Reduction Attention (FSRA) and Dimensionality Reduction (FDR) layers are employed to make the feature maps compact, thus reducing the computations. An Improved Patch Embedding (IPE) algorithm is proposed to exploit the benefits of CNNs in ViTs (e.g., shared weights, local context, and receptive fields) to model lower-level edges to higher-level semantic primitives. Within the FPVT framework, a Convolutional Feed-Forward Network (CFFN) is proposed that extracts locality information to learn low-level facial information. The proposed FPVT …