
Physical Sciences and Mathematics Commons


Artificial Intelligence and Robotics

Computer vision


Articles 1 - 30 of 115

Full-Text Articles in Physical Sciences and Mathematics

A Computer Vision Solution To Cross-Cultural Food Image Classification And Nutrition Logging, Rohan Sethi, George K. Thiruvathukal Apr 2024


Computer Science: Faculty Publications and Other Works

The US is a culturally and ethnically diverse country, and with this diversity comes a myriad of cuisines and eating habits that extend well beyond those of Western culture. Each of these meals has its own positive and negative effects on nutritional value and potential impact on human health. Thus, there is a greater need for people to be able to access the nutritional profile of their diverse daily meals and better manage their health. A revolutionary solution to democratize food image classification and nutritional logging is using deep learning to extract that information from …
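
A minimal sketch of how such a food-image classifier could be built, assuming a PyTorch/torchvision setup; the dataset folder, class count, and backbone are illustrative placeholders, not the authors' actual pipeline:

```python
# Hedged sketch: fine-tune a pretrained CNN for multi-cuisine food classification.
# The dataset path and number of classes are illustrative placeholders.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_CLASSES = 50  # e.g., dishes spanning several cuisines (placeholder)

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Expects an ImageFolder layout: food_images/<class_name>/<image>.jpg
dataset = datasets.ImageFolder("food_images", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # replace the classifier head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```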


Automatic Classification Of Activities In Classroom Videos, Jonathan K. Foster, Matthew Korban, Peter Youngs, Ginger S. Watson, Scott T. Acton Jan 2024


VMASC Publications

Classroom videos are a common source of data for educational researchers studying classroom interactions, as well as a resource for teacher education and professional development. Over the last several decades, emerging technologies have been applied to classroom videos to record, transcribe, and analyze classroom interactions. With the rise of machine learning, we report on the development and validation of neural networks to classify instructional activities using video signals, without analyzing speech or audio features, from a large corpus of nearly 250 hours of classroom videos from elementary mathematics and English language arts instruction. Results indicated that the neural networks performed …


A Survey On Few-Shot Class-Incremental Learning, Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Xin Ning, Prayag Tiwari Jan 2024


Computer Science Faculty Publications

Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks: learning new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily lead to catastrophic forgetting and overfitting, severely affecting model performance. Studying FSCIL helps overcome deep learning models' limitations on data volume and acquisition time, while improving the practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental …
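
To make the FSCIL setup concrete, here is a hedged sketch of one common baseline family the survey covers: a frozen feature extractor with nearest-class-mean prototypes, where new classes are added from a few shots without touching old ones. The embedding function below is a stand-in, not a method from the paper:

```python
# Hedged sketch: prototype-based few-shot class-incremental classification.
# New classes are added by storing their mean embedding; old prototypes are
# untouched, so previously learned classes are not overwritten.
import numpy as np

class PrototypeClassifier:
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn          # maps an image array -> feature vector
        self.prototypes = {}              # class_id -> mean embedding

    def add_class(self, class_id, few_shot_images):
        feats = np.stack([self.embed_fn(x) for x in few_shot_images])
        self.prototypes[class_id] = feats.mean(axis=0)

    def predict(self, image):
        f = self.embed_fn(image)
        dists = {c: np.linalg.norm(f - p) for c, p in self.prototypes.items()}
        return min(dists, key=dists.get)

# Usage with a dummy embedding (a real system would use a pretrained CNN):
clf = PrototypeClassifier(embed_fn=lambda img: img.reshape(-1).astype(float))
clf.add_class("cat", [np.ones((4, 4))])      # base session
clf.add_class("dog", [np.zeros((4, 4))])     # later session, few labeled samples
print(clf.predict(np.ones((4, 4)) * 0.9))    # -> "cat"
```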


Implementation Of Adas And Autonomy On Unlv Campus, Zillur Rahman Dec 2023


UNLV Theses, Dissertations, Professional Papers, and Capstones

The integration of Advanced Driving Assistance Systems (ADAS) and autonomous driving functionalities into contemporary vehicles has surged notably, driven by the remarkable progress in artificial intelligence (AI). These AI systems, which learn from real-world data, can now perceive their surroundings via a suite of sensors, plan optimal routes from source to destination, and control the vehicle much like a human driver.

Within the context of this thesis, we undertake a comprehensive exploration of three distinct yet interrelated ADAS and autonomy projects. Our central objective is the implementation of autonomous driving (AD) technology on the UNLV campus, culminating in …


Smart Street Light Control: A Review On Methods, Innovations, And Extended Applications, Fouad Agramelal, Mohamed Sadik, Youssef Moubarak, Saad Abouzahir Nov 2023


Computer Vision Faculty Publications

As urbanization increases, streetlights have become significant consumers of electrical power, making it imperative to develop effective control methods for sustainability. This paper offers a comprehensive review of control methods for smart streetlight systems, setting itself apart by introducing a novel light scheme framework that provides a structured classification of various light control patterns, thus filling an existing gap in the literature. Unlike previous studies, this work delves into the technical specifics of individual research papers and methodologies, ranging from basic to advanced control methods such as computer vision and deep learning, while also assessing the energy consumption associated with each …
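
A minimal sketch of the kind of basic adaptive control pattern such a review categorizes: dim when the street is empty, raise brightness when road users are detected. The thresholds and levels below are illustrative assumptions, not values from the paper:

```python
# Hedged sketch of a basic adaptive streetlight control rule.
def streetlight_brightness(num_pedestrians: int, num_vehicles: int,
                           ambient_lux: float) -> float:
    """Return a lamp brightness level in [0, 1]."""
    if ambient_lux > 50.0:               # daylight: lamp off
        return 0.0
    if num_pedestrians == 0 and num_vehicles == 0:
        return 0.3                        # standby dim level saves energy
    if num_pedestrians > 0:
        return 1.0                        # pedestrians need full illumination
    return 0.7                            # vehicles only: intermediate level

# Example: a camera/detector reports 2 pedestrians at night.
print(streetlight_brightness(num_pedestrians=2, num_vehicles=0, ambient_lux=1.0))
```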


Object Recognition With Deep Neural Networks In Low-End Systems, Lillian Davis Oct 2023


Mahurin Honors College Capstone Experience/Thesis Projects

Object recognition is an important area in computer vision. It has been advanced significantly by deep learning, which unifies feature extraction and classification. In general, deep neural networks, such as Convolutional Neural Networks (CNNs), are trained on high-performance systems. Aiming to extend the reach of deep learning to personal computing, I propose a study of deep learning-based object recognition in low-end systems, such as laptops. This research examines how different layer configurations and hyperparameter values in CNNs can either create or resolve the issue of overfitting and how they affect the final accuracy of object recognition systems. The main contribution …
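
A hedged sketch of the kind of compact CNN such a study might vary on a laptop-class machine, where layer widths and the dropout rate are the hyperparameters traded off against overfitting; the specific sizes are assumptions:

```python
# Hedged sketch: a small CNN whose width and dropout are tuning knobs.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10, width=16, dropout=0.3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(width, width * 2, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(dropout),                          # regularization knob
            nn.Linear(width * 2 * 8 * 8, num_classes),    # assumes 32x32 inputs
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: a CIFAR-sized batch on CPU (no GPU required).
model = SmallCNN(width=16, dropout=0.5)
logits = model(torch.randn(4, 3, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```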


Multiclass Confidence And Localization Calibration For Object Detection, Bimsara Pathiraja, Malitha Gunawardhana, Muhammad Haris Khan Aug 2023


Computer Vision Faculty Publications

Albeit achieving high predictive accuracy across many challenging computer vision problems, recent studies suggest that deep neural networks (DNNs) tend to make over-confident predictions, rendering them poorly calibrated. Most existing attempts at improving DNN calibration are limited to classification tasks and restricted to calibrating in-domain predictions. Surprisingly, few if any attempts have been made to study the calibration of object detection methods, which occupy a pivotal space in vision-based security-sensitive and safety-critical applications. In this paper, we propose a new train-time technique for calibrating modern object detection methods. It is capable of jointly calibrating multiclass confidence and …
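
As a hedged illustration of the general train-time calibration idea (not the paper's formulation), one can add an auxiliary penalty that pushes a batch's mean confidence toward its accuracy alongside the task loss:

```python
# Hedged sketch: a generic train-time calibration penalty added to the task loss.
import torch
import torch.nn.functional as F

def calibration_penalty(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    probs = F.softmax(logits, dim=1)
    confidences, predictions = probs.max(dim=1)
    accuracy = (predictions == labels).float().mean()
    return (confidences.mean() - accuracy).abs()

def total_loss(logits, labels, lam=0.1):
    # Task loss plus a weighted calibration term.
    return F.cross_entropy(logits, labels) + lam * calibration_penalty(logits, labels)

# Example with random per-class scores for a 5-way classifier.
logits = torch.randn(8, 5, requires_grad=True)
labels = torch.randint(0, 5, (8,))
loss = total_loss(logits, labels)
loss.backward()
print(float(loss))
```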


3d-Aware Multi-Class Image-To-Image Translation With Nerfs, Senmao Li, Joost Van De Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang Aug 2023


Computer Vision Faculty Publications

Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However, no prior work investigates 3D-aware GANs for 3D-consistent multiclass image-to-image (3D-aware I2I) translation. Naively using 2D-I2I translation methods suffers from unrealistic shape/identity changes. To perform 3D-aware multiclass I2I translation, we decouple this learning process into a multiclass 3D-aware GAN step and a 3D-aware I2I translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multiclass 3D-aware GAN architecture, which preserves view-consistency, we …


Discriminative Co-Saliency And Background Mining Transformer For Co-Salient Object Detection, Long Li, Junwei Han, Ni Zhang, Nian Liu, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan Aug 2023


Computer Vision Faculty Publications

Most previous co-salient object detection works mainly focus on extracting co-salient cues by mining consistency relations across images while ignoring explicit exploration of background regions. In this paper, we propose a Discriminative co-saliency and background Mining Transformer framework (DMT) based on several economical multi-grained correlation modules to explicitly mine both co-saliency and background information and effectively model their discrimination. Specifically, we first propose a region-to-region correlation module for introducing inter-image relations to pixel-wise segmentation features while maintaining computational efficiency. Then, we use two types of pre-defined tokens to mine co-saliency and background information via our proposed contrast-induced pixel-to-token correlation …


Autonomous Shipwreck Detection & Mapping, William Ard Aug 2023

Autonomous Shipwreck Detection & Mapping, William Ard

LSU Master's Theses

This thesis presents the development and testing of Bruce, a low-cost hybrid Remotely Operated Vehicle (ROV) / Autonomous Underwater Vehicle (AUV) system for the optical survey of marine archaeological sites, as well as a novel sonar image augmentation strategy for semantic segmentation of shipwrecks. This approach takes side-scan sonar and bathymetry data collected using an EdgeTech 2205 AUV sensor integrated with a Harris Iver3, and generates augmented image data to be used for the semantic segmentation of shipwrecks. It is shown that, due to the feature enhancement capabilities of the proposed shipwreck detection strategy, correctly identified areas have a 15% …


Fine-Grained Domain Adaptive Crowd Counting Via Point-Derived Segmentation, Yongtuo Liu, Dan Xu, Sucheng Ren, Hanjie Wu, Hongmin Cai, Shengfeng He Jul 2023


Research Collection School Of Computing and Information Systems

Due to domain shift, a large performance drop is usually observed when a trained crowd counting model is deployed in the wild. While existing domain-adaptive crowd counting methods achieve promising results, they typically regard each crowd image as a whole and reduce domain discrepancies in a holistic manner, thus limiting further improvement of domain adaptation performance. To this end, we propose to untangle domain-invariant crowd and domain-specific background from crowd images and design a fine-grained domain adaptation method for crowd counting. Specifically, to disentangle crowd from background, we propose to learn crowd segmentation from point-level crowd counting annotations in a …
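
A hedged sketch of one simple way point-level counting annotations can be turned into coarse crowd/background pseudo-masks: each annotated head point is expanded into a small disk. The disk radius and image size are illustrative assumptions, not the paper's procedure:

```python
# Hedged sketch: derive a coarse crowd mask from point annotations.
import numpy as np

def points_to_mask(points, height, width, radius=8):
    """points: iterable of (row, col) head annotations -> binary crowd mask."""
    mask = np.zeros((height, width), dtype=np.uint8)
    rows, cols = np.mgrid[0:height, 0:width]
    for r, c in points:
        mask[(rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2] = 1
    return mask

# Example: two annotated people in a 64x64 crop.
mask = points_to_mask([(20, 20), (40, 45)], height=64, width=64)
print(mask.sum(), "crowd pixels vs", mask.size - mask.sum(), "background pixels")
```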


Curricular Contrastive Regularization For Physics-Aware Single Image Dehazing, Yu Zheng, Jiahui Zhan, Shengfeng He, Yong Du Jun 2023


Research Collection School Of Computing and Information Systems

Considering the ill-posed nature, contrastive regularization has been developed for single image dehazing, introducing the information from negative images as a lower bound. However, the contrastive samples are non-consensual, as the negatives are usually represented distantly from the clear (i.e., positive) image, leaving the solution space still under-constricted. Moreover, the interpretability of deep dehazing models is underexplored towards the physics of the hazing process. In this paper, we propose a novel curricular contrastive regularization targeted at a consensual contrastive space as opposed to a non-consensual one. Our negatives, which provide better lower-bound constraints, can be assembled from 1) the hazy …
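
A hedged sketch of the underlying contrastive-regularization idea in feature space: the restored image is pulled toward the clear (positive) image and pushed away from negatives such as the hazy input. The tiny feature extractor is a stand-in for the pretrained network typically used; this is not the paper's curricular formulation:

```python
# Hedged sketch: generic contrastive regularization for image restoration.
import torch
import torch.nn as nn

feature_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())  # placeholder

def contrastive_reg(restored, clear, negatives, eps=1e-6):
    f_r, f_c = feature_net(restored), feature_net(clear)
    pull = nn.functional.l1_loss(f_r, f_c)                   # toward the positive
    push = sum(nn.functional.l1_loss(f_r, feature_net(n)) for n in negatives)
    return pull / (push / len(negatives) + eps)              # small when far from negatives

# Example tensors standing in for restored, clear, and hazy images.
restored = torch.rand(1, 3, 64, 64, requires_grad=True)
clear = torch.rand(1, 3, 64, 64)
hazy = torch.rand(1, 3, 64, 64)
loss = contrastive_reg(restored, clear, negatives=[hazy])
loss.backward()
print(float(loss))
```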


Where Is My Spot? Few-Shot Image Generation Via Latent Subspace Optimization, Chenxi Zheng, Bangzhen Liu, Huaidong Zhang, Xuemiao Xu, Shengfeng He Jun 2023


Research Collection School Of Computing and Information Systems

Image generation relies on massive training data and can hardly produce diverse images of an unseen category from only a few examples. In this paper, we address this dilemma by projecting sparse few-shot samples into a continuous latent space that can potentially generate infinite unseen samples. The rationale behind this is that we aim to locate a centroid latent position in a conditional StyleGAN, where the corresponding output image on that centroid can maximize the similarity with the given samples. Although the given samples are unseen for the conditional StyleGAN, we assume the neighboring latent subspace around the centroid belongs to …
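
A hedged sketch of the general latent-optimization idea: search for a latent code whose generated image is maximally similar to the few given samples. The generator below is a tiny stand-in, not a conditional StyleGAN, and the similarity measure is a plain pixel loss:

```python
# Hedged sketch: optimize a latent code toward a few reference samples.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(16, 3 * 8 * 8), nn.Tanh())  # placeholder G(z)

def find_centroid_latent(few_shot_images, steps=200, lr=0.05):
    z = torch.zeros(1, 16, requires_grad=True)            # latent to optimize
    opt = torch.optim.Adam([z], lr=lr)
    targets = torch.stack(few_shot_images)                 # (k, 3*8*8) flattened
    for _ in range(steps):
        opt.zero_grad()
        out = generator(z)                                 # generated image
        loss = ((out - targets) ** 2).mean()               # similarity to all samples
        loss.backward()
        opt.step()
    return z.detach()

# Example with three random "samples" of the unseen category.
samples = [torch.rand(3 * 8 * 8) * 2 - 1 for _ in range(3)]
z_star = find_centroid_latent(samples)
print(z_star.shape)  # torch.Size([1, 16])
```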


Tree-Based Unidirectional Neural Networks For Low-Power Computer Vision, Abhinav Goel, Caleb Tung, Nick Eliopoulos, Amy Wang, Jamie C. Davis, George K. Thiruvathukal, Yung-Hsiang Lu Jun 2023


Computer Science: Faculty Publications and Other Works

This article describes the novel Tree-based Unidirectional Neural Network (TRUNK) architecture. This architecture improves computer vision efficiency by using a hierarchy of multiple shallow Convolutional Neural Networks (CNNs), instead of a single very deep CNN. We demonstrate this architecture’s versatility in performing different computer vision tasks efficiently on embedded devices. Across various computer vision tasks, the TRUNK architecture consumes 65% less energy and requires 50% less memory than representative low-power CNN architectures, e.g., MobileNet v2, when deployed on the NVIDIA Jetson Nano.
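
A hedged sketch of hierarchical inference with a tree of shallow CNNs: a root network picks a coarse group, and only the matching child network runs, so each input touches a small fraction of the total parameters. The module sizes and the two-level tree are illustrative, not the published TRUNK model:

```python
# Hedged sketch: routing an input through a two-level tree of shallow CNNs.
import torch
import torch.nn as nn

def shallow_cnn(num_outputs):
    return nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, num_outputs),
    )

root = shallow_cnn(2)                                        # 2 coarse groups
children = nn.ModuleList([shallow_cnn(5), shallow_cnn(5)])   # 5 fine classes each

def predict(x):
    group = root(x).argmax(dim=1).item()                     # route to one child only
    fine = children[group](x).argmax(dim=1).item()
    return group, fine

with torch.no_grad():
    print(predict(torch.randn(1, 3, 32, 32)))
```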


Bubbleu: Exploring Augmented Reality Game Design With Uncertain Ai-Based Interaction, Minji Kim, Kyungjin Lee, Rajesh Krishna Balan, Youngki Lee Apr 2023


Research Collection School Of Computing and Information Systems

Object detection, while being an attractive interaction method for Augmented Reality (AR), is fundamentally error-prone due to the probabilistic nature of the underlying AI models, resulting in sub-optimal user experiences. In this paper, we explore the effect of three game design concepts, Ambiguity, Transparency, and Controllability, to provide better gameplay experiences in AR games that use error-prone object detection-based interaction modalities. First, we developed a base AR pet breeding game, called Bubbleu that uses object detection as a key interaction method. We then implemented three different variants, each according to the three concepts, to investigate the impact of each design …


Observing Human Mobility Internationally During Covid-19, Shane Allcroft, Mohammed Metwaly, Zachery Berg, Isha Ghodgaonkar, Fischer Bordwell, Xinxin Zhao, Xinglei Liu, Jiahao Xu, Subhankar Chakraborty, Vishnu Banna, Akhil Chinnakotla, Abhinav Goel, Caleb Tung, Gore Kao, Wei Zakharov, David A. Shoham, George K. Thiruvathukal, Yung-Hsiang Lu Mar 2023


Computer Science: Faculty Publications and Other Works

This article analyzes visual data captured from five countries and three U.S. states to evaluate the effectiveness of lockdown policies for reducing the spread of COVID-19. The main challenge is the scale: nearly six million images are analyzed to observe how people respond to the policy changes.


Pose- And Attribute-Consistent Person Image Synthesis, Cheng Xu, Zejun Chen, Jiajie Mai, Xuemiao Xu, Shengfeng He Feb 2023


Research Collection School Of Computing and Information Systems

Person Image Synthesis aims at transferring the appearance of the source person image into a target pose. Existing methods cannot handle large pose variations and therefore suffer from two critical problems: (1) synthesis distortion due to the entanglement of pose and appearance information among different body components and (2) failure in preserving original semantics (e.g., the same outfit). In this article, we explicitly address these two problems by proposing a Pose- and Attribute-consistent Person Image Synthesis Network (PAC-GAN). To reduce pose and appearance matching ambiguity, we propose a component-wise transferring model consisting of two stages. The former stage focuses only on synthesizing target poses, while the latter renders target appearances by explicitly transferring the appearance information from the source image to the target image in a component-wise manner. In this way, source-target matching ambiguity is eliminated due to the component-wise disentanglement of pose and appearance synthesis. Second, to maintain attribute consistency, we represent the input image as an attribute vector and impose a high-level semantic constraint using this vector to regularize the target synthesis. Extensive experimental results on the DeepFashion dataset demonstrate the superiority of our method over the state of the art, especially for maintaining pose and attribute consistencies under large pose variations.


A Multistage Framework For Detection Of Very Small Objects, Duleep Rathgamage Don, Ramazan Aygun, Mahmut Karakaya Jan 2023


Published and Grey Literature from PhD Candidates

Small object detection is one of the most challenging problems in computer vision. Algorithms based on state-of-the-art object detection methods such as R-CNN, SSD, FPN, and YOLO fail to detect objects of very small sizes. In this study, we propose a novel method to detect very small objects, smaller than 8×8 pixels, that appear in a complex background. The proposed method is a multistage framework consisting of an unsupervised algorithm and three separately trained supervised algorithms. The unsupervised algorithm extracts regions of interest (ROIs) from a high-resolution image. Then the ROIs are upsampled using SRGAN, and the enhanced ROIs are detected by our …
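
A hedged sketch of the multistage idea: an unsupervised step proposes ROIs, each ROI is upsampled (bicubic resizing here as a stand-in for SRGAN), and a detector would then run on the enhanced crop. The thresholding-based proposal step and area limits are assumptions for illustration:

```python
# Hedged sketch: propose small ROIs without supervision, then enhance them.
import cv2
import numpy as np

def extract_rois(image_gray, min_area=4, max_area=64):
    """Unsupervised ROI proposals via thresholding + connected components."""
    _, binary = cv2.threshold(image_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    return [stats[i, :4] for i in range(1, n)
            if min_area <= stats[i, cv2.CC_STAT_AREA] <= max_area]

def enhance(roi, scale=4):
    return cv2.resize(roi, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)

# Example on a synthetic image containing a 6x6 "object".
img = np.zeros((128, 128), dtype=np.uint8)
img[60:66, 70:76] = 255
for x, y, w, h in extract_rois(img):
    crop = enhance(img[y:y + h, x:x + w])
    print("ROI", (x, y, w, h), "upsampled to", crop.shape)  # detector would run here
```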


Evolution Of Winning Solutions In The 2021 Low-Power Computer Vision Challenge, Xiao Hu, Ziteng Jiao, Ayden Kocher, Zhenyu Wu, Junjie Liu, James C. Davis, George K. Thiruvathukal, Yung-Hsiang Lu Jan 2023


Computer Science: Faculty Publications and Other Works

Mobile and embedded devices are becoming ubiquitous. Applications such as rescue with autonomous robots and event analysis on traffic cameras rely on devices with limited power supply and computational resources. Thus, the demand for efficient computer vision algorithms increases. Since 2015, we have organized the IEEE Low-Power Computer Vision Challenge to advance the state of the art in low-power computer vision. We describe the competition organizing details including the challenge design, the reference solution, the dataset, the referee system, and the evolution of the solutions from two winning teams. We examine the winning teams’ development patterns and design decisions, focusing …


Towards A Machine Learning-Based Digital Twin For Non-Invasive Human Bio-Signal Fusion, Izaldein Al-Zyoud, Fedwa Laamarti, Xiaocong Ma, Diana Tobón, Abdulmotaleb Elsaddik Dec 2022


Computer Vision Faculty Publications

Human bio-signal fusion is considered a critical technological solution that needs to be advanced to enable modern and secure digital health and well-being applications in the metaverse. To support such efforts, we propose a new data-driven digital twin (DT) system to fuse three human physiological bio-signals: heart rate (HR), breathing rate (BR), and blood oxygen saturation level (SpO2). To accomplish this goal, we design a computer vision technology based on the non-invasive photoplethysmography (PPG) technique to extract raw time-series bio-signal data from facial video frames. Then, we implement machine learning (ML) technology to model and measure the bio-signals. We accurately …
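
A hedged sketch of non-invasive PPG extraction of the kind described: average the green channel over a face region in each frame to form a raw signal, band-pass it to plausible heart-rate frequencies, and read the dominant frequency. Real systems add face tracking and more robust processing; the values below are illustrative:

```python
# Hedged sketch: estimate heart rate from a cropped-face video clip via PPG.
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_heart_rate(face_frames, fps=30.0):
    """face_frames: array (T, H, W, 3) of cropped face images (RGB)."""
    raw = face_frames[..., 1].mean(axis=(1, 2))           # mean green value per frame
    raw = raw - raw.mean()
    b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")  # 42-240 bpm
    filtered = filtfilt(b, a, raw)
    spectrum = np.abs(np.fft.rfft(filtered))
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    return freqs[spectrum.argmax()] * 60.0                # beats per minute

# Example: synthetic 10-second clip with a 1.2 Hz (72 bpm) intensity oscillation.
t = np.arange(0, 10, 1 / 30.0)
frames = 100 + 2 * np.sin(2 * np.pi * 1.2 * t)[:, None, None, None] * np.ones((1, 8, 8, 3))
print(round(estimate_heart_rate(frames), 1))              # ~72.0
```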


Transresnet: Integrating The Strengths Of Vits And Cnns For High Resolution Medical Image Segmentation Via Feature Grafting, Muhammad Hamza Sharif, Dmitry Demidov, Asif Hanif, Mohammad Yaqub, Min Xu Nov 2022


Computer Vision Faculty Publications

High-resolution images are preferable in the medical imaging domain as they significantly improve the diagnostic capability of the underlying method. In particular, high resolution helps substantially in improving automatic image segmentation. However, most of the existing deep learning-based techniques for medical image segmentation are optimized for input images having small spatial dimensions and perform poorly on high-resolution images. To address this shortcoming, we propose a parallel-in-branch architecture called TransResNet, which incorporates Transformer and CNN in a parallel manner to extract features from multi-resolution images independently. In TransResNet, we introduce the Cross Grafting Module (CGM), which generates the grafted features, enriched in both …
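
A hedged sketch of the parallel-branch idea: a CNN branch and a transformer branch process the input independently and their features are fused ("grafted" here by simple concatenation). The tiny modules are stand-ins, not the TransResNet architecture or its Cross Grafting Module:

```python
# Hedged sketch: parallel CNN + transformer branches fused for segmentation.
import torch
import torch.nn as nn

class ParallelBranchSeg(nn.Module):
    def __init__(self, num_classes=2, dim=32):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1), nn.ReLU())
        self.patchify = nn.Conv2d(3, dim, kernel_size=8, stride=8)     # 8x8 patches
        encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=1)
        self.head = nn.Conv2d(dim * 2, num_classes, 1)                  # fuse + predict

    def forward(self, x):
        local_feat = self.cnn(x)                                        # (B, dim, H, W)
        tokens = self.patchify(x).flatten(2).transpose(1, 2)            # (B, N, dim)
        global_feat = self.transformer(tokens).transpose(1, 2)
        h, w = x.shape[2] // 8, x.shape[3] // 8
        global_feat = global_feat.reshape(x.size(0), -1, h, w)
        global_feat = nn.functional.interpolate(global_feat, size=x.shape[2:])
        fused = torch.cat([local_feat, global_feat], dim=1)             # feature grafting
        return self.head(fused)

model = ParallelBranchSeg()
print(model(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 2, 64, 64])
```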


A Robust Normalizing Flow Using Bernstein-Type Polynomials, Sameera Ramasinghe, Kasun Fernando, Salman Khan, Nick Barnes Nov 2022


Computer Vision Faculty Publications

Modeling real-world distributions can often be challenging due to sample data that are subjected to perturbations, e.g., instrumentation errors, or added random noise. Since flow models are typically nonlinear algorithms, they amplify these initial errors, leading to poor generalizations. This paper proposes a framework to construct Normalizing Flows (NFs) which demonstrate higher robustness against such initial errors. To this end, we utilize Bernstein-type polynomials inspired by the optimal stability of the Bernstein basis. Further, compared to the existing NF frameworks, our method provides compelling advantages like theoretical upper bounds for the approximation error, better suitability for compactly supported densities, and …
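
A hedged illustration of the Bernstein ingredient only: a monotone map on [0, 1] built from the Bernstein basis, where increasing coefficients give an invertible (flow-usable) transformation. This shows the basis, not the paper's full normalizing-flow construction:

```python
# Hedged sketch: evaluate a monotone Bernstein-polynomial transform on [0, 1].
import numpy as np
from scipy.special import comb

def bernstein_transform(x, coeffs):
    """Evaluate sum_k c_k * B_{k,n}(x) for x in [0, 1]."""
    n = len(coeffs) - 1
    k = np.arange(n + 1)
    basis = comb(n, k) * np.power.outer(x, k) * np.power.outer(1 - x, n - k)
    return basis @ coeffs

# Strictly increasing coefficients -> strictly increasing, hence invertible, map.
coeffs = np.array([0.0, 0.1, 0.4, 0.8, 1.0])
x = np.linspace(0, 1, 5)
print(np.round(bernstein_transform(x, coeffs), 3))
```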


Face Pyramid Vision Transformer, Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood Nov 2022


Computer Vision Faculty Publications

A novel Face Pyramid Vision Transformer (FPVT) is proposed to learn discriminative multi-scale facial representations for face recognition and verification. In FPVT, Face Spatial Reduction Attention (FSRA) and Dimensionality Reduction (FDR) layers are employed to make the feature maps compact, thus reducing the computations. An Improved Patch Embedding (IPE) algorithm is proposed to exploit the benefits of CNNs in ViTs (e.g., shared weights, local context, and receptive fields) to model lower-level edges to higher-level semantic primitives. Within the FPVT framework, a Convolutional Feed-Forward Network (CFFN) is proposed that extracts locality information to learn low-level facial information. The proposed FPVT …


How To Train Vision Transformer On Small-Scale Datasets?, Hanan Gani, Muzammal Naseer, Mohammad Yaqub Nov 2022


Computer Vision Faculty Publications

Vision Transformer (ViT), a radically different architecture from convolutional neural networks, offers multiple advantages including design simplicity, robustness, and state-of-the-art performance on many vision tasks. However, in contrast to convolutional neural networks, Vision Transformer lacks inherent inductive biases. Therefore, successful training of such models is mainly attributed to pre-training on large-scale datasets such as ImageNet with 1.2M or JFT with 300M images. This hinders the direct adaptation of Vision Transformer to small-scale datasets. In this work, we show that self-supervised inductive biases can be learned directly from small-scale datasets and serve as an effective weight initialization scheme for fine-tuning. This …


Maximum Spatial Perturbation Consistency For Unpaired Image-To-Image Translation, Yanwu Xu, Shaoan Xie, Wenhao Wu, Kun Zhang, Mingming Gong, Kayhan Batmanghelich Sep 2022


Machine Learning Faculty Publications

Unpaired image-to-image translation (I2I) is an ill-posed problem, as an infinite number of translation functions can map the source domain distribution to the target distribution. Therefore, much effort has been put into designing suitable constraints, e.g., cycle consistency (CycleGAN), geometry consistency (GCGAN), and contrastive learning-based constraints (CUTGAN), that help better pose the problem. However, these well-known constraints have limitations: (1) they are either too restrictive or too weak for specific I2I tasks; (2) these methods result in content distortion when there is a significant spatial variation between the source and target domains. This paper proposes a universal regularization technique called …
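
A hedged sketch of a spatial-perturbation consistency constraint of the kind proposed here: applying a spatial transform T before or after the translator G should give the same result, i.e. G(T(x)) ≈ T(G(x)). A horizontal flip stands in for the general perturbation, and G is a placeholder network rather than the paper's model:

```python
# Hedged sketch: penalize the translator when it fails to commute with a
# spatial perturbation T.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))       # placeholder translator

def spatial_perturbation_consistency(x):
    T = lambda img: torch.flip(img, dims=[3])           # example spatial perturbation
    return nn.functional.l1_loss(G(T(x)), T(G(x)))      # commute-with-T penalty

x = torch.rand(2, 3, 64, 64)
loss = spatial_perturbation_consistency(x)
loss.backward()
print(float(loss))
```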


Towards The Development Of A Cost-Effective Image-Sensing-Smart-Parking Systems (Isensmap), Aakriti Sharma Aug 2022


Electronic Thesis and Dissertation Repository

Finding parking in a busy city is a major daily problem. Researchers have proposed various parking spot detection systems to overcome the problem of spending a long time searching for a parking spot. These works use a wide variety of sensors to detect the presence of a vehicle in a parking spot. Such approaches are expensive to implement and ineffective in extreme weather conditions in an outdoor parking environment. As a result, a cost-effective, dependable, and time-saving parking solution is much more desirable. In this thesis, we proposed and developed an image processing-based real-time parking-spot …
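
A hedged sketch of a simple image-processing occupancy check of the kind such a camera-based system might use: compare the current view of a marked parking spot with a stored empty-spot reference, and flag a large pixel difference as occupied. The threshold and geometry are illustrative assumptions:

```python
# Hedged sketch: decide spot occupancy by differencing against an empty reference.
import cv2
import numpy as np

def spot_occupied(current_frame, empty_reference, roi, threshold=25.0):
    """roi = (x, y, w, h) of one parking spot in the camera image."""
    x, y, w, h = roi
    cur = cv2.cvtColor(current_frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    ref = cv2.cvtColor(empty_reference[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(cur, ref).mean()                   # mean per-pixel change
    return diff > threshold

# Example with synthetic frames: a bright "car" appears in the spot region.
empty = np.full((120, 160, 3), 80, dtype=np.uint8)
frame = empty.copy()
frame[40:80, 60:120] = 200                                # parked car
print(spot_occupied(frame, empty, roi=(60, 40, 60, 40)))  # True
```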


Avist: A Benchmark For Visual Object Tracking In Adverse Visibility, Mubashir Noman, Wafa Al Ghallabi, Daniya Najiha, Christoph Mayer, Hisham Cholakkal, Salman Khan, Luc Van Gool, Fahad Shahbaz Khan Aug 2022


Computer Vision Faculty Publications

One of the key factors behind the recent success in visual tracking is the availability of dedicated benchmarks. While greatly benefiting tracking research, existing benchmarks no longer pose the same difficulty as before, with recent trackers achieving higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack of diverse scenarios with adverse visibility such as severe weather conditions, camouflage, and imaging effects. We introduce AVisT, a dedicated benchmark for visual tracking in diverse scenarios with adverse visibility. AVisT comprises 120 challenging sequences with 80k annotated frames, spanning 18 diverse scenarios …


3d Vision With Transformers: A Survey, Jean Lahoud, Jiale Cao, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Anwer, Salman Khan, Ming-Hsuan Yang Aug 2022


Computer Vision Faculty Publications

The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field. The transformer has been used as a replacement for the widely used convolution operators, due to its ability to learn long-range dependencies. This replacement was proven to be successful in numerous tasks, in which several state-of-the-art methods rely on transformers for better learning. In computer vision, the 3D field has also witnessed an increase in employing the transformer for 3D convolution neural networks and multi-layer perceptron networks. Although a number of surveys have focused on transformers in vision in general, 3D …


Computer Aided Diagnosis System For Breast Cancer Using Deep Learning., Asma Baccouche Aug 2022


Electronic Theses and Dissertations

The recent rise of big data technology surrounding the electronic systems and developed toolkits gave birth to new promises for Artificial Intelligence (AI). With the continuous use of data-centric systems and machines in our lives, such as social media, surveys, emails, reports, etc., there is no doubt that data has gained the center of attention by scientists and motivated them to provide more decision-making and operational support systems across multiple domains. With the recent breakthroughs in artificial intelligence, the use of machine learning and deep learning models have achieved remarkable advances in computer vision, ecommerce, cybersecurity, and healthcare. Particularly, numerous …


Directed Acyclic Graph-Based Neural Networks For Tunable Low-Power Computer Vision, Abhinav Goel, Caleb Tung, Nick Eliopoulos, Xiao Hu, George K. Thiruvathukal, James C. Davis, Yung-Hsiang Lu Aug 2022


Computer Science: Faculty Publications and Other Works

Processing visual data on mobile devices has many applications, e.g., emergency response and tracking. State-of-the-art computer vision techniques rely on large Deep Neural Networks (DNNs) that are usually too power-hungry to be deployed on resource-constrained edge devices. Many techniques improve the efficiency of DNNs by compromising accuracy. However, the accuracy and efficiency of these techniques cannot be adapted for diverse edge applications with different hardware constraints and accuracy requirements. This paper demonstrates that a recent, efficient tree-based DNN architecture, called the hierarchical DNN, can be converted into a Directed Acyclic Graph-based (DAG) architecture to provide tunable accuracy-efficiency tradeoff options. We …