Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

A Survey On Few-Shot Class-Incremental Learning, Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Xin Ning, Prayag Tiwari Jan 2024

Computer Science Faculty Publications

Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily lead to catastrophic forgetting and overfitting problems, severely affecting model performance. Studying FSCIL helps overcome deep learning model limitations on data volume and acquisition time, while improving practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental …


Object Recognition With Deep Neural Networks In Low-End Systems, Lillian Davis Oct 2023

Mahurin Honors College Capstone Experience/Thesis Projects

Object recognition is an important area in computer vision. Object recognition has been advanced significantly by deep learning that unifies feature extraction and classification. In general, deep neural networks, such as Convolutional Neural Networks (CNNs), are trained in high-performance systems. Aiming to extend the reach of deep learning to personal computing, I propose a study of deep learning-based object recognition in low-end systems, such as laptops. This research includes how differing layer configurations and hyperparameter values used in CNNs can either create or resolve the issue of overfitting and affect final accuracy levels of object recognition systems. The main contribution …


CMR3D: Contextualized Multi-Stage Refinement For 3D Object Detection, Dhanalaxmi Gaddam, Jean Lahoud, Fahad Shahbaz Khan, Rao Anwer, Hisham Cholakkal Sep 2022

Computer Vision Faculty Publications

Existing deep learning-based 3D object detectors typically rely on the appearance of individual objects and do not explicitly pay attention to the rich contextual information of the scene. In this work, we propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework, which takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene at multiple levels to predict a set of object bounding-boxes along with their corresponding semantic labels. To this end, we propose to utilize a context enhancement network that captures the contextual information at different levels of granularity followed by a …


Pseudo-Stereo For Monocular 3D Object Detection In Autonomous Driving, Yi-Nan Chen, Hang Dai, Yong Ding Mar 2022

Computer Vision Faculty Publications

Pseudo-LiDAR 3D detectors have made remarkable progress in monocular 3D detection by enhancing the capability of perceiving depth with depth estimation networks, and using LiDAR-based 3D detection architectures. The advanced stereo 3D detectors can also accurately localize 3D objects. The gap in image-to-image generation for stereo views is much smaller than that in image-to-LiDAR generation. Motivated by this, we propose a Pseudo-Stereo 3D detection framework with three novel virtual view generation methods, including image-level generation, feature-level generation, and feature-clone, for detecting 3D objects from a single image. Our analysis of depth-aware learning shows that the depth loss is effective in …


OW-DETR: Open-World Detection Transformer, Akshita Gupta, Sanath Narayan, K.J. Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah Dec 2021

Computer Vision Faculty Publications

Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on potentially unknown objects, separating the unknown objects from the background, and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components, namely attention-driven pseudo-labeling, novelty classification …


Multi-Modal Transformers Excel At Class-Agnostic Object Detection, Muhammad Maaz, Hanoona Bangalath Rasheed, Salman Hameed Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Ming-Hsuan Yang Nov 2021

Computer Vision Faculty Publications

What constitutes an object? This has been a longstanding question in computer vision. Towards this goal, numerous learning-free and learning-based approaches have been developed to score objectness. However, they generally do not scale well across new domains and for unseen objects. In this paper, we advocate that existing methods lack a top-down supervision signal governed by human-understandable semantics. To bridge this gap, we explore recent Multi-modal Vision Transformers (MViT) that have been trained with aligned image-text pairs. Our extensive experiments across various domains and novel objects show the state-of-the-art performance of MViTs to localize generic objects in images. Based on …


Improving Visual Recognition With Unlabeled Data, Aruni Roy Chowdhury Jul 2020

Doctoral Dissertations

The success of deep neural networks has resulted in computer vision systems that obtain high accuracy on a wide variety of tasks such as image classification, object detection, semantic segmentation, etc. However, most state-of-the-art vision systems are dependent upon large amounts of labeled training data, which is not a scalable solution in the long run. This work focuses on improving existing models for visual object recognition and detection without being dependent on such large-scale human-annotated data. We first show how large numbers of hard examples (cases where an existing model makes a mistake) can be obtained automatically from unlabeled video …


Computational Modeling Of Trust Factors Using Reinforcement Learning, C. M. Kuzio, A. Dinh, C. Stone, L. Vidyaratne, K. M. Iftekharuddin Jan 2019

Electrical & Computer Engineering Faculty Publications

As machine-learning algorithms continue to expand their scope and approach more ambiguous goals, they may be required to make decisions based on data that is often incomplete, imprecise, and uncertain. The capabilities of these models must, in turn, evolve to meet the increasingly complex challenges associated with the deployment and integration of intelligent systems into modern society. Historical variability in the performance of traditional machine-learning models in dynamic environments leads to ambiguity of trust in decisions made by such algorithms. Consequently, the objective of this work is to develop a novel computational model that effectively quantifies the reliability of autonomous …


Large-Scale Discovery Of Visual Features For Object Recognition, Drew Linsley, Sven Eberhardt, Dan Shiebler, Thomas Serre May 2017

MODVIS Workshop

A central goal in vision science is to identify features that are important for object and scene recognition. Reverse correlation methods have been used to uncover features important for recognizing faces and other stimuli with low intra-class variability. However, these methods are less successful when applied to natural scenes with variability in their appearance.

To rectify this, we developed Clicktionary, a web-based game for identifying features for recognizing real-world objects. Pairs of participants play together in different roles to identify objects: A “teacher” reveals image regions diagnostic of the object’s category while a “student” tries to recognize the object. Aggregating …


An Exercise And Sports Equipment Recognition System, Siddarth Kalra May 2016

Electronic Thesis and Dissertation Repository

Most mobile health management applications today require manual input or use sensors like the accelerometer or GPS to record user data. The onboard camera remains underused. We propose an Exercise and Sports Equipment Recognition System (ESRS) that can recognize physical activity equipment from raw image data. This system can be integrated with mobile phones to allow the camera to become a primary input device for recording physical activity. We employ a deep convolutional neural network to train models capable of recognizing 14 different equipment categories. Furthermore, we propose a preprocessing scheme that uses color normalization and denoising techniques to improve …


Object Recognition And Visual Search With A Physiologically Grounded Model Of Visual Attention, Frederik Beuth, Fred H. Hamker May 2015

MODVIS Workshop

Visual attention models can explain a rich set of physiological data (Reynolds & Heeger, 2009, Neuron), but can rarely link these findings to real-world tasks. Here, we would like to narrow this gap with a novel, physiologically grounded model of visual attention by demonstrating its object recognition abilities in noisy scenes.

To base the model on physiological data, we used a recently developed microcircuit model of visual attention (Beuth & Hamker, in revision, Vision Res) which explains a large set of attention experiments, e.g. biased competition, modulation of contrast response functions, tuning curves, and surround suppression. Objects are represented by …


Object Detection And Classification With Applications To Skin Cancer Screening, Jonathan Blackledge, Dmitryi Dubovitskiy Jan 2008

Articles

This paper discusses a new approach to the processes of object detection, recognition and classification in a digital image. The classification method is based on the application of a set of features which include fractal parameters such as the Lacunarity and Fractal Dimension. Thus, the approach incorporates the characterisation of an object in terms of its texture.

The principal issues associated with object recognition are presented, including two novel fast segmentation algorithms for which C++ code is provided. The self-learning procedure for designing a decision making engine using fuzzy logic and membership function theory is also presented and …