Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Series

Computer Vision

Articles 1 - 27 of 27

Full-Text Articles in Physical Sciences and Mathematics

Emotion-Aware Music Recommendation, Hieu Tran, Tuan Le, Anh Do, Tram Vu, Steven Bogaerts, Brian T. Howard Sep 2023

Computer Science Faculty publications

It is common to listen to songs that match one's mood. Thus, an AI music recommendation system that is aware of the user's emotions is likely to provide a superior user experience to one that is unaware. In this paper, we present an emotion-aware music recommendation system. Multiple models are discussed and evaluated for affect identification from a live image of the user. We propose two models: DRViT, which applies dynamic routing to vision transformers, and InvNet50, which uses involution. All considered models are trained and evaluated on the AffectNet dataset. Each model outputs the user's estimated valence and arousal …


Decoding The Underlying Meaning Of Multimodal Hateful Memes, Ming Shan Hee, Wen Haw Chong, Roy Ka-Wei Lee Aug 2023

Research Collection School Of Computing and Information Systems

Recent studies have proposed models that yielded promising performance for the hateful meme classification task. Nevertheless, these proposed models do not generate interpretable explanations that uncover the underlying meaning and support the classification output. A major reason for the lack of explainable hateful meme methods is the absence of a hateful meme dataset that contains ground truth explanations for benchmarking or training. Intuitively, having such explanations can educate and assist content moderators in interpreting and removing flagged hateful memes. This paper addresses this research gap by introducing the Hateful meme with Reasons Dataset (HatReD), which is a new multimodal hateful meme …


Automatic Identification Of Jetting Behavior In 3D Printing With Binary Classification And Anomaly Detection, Alexander Chandy May 2023

Honors Scholar Theses

Consistently jetting different materials from the print head of a 3D printer is a key, yet challenging task in manufacturing processes. By using active machine learning, we can efficiently predict complex diagrams that illustrate the region of printing conditions under which “desirable jetting”, “jetting”, and “no jetting” of ink occurs for different substances. However, labeling the images of printed ink droplets that are fed to the active learning model can be time intensive. Therefore, it is ideal to use computer vision to automate the classification of this image data. This classification can be broken down into two steps. In the …


SoftSkip: Empowering Multi-Modal Dynamic Pruning For Single-Stage Referring Comprehension, Dulanga Weerakoon, Vigneshwaran Subbaraju, Tuan Tran, Archan Misra Oct 2022

Research Collection School Of Computing and Information Systems

Supporting real-time referring expression comprehension (REC) on pervasive devices is an important capability for human-AI collaborative tasks. Model pruning techniques, applied to DNN models, can enable real-time execution even on resource-constrained devices. However, existing pruning strategies are designed principally for uni-modal applications, and suffer a significant loss of accuracy when applied to REC tasks that require fusion of textual and visual inputs. We thus present a multi-modal pruning model, LGMDP, which uses language as a pivot to dynamically and judiciously select the relevant computational blocks that need to be executed. LGMDP also introduces a new SoftSkip mechanism, whereby 'skipped' visual …


Semantic-Aligned Matching For Enhanced DETR Convergence And Multi-Scale Feature Fusion, Gongjie Zhang, Zhipeng Luo, Yingchen Yu, Jiaxing Huang, Kaiwen Cui, Shijian Lu, Eric Xing Jul 2022

Machine Learning Faculty Publications

The recently proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection. However, DETR suffers from slow training convergence, which hinders its applicability to various detection tasks. We observe that DETR's slow convergence is largely attributed to the difficulty in matching object queries to relevant regions due to the unaligned semantics between object queries and encoded image features. With this observation, we design Semantic-Aligned-Matching DETR++ (SAM-DETR++) to accelerate DETR's convergence and improve detection performance. The core of SAM-DETR++ is a plug-and-play module that projects object queries and encoded image features into the same feature embedding space, where …


Amplification Of Hidden Periodic Motions In 3D Videos, Thomas Boccuto, Seraiah Kutai, Kristen Mosley, Samuel Kirk Jul 2021

Mathematics Summer Fellows

Ordinary videos capture a surprising amount of hidden, visually imperceptible information. For instance, videos of people's faces may capture color changes in the skin and artery motion from heartbeats, while videos of mechanical systems can capture subtle vibrations indicating imminent failure. Algorithms can extract and exaggerate these signals for visualization on top of the original videos. In particular, Eulerian magnification algorithms sidestep the need to track hidden motions directly and instead devise multiscale bandpass filters to amplify signals in local spatial regions. In this work, we extend these techniques beyond color videos to geometric video data captured by 3D depth …
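
The Eulerian idea in the abstract above (filter each location over time instead of tracking motion) can be illustrated on a single pixel. What follows is only a minimal sketch, not the authors' 3D method: it covers the temporal band-pass step alone, omits the multiscale spatial decomposition, and the function names, filter rates, and synthetic signal are all assumptions made for illustration.

```python
import math


def lowpass(signal, rate):
    """First-order IIR low-pass filter (exponential moving average)."""
    out = []
    state = signal[0]
    for x in signal:
        state = (1.0 - rate) * state + rate * x
        out.append(state)
    return out


def magnify(signal, alpha, rate_fast=0.4, rate_slow=0.05):
    """Amplify a temporal band of one pixel's intensity over time.

    The band-pass signal is the difference of two low-pass filters;
    it is scaled by alpha and added back to the original samples.
    """
    fast = lowpass(signal, rate_fast)
    slow = lowpass(signal, rate_slow)
    band = [f - s for f, s in zip(fast, slow)]
    return [x + alpha * b for x, b in zip(signal, band)]


def peak_to_peak(values):
    """Range of a sequence, used here to compare oscillation sizes."""
    return max(values) - min(values)


# Hypothetical pixel: constant brightness 100 plus a faint 1 Hz
# oscillation sampled at 30 frames per second.
fps = 30
pixel = [100.0 + 0.2 * math.sin(2 * math.pi * t / fps) for t in range(300)]
amplified = magnify(pixel, alpha=20.0)
```

Because the band-pass response is zero at constant brightness, the average level of the pixel is left unchanged while the faint oscillation becomes much larger; comparing `peak_to_peak(pixel)` with `peak_to_peak(amplified)` shows the effect.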


Model Uncertainty Guides Visual Object Tracking, Lijun Zhou, Antoine Ledent, Qintao Hu, Ting Liu, Jianlin Zhang, Marius Kloft Feb 2021

Research Collection School Of Computing and Information Systems

Modern object trackers largely rely on the online learning of a discriminative classifier from potentially diverse sample frames. However, noisy or insufficient amounts of samples can deteriorate the classifiers' performance and cause tracking drift. Furthermore, alterations such as occlusion and blurring can cause the target to be lost. In this paper, we make several improvements aimed at tackling uncertainty and improving robustness in object tracking. Our first and most important contribution is to propose a sampling method for the online learning of object trackers based on uncertainty adjustment: our method effectively selects representative sample frames to feed the discriminative branch …


Adversarial Reconstruction Loss For Domain Generalization, Bekkouch Imad Eddine Ibrahim, Dragos Constantin Nicolae, Adil Khan, S. M. Ahsan Kazmi, Asad Masood Khattak, Bulat Ibragimov Jan 2021

All Works

The biggest fear when deploying machine learning models to the real world is whether they can handle new data. This problem is significant especially in medicine, where models trained on rich high-quality data extracted from large hospitals do not scale to small regional hospitals. One of the clinical challenges addressed in this work is magnetic resonance image generalization for improved visualization and diagnosis of hip abnormalities such as femoroacetabular impingement and dysplasia. Domain Generalization (DG) is a field in machine learning that tries to solve the model’s dependency on the training data by leveraging many related but different data …


Pothole Detection Under Diverse Conditions Using Object Detection Model, Ibrahim Hassan Syed, Dympna O'Sullivan, Susan Mckeever Jan 2021

Datasets

One of the most important tasks in road maintenance is the detection of potholes. This process is usually done through manual visual inspection, where certified engineers assess recorded images of pavements acquired using cameras or professional road assessment vehicles. Machine learning techniques are now being applied to this problem, with models trained to automatically identify road conditions. However, approaching this real-world problem with machine learning techniques presents the classic problem of how to produce generalizable models. Images and videos may be captured in different illumination conditions, with different camera types, camera angles and resolutions. In this paper we present our …


Multi-Branch Gabor Wavelet Layers For Pedestrian Attribute Recognition, Imran N. Junejo Jan 2021

All Works

Surveillance cameras are everywhere, keeping an eye on pedestrians as they navigate through a scene. With this context, our paper addresses the problem of pedestrian attribute recognition (PAR). This problem entails recognizing attributes such as age-group, clothing style, accessories, footwear style, etc. This is a multi-label problem and challenging even for human observers. The problem has rightly attracted attention recently from the computer vision community. In this paper, we adopt trainable Gabor wavelet (TGW) layers and use them with a convolutional neural network (CNN). Whereas other researchers are using fixed Gabor filters with the CNN, the proposed layers are …


Camera Placement Meeting Restrictions Of Computer Vision, Sara Aghajanzadeh, Roopasree Naidu, Shuo-Han Chen, Caleb Tung, Abhinav Goel, Yung-Hsiang Lu, George K. Thiruvathukal Oct 2020

Computer Science: Faculty Publications and Other Works

In the blooming era of smart edge devices, surveillance cameras have been deployed in many locations. Surveillance cameras are most useful when they are spaced out to maximize coverage of an area. However, deciding where to place cameras is an NP-hard problem and researchers have proposed heuristic solutions. Existing work does not consider a significant restriction of computer vision: in order to track a moving object, the object must occupy enough pixels. The number of pixels depends on many factors (how far away is the object? What is the camera resolution? What is the focal length?). …
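
The factors named in the abstract above (distance, camera resolution, focal length) can be combined under a simple pinhole-camera model to estimate how many pixels an object covers. This is a minimal sketch under that assumption, not the authors' formulation; the function name, the sensor dimensions, and the example numbers are hypothetical.

```python
def pixels_on_target(object_size_m, distance_m, focal_length_mm,
                     sensor_size_mm, resolution_px):
    """Approximate pixel extent of an object along one image axis.

    Assumes an ideal pinhole camera with the object facing the camera;
    lens distortion and perspective foreshortening are ignored.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    # Size of the object's image on the sensor (similar triangles).
    image_size_mm = focal_length_mm * object_size_m / distance_m
    # Convert millimetres on the sensor to pixels.
    pixels_per_mm = resolution_px / sensor_size_mm
    return image_size_mm * pixels_per_mm


# Hypothetical example: a 1.7 m tall person seen through a 4 mm lens
# on a sensor that is 3.6 mm tall and has 1080 pixel rows.
near = pixels_on_target(1.7, 10.0, 4.0, 3.6, 1080)
far = pixels_on_target(1.7, 40.0, 4.0, 3.6, 1080)
```

With these assumed numbers the person spans roughly 204 pixel rows at 10 m and roughly 51 at 40 m, which shows how quickly the pixel budget shrinks with distance.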


A New Ectotherm 3D Tracking And Behavior Analytics System Using A Depth-Based Approach With Color Validation, With Preliminary Data On Kihansi Spray Toad (Nectophrynoides Asperginis) Activity, Philip Bal, Damian Lyons, Avishai Shuter Mar 2020

Faculty Publications

The Kihansi spray toad (Nectophrynoides asperginis), classified as Extinct in the Wild by the IUCN, is being bred at the Wildlife Conservation Society’s (WCS) Bronx Zoo as part of an effort to successfully reintroduce the species into the wild. Thousands of toads live at the Bronx Zoo, presenting an opportunity to learn more about their behaviors for the first time, at scale. It is impractical to perform manual observations for long periods of time. This paper reports on the development of an RGB-D tracking and analytics approach that allows researchers to accurately and efficiently gather information about the toads’ behavior. …


A Multi-Branch Separable Convolution Neural Network For Pedestrian Attribute Recognition, Imran N. Junejo, Naveed Ahmed Mar 2020

All Works

© 2020 The Authors Computer science; Computer Vision; Image processing; Deep learning; Pedestrian attribute recognition


Development Of An Autonomous Aerial Toolset For Agricultural Applications, Terrance Life Oct 2019

Mahurin Honors College Capstone Experience/Thesis Projects

According to the United Nations, the world population is expected to grow from its current 7 billion to 9.7 billion by the year 2050. During this time, global food demand is also expected to increase by between 59% and 98% due to the population increase, accompanied by an increasing demand for protein due to a rising standard of living throughout developing countries. [1] Meeting this increase in required food production using present agricultural practices would necessitate a similar increase in farmland; a resource which does not exist in abundance. Therefore, in order to meet growing food demands, new methods will …


Cloud Resource Optimization For Processing Multiple Streams Of Visual Data, Zohar Kapach, Andrew Ulmer, Daniel Merrick, Arshad Alikhan, Yung-Hsiang Lu, Anup Mohan, Ahmed S. Kaseb, George K. Thiruvathukal Jan 2019

Computer Science: Faculty Publications and Other Works

Hundreds of millions of network cameras have been installed throughout the world. Each is capable of providing a vast amount of real-time data. Analyzing the massive data generated by these cameras requires significant computational resources and the demands may vary over time. Cloud computing shows the most promise to provide the needed resources on demand. In this article, we investigate how to allocate cloud resources when analyzing real-time data streams from network cameras. A resource manager considers many factors that affect its decisions, including the types of analysis, the number of data streams, and the locations of the cameras. The …


Generating Diverse And Meaningful Captions: Unsupervised Specificity Optimization For Image Captioning, Annika Lindh, Robert J. Ross, Abhijit Mahalunkar, Giancarlo Salton, John D. Kelleher Jan 2018

Conference papers

Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates more diverse and specific captions through an unsupervised training approach that incorporates a learning signal from an Image Retrieval model. We summarize previous results and improve the state-of-the-art on caption diversity and novelty.

We make our …


An Approach To Robust Homing With Stereovision, Fuqiang Fu, Damian Lyons Apr 2017

Faculty Publications

Visual Homing is a bioinspired approach to robot navigation which can be fast and uses few assumptions. However, visual homing in a cluttered and unstructured outdoor environment offers several challenges to homing methods that have been developed for primarily indoor environments. One issue is that any current image during homing may be tilted with respect to the home image. The second is that moving through a cluttered scene during homing may cause obstacles to interfere between the home scene and location and the current scene and location. In this paper, we introduce a robust method to improve a previously developed …


Investigating The Impact Of Unsupervised Feature-Extraction From Multi-Wavelength Image Data For Photometric Classification Of Stars, Galaxies And QSOs, Annika Lindh Dec 2016

Conference papers

Accurate classification of astronomical objects currently relies on spectroscopic data. Acquiring this data is time-consuming and expensive compared to photometric data. Hence, improving the accuracy of photometric classification could lead to far better coverage and faster classification pipelines. This paper investigates the benefit of using unsupervised feature-extraction from multi-wavelength image data for photometric classification of stars, galaxies and QSOs. An unsupervised Deep Belief Network is used, giving the model a higher level of interpretability thanks to its generative nature and layer-wise training. A Random Forest classifier is used to measure the contribution of the novel features compared to a set …


Investigating The Impact Of Unsupervised Feature-Extraction From Multi-Wavelength Image Data For Photometric Classification Of Stars, Galaxies And QSOs, Annika Lindh Sep 2016

Dissertations

This thesis reviews the current state of photometric classification in Astronomy and identifies two main gaps: a dependence on handcrafted rules, and a lack of interpretability in the more successful classifiers. To address this, Deep Learning and Computer Vision were used to create a more interpretable model, using unsupervised training to reduce human bias.

The main contribution is the investigation into the impact of using unsupervised feature-extraction from multi-wavelength image data for the classification task. The feature-extraction is achieved by implementing an unsupervised Deep Belief Network to extract lower-dimensionality features from the multi-wavelength image data captured by the Sloan Digital …


Pedestrian Detection Using Basic Polyline: A Geometric Framework For Pedestrian Detection, Liang Gongbo Apr 2016

Masters Theses & Specialist Projects

Pedestrian detection has been an active research area for computer vision in recent years. It has many applications that could improve our lives, such as video surveillance security, auto-driving assistance systems, etc. The approaches to pedestrian detection can be roughly divided into two categories: shape-based approaches and appearance-based approaches. In the literature, most approaches are appearance-based. Shape-based approaches are usually integrated with an appearance-based approach to speed up the detection process.

In this thesis, I propose a shape-based pedestrian detection framework using the geometric features of humans to detect pedestrians. This framework includes three main steps. Given a static …


Real-Time Supervised Detection Of Pink Areas In Dermoscopic Images Of Melanoma: Importance Of Color Shades, Texture And Location, Ravneet Kaur, P. P. Albano, Justin G. Cole, Jason R. Hagerty, Robert W. Leander, Randy Hays Moss, William V. Stoecker Nov 2015

Electrical and Computer Engineering Faculty Research & Creative Works

Background/Purpose: Early detection of malignant melanoma is an important public health challenge. In the USA, dermatologists are seeing more melanomas at an early stage, before classic melanoma features have become apparent. Pink color is a feature of these early melanomas. If rapid and accurate automatic detection of pink color in these melanomas could be accomplished, there could be significant public health benefits.

Methods: Detection of three shades of pink (light pink, dark pink, and orange pink) was accomplished using color analysis techniques in five color planes (red, green, blue, hue, and saturation). Color shade analysis was performed using a logistic …
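
The five color planes listed in the Methods paragraph above (red, green, blue, hue, saturation) can be derived from a single RGB pixel. The following is a minimal sketch only: it assumes 8-bit RGB input and the HSV color model from Python's standard library, since the excerpt does not say which hue and saturation definition the authors used, and the pixel values are hypothetical.

```python
import colorsys


def color_planes(r, g, b):
    """Return red, green, blue, hue and saturation for one 8-bit RGB pixel.

    Hue and saturation come from the HSV model in the standard library;
    the excerpt does not state which colour model the authors used.
    """
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return {
        "red": r,
        "green": g,
        "blue": b,
        "hue_deg": h * 360.0,
        "saturation": s,
    }


# Hypothetical pixel values, not taken from the paper.
light_pink = color_planes(255, 182, 193)
dark_pink = color_planes(231, 84, 128)
```

In this sketch the two example shades have similar hue but differ clearly in saturation, which is the kind of separation a per-plane analysis can exploit.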


Automated Extraction Of Structures From Sketches Of Biological Specimens, Jamie J. Schirf May 2010

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The goal of this study was to develop automated techniques to extract biological structures from sketches of biological specimens. This will form the basis for a searchable database of information about the specimens. Having such a database enables researchers to efficiently search for specimens with particular qualities or identify unknown specimens.

After some preprocessing of the images, the important internal organs of the specimen are extracted using image analysis techniques. The shape, size, and organization of the organs are used to categorize and then to reorganize them in the image. Results using a large database of sketches of trematodes, in …


Autonomous Geometric Precision Error Estimation In Low-Level Computer Vision Tasks, Andrés Corrada-Emmanuel, Howard Schultz Jul 2008

Computer Science Department Faculty Publication Series

Errors in map-making tasks using computer vision are sparse. We demonstrate this by considering the construction of digital elevation models that employ stereo matching algorithms to triangulate real-world points. This sparsity, coupled with a geometric theory of errors recently developed by the authors, allows for autonomous agents to calculate their own precision independently of ground truth. We connect these developments with recent advances in the mathematics of sparse signal reconstruction or compressed sensing. The theory presented here extends the autonomy of 3-D model reconstructions discovered in the 1990s to their errors.


Autonomous Estimates Of Horizontal Decorrelation Lengths For Digital Elevation Models, Andres Corrada-Emmanuel, Howard Schultz Jan 2008

Computer Science Department Faculty Publication Series

The precision errors in a collection of digital elevation models (DEMs) can be estimated in the presence of large but sparse correlations even when no ground truth is known. We demonstrate this by considering the problem of how to estimate the horizontal decorrelation length of DEMs produced by an automatic photogrammetric process that relies on the epipolar constraint equations. The procedure is based on a set of autonomous elevation difference equations recently proposed by us. In this paper we show that these equations can only estimate the precision errors of DEMs. The accuracy errors are unknowable since there is no …


Melanoma And Seborrheic Keratosis Differentiation Using Texture Features, Srinivas V. Deshabhoina, Scott E. Umbaugh, William V. Stoecker, Randy Hays Moss, Subhashini K. Srinivasan Nov 2003

Chemistry Faculty Research & Creative Works

Purpose: To explore texture features in two-dimensional images to differentiate seborrheic keratosis from melanoma.

Methods: A systematic approach to consistent classification of skin tumors is described. Texture features, based on the second-order histogram, were used to identify the features or a combination of features that could consistently differentiate a malignant skin tumor (melanoma) from a benign one (seborrheic keratosis). Two hundred and seventy-one skin tumor images were separated into training and test sets for accuracy and consistency. Automatic induction was applied to generate classification rules. Data analysis and modeling tools were used to gain further insight into the feature space. …
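
The second-order histogram mentioned in the Methods paragraph above is commonly computed as a gray-level co-occurrence matrix: a table of how often pairs of gray levels occur at a fixed pixel offset. The sketch below assumes that interpretation; the choice of energy and contrast as descriptors, the function names, and the toy patches are illustrative assumptions, since the excerpt does not list the features the authors used.

```python
def cooccurrence(image, levels, dx=1, dy=0):
    """Normalised second-order histogram for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Two common descriptors computed from a co-occurrence matrix."""
    n = len(p)
    energy = sum(p[i][j] ** 2 for i in range(n) for j in range(n))
    contrast = sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))
    return {"energy": energy, "contrast": contrast}


# Hypothetical 4x4 patches with two grey levels.
smooth = [[0, 0, 0, 0] for _ in range(4)]
stripes = [[0, 1, 0, 1] for _ in range(4)]
smooth_features = texture_features(cooccurrence(smooth, levels=2))
stripe_features = texture_features(cooccurrence(stripes, levels=2))
```

The uniform patch yields zero contrast and maximal energy, while the striped patch yields higher contrast and lower energy, so even these two descriptors separate the toy textures.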


Automated Alignment By Hybrid Video And 3-D Video Moiré With Both Conventional And Parallel Processing, Barbara Garita, Hector Gutierrez, Ildiko Laszlo, Joel H. Blatt Jan 2001

Aerospace, Physics, and Space Science Faculty Publications

A problem common to automated assembly in manufacturing or in automated docking of spacecraft is angular and lateral alignment of components. A hybrid video system utilizing both conventional imaging and 3-D video moiré has been developed to automatically align a test target with three translational and two rotational degrees of freedom. Alignment was demonstrated via computer controlled translation and rotation stages. The video moiré system is operated in an error map mode, in which a structurally illuminated reference surface is used to chromakey the image of an identical structurally illuminated alignment target. The output is a moiré image generated by …


Automatic PCB Inspection Systems, M. Moganti, Fikret Erçal Jan 1995

Computer Science Faculty Research & Creative Works

There are more than 50 process steps required to fabricate a printed circuit board (PCB). To ensure quality, human operators simply inspect the work visually against prescribed standards. The decisions made by this labor intensive, and therefore costly, procedure often also involve subjective judgements. Automatic inspection systems remove the subjective aspects and provide fast, quantitative dimensional assessments. Machine vision may answer the manufacturing industry's need to improve product quality and increase productivity. The major limitation of existing inspection systems is that all the algorithms need a special hardware platform to achieve the desired real-time speeds. This makes the systems extremely …