Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 24 of 24

Full-Text Articles in Physical Sciences and Mathematics

Urban Flood Extent Segmentation And Evaluation From Real-World Surveillance Camera Images Using Deep Convolutional Neural Network, Yidi Wang, Yawen Shen, Behrouz Salahshour, Mecit Cetin, Khan Iftekharuddin, Navid Tahvildari, Guoping Huang, Devin K. Harris, Kwame Ampofo, Jonathan L. Goodall Jan 2024

Urban Flood Extent Segmentation And Evaluation From Real-World Surveillance Camera Images Using Deep Convolutional Neural Network, Yidi Wang, Yawen Shen, Behrouz Salahshour, Mecit Cetin, Khan Iftekharuddin, Navid Tahvildari, Guoping Huang, Devin K. Harris, Kwame Ampofo, Jonathan L. Goodall

Civil & Environmental Engineering Faculty Publications

This study explores the use of Deep Convolutional Neural Network (DCNN) for semantic segmentation of flood images. Imagery datasets of urban flooding were used to train two DCNN-based models, and camera images were used to test the application of the models with real-world data. Validation results show that both models extracted flood extent with a mean F1-score over 0.9. The factors that affected the performance included still water surface with specular reflection, wet road surface, and low illumination. In testing, reduced visibility during a storm and raindrops on surveillance cameras were major problems that affected the segmentation of flood extent. …


Team Thesyllogist At Semeval-2023 Task 3: Language-Agnostic Framing Detection In Multi-Lingual Online News: A Zero-Shot Transfer Approach, Osama Mohammed Afzal, Preslav Nakov Jul 2023

Team Thesyllogist At Semeval-2023 Task 3: Language-Agnostic Framing Detection In Multi-Lingual Online News: A Zero-Shot Transfer Approach, Osama Mohammed Afzal, Preslav Nakov

Natural Language Processing Faculty Publications

We describe our system for SemEval-2022 Task 3 subtask 2 which on detecting the frames used in a news article in a multi-lingual setup. We propose a multi-lingual approach based on machine translation of the input, followed by an English prediction model. Our system demonstrated good zero-shot transfer capability, achieving micro-F1 scores of 53% for Greek (4th on the leaderboard) and 56.1% for Georgian (3rd on the leaderboard), without any prior training on translated data for these languages. Moreover, our system achieved comparable performance on seven other languages, including German, English, French, Russian, Italian, Polish, and Spanish. Our results demonstrate …


Semantic-Based Neural Network Repair, Richard Schumi, Jun Sun Jul 2023

Semantic-Based Neural Network Repair, Richard Schumi, Jun Sun

Research Collection School Of Computing and Information Systems

Recently, neural networks have spread into numerous fields including many safety-critical systems. Neural networks are built (and trained) by programming in frameworks such as TensorFlow and PyTorch. Developers apply a rich set of pre-defined layers to manually program neural networks or to automatically generate them (e.g., through AutoML). Composing neural networks with different layers is error-prone due to the non-trivial constraints that must be satisfied in order to use those layers. In this work, we propose an approach to automatically repair erroneous neural networks. The challenge is in identifying a minimal modification to the network so that it becomes valid. …


Techsumbot: A Stack Overflow Answer Summarization Tool For Technical Query, Chengran Yang, Bowen Xu, Jiakun Liu, David Lo May 2023

Techsumbot: A Stack Overflow Answer Summarization Tool For Technical Query, Chengran Yang, Bowen Xu, Jiakun Liu, David Lo

Research Collection School Of Computing and Information Systems

Stack Overflow is a popular platform for developers to seek solutions to programming-related problems. However, prior studies identified that developers may suffer from the redundant, useless, and incomplete information retrieved by the Stack Overflow search engine. To help developers better utilize the Stack Overflow knowledge, researchers proposed tools to summarize answers to a Stack Overflow question. However, existing tools use hand-craft features to assess the usefulness of each answer sentence and fail to remove semantically redundant information in the result. Besides, existing tools only focus on a certain programming language and cannot retrieve up-to-date new posted knowledge from Stack Overflow. …


Intermediate Prototype Mining Transformer For Few-Shot Semantic Segmentation, Yuanwei Liu, Nian Liu, Xiwen Yao, Junwei Han Dec 2022

Intermediate Prototype Mining Transformer For Few-Shot Semantic Segmentation, Yuanwei Liu, Nian Liu, Xiwen Yao, Junwei Han

Computer Vision Faculty Publications

Few-shot semantic segmentation aims to segment the target objects in query under the condition of a few annotated support images. Most previous works strive to mine more effective category information from the support to match with the corresponding objects in query. However, they all ignored the category information gap between query and support images. If the objects in them show large intra-class diversity, forcibly migrating the category information from the support to the query is ineffective. To solve this problem, we are the first to introduce an intermediate prototype for mining both deterministic category information from the support and adaptive …


Transresnet: Integrating The Strengths Of Vits And Cnns For High Resolution Medical Image Segmentation Via Feature Grafting, Muhammad Hamza Sharif, Dmitry Demidov, Asif Hanif, Mohammad Yaqub, Min Xu Nov 2022

Transresnet: Integrating The Strengths Of Vits And Cnns For High Resolution Medical Image Segmentation Via Feature Grafting, Muhammad Hamza Sharif, Dmitry Demidov, Asif Hanif, Mohammad Yaqub, Min Xu

Computer Vision Faculty Publications

High-resolution images are preferable in medical imaging domain as they significantly improve the diagnostic capability of the underlying method. In particular, high resolution helps substantially in improving automatic image segmentation. However, most of the existing deep learning-based techniques for medical image segmentation are optimized for input images having small spatial dimensions and perform poorly on high-resolution images. To address this shortcoming, we propose a parallel-in-branch architecture called TransResNet, which incorporates Transformer and CNN in a parallel manner to extract features from multi-resolution images independently. In TransResNet, we introduce Cross Grafting Module (CGM), which generates the grafted features, enriched in both …


Face Pyramid Vision Transformer, Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood Nov 2022

Face Pyramid Vision Transformer, Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood

Computer Vision Faculty Publications

A novel Face Pyramid Vision Transformer (FPVT) is proposed to learn a discriminative multi-scale facial representations for face recognition and verification. In FPVT, Face Spatial Reduction Attention (FSRA) and Dimensionality Reduction (FDR) layers are employed to make the feature maps compact, thus reducing the computations. An Improved Patch Embedding (IPE) algorithm is proposed to exploit the benefits of CNNs in ViTs (e.g., shared weights, local context, and receptive fields) to model lower-level edges to higher-level semantic primitives. Within FPVT framework, a Convolutional Feed-Forward Network (CFFN) is proposed that extracts locality information to learn low level facial information. The proposed FPVT …


Shell Theory: A Statistical Model Of Reality, Wen-Yan Lin, Siying Liu, Changhao Ren, Ngai-Man Cheung, Hongdong Li, Yasuyuki Matsushita Oct 2022

Shell Theory: A Statistical Model Of Reality, Wen-Yan Lin, Siying Liu, Changhao Ren, Ngai-Man Cheung, Hongdong Li, Yasuyuki Matsushita

Research Collection School Of Computing and Information Systems

Machine learning's grand ambition is the mathematical modeling of reality. The recent years have seen major advances using deep-learned techniques that model reality implicitly; however, corresponding advances in explicit mathematical models have been noticeably lacking. We believe this dichotomy is rooted in the limitations of the current statistical tools, which struggle to make sense of the high dimensional generative processes that natural data seems to originate from. This paper proposes a new, distance based statistical technique which allows us to develop elegant mathematical models of such generative processes. Our model suggests that each semantic concept has an associated distinctive-shell which …


Editing Out-Of-Domain Gan Inversion Via Differential Activations, Haorui Song, Yong Du, Tianyi Xiang, Junyu Dong, Jing Qin, Shengfeng He Oct 2022

Editing Out-Of-Domain Gan Inversion Via Differential Activations, Haorui Song, Yong Du, Tianyi Xiang, Junyu Dong, Jing Qin, Shengfeng He

Research Collection School Of Computing and Information Systems

Despite the demonstrated editing capacity in the latent space of a pretrained GAN model, inverting real-world images is stuck in a dilemma that the reconstruction cannot be faithful to the original input. The main reason for this is that the distributions between training and real-world data are misaligned, and because of that, it is unstable of GAN inversion for real image editing. In this paper, we propose a novel GAN prior based editing framework to tackle the out-of-domain inversion problem with a composition-decomposition paradigm. In particular, during the phase of composition, we introduce a differential activation module for detecting semantic …


Cmr3d: Contextualized Multi-Stage Refinement For 3d Object Detection, Dhanalaxmi Gaddam, Jean Lahoud, Fahad Shahbaz Khan, Rao Anwer, Hisham Cholakkal Sep 2022

Cmr3d: Contextualized Multi-Stage Refinement For 3d Object Detection, Dhanalaxmi Gaddam, Jean Lahoud, Fahad Shahbaz Khan, Rao Anwer, Hisham Cholakkal

Computer Vision Faculty Publications

Existing deep learning-based 3D object detectors typically rely on the appearance of individual objects and do not explicitly pay attention to the rich contextual information of the scene. In this work, we propose Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework, which takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene at multiple levels to predict a set of object bounding-boxes along with their corresponding semantic labels. To this end, we propose to utilize a context enhancement network that captures the contextual information at different levels of granularity followed by a …


Dynamic Prototype Convolution Network For Few-Shot Semantic Segmentation, Jie Liu, Yanqi Bao, Guo-Sen Xie, Huan Xiong, Jan-Jakob Sonke, Efstratios Gavves Jun 2022

Dynamic Prototype Convolution Network For Few-Shot Semantic Segmentation, Jie Liu, Yanqi Bao, Guo-Sen Xie, Huan Xiong, Jan-Jakob Sonke, Efstratios Gavves

Machine Learning Faculty Publications

The key challenge for few-shot semantic segmentation (FSS) is how to tailor a desirable interaction among sup-port and query features and/or their prototypes, under the episodic training scenario. Most existing FSS methods im-plement such support/query interactions by solely leveraging plain operations - e.g., cosine similarity and feature concatenation - for segmenting the query objects. How-ever, these interaction approaches usually cannot well capture the intrinsic object details in the query images that are widely encountered in FSS, e.g., if the query object to be segmented has holes and slots, inaccurate segmentation al-most always happens. To this end, we propose a dynamic …


High-Resolution Face Swapping Via Latent Semantics Disentanglement, Yangyang Xu, Bailin Deng, Junle Wang, Yanqing Jing, Jia Pan, Shengfeng He Jun 2022

High-Resolution Face Swapping Via Latent Semantics Disentanglement, Yangyang Xu, Bailin Deng, Junle Wang, Yanqing Jing, Jia Pan, Shengfeng He

Research Collection School Of Computing and Information Systems

We present a novel high-resolution face swapping method using the inherent prior knowledge of a pre-trained GAN model. Although previous research can leverage generative priors to produce high-resolution results, their quality can suffer from the entangled semantics of the latent space. We explicitly disentangle the latent semantics by utilizing the progressive nature of the generator, deriving structure at-tributes from the shallow layers and appearance attributes from the deeper ones. Identity and pose information within the structure attributes are further separated by introducing a landmark-driven structure transfer latent direction. The disentangled latent code produces rich generative features that incorporate feature blending …


Exais: Executable Ai Semantics, Richard Schumi, Jun Sun May 2022

Exais: Executable Ai Semantics, Richard Schumi, Jun Sun

Research Collection School Of Computing and Information Systems

Neural networks can be regarded as a new programming paradigm, i.e., instead of building ever-more complex programs through (often informal) logical reasoning in the programmers' mind, complex 'AI' systems are built by optimising generic neural network models with big data. In this new paradigm, AI frameworks such as TensorFlow and PyTorch play a key role, which is as essential as the compiler for traditional programs. It is known that the lack of a proper semantics for programming languages (such as C), i.e., a correctness specification for compilers, has contributed to many problematic program behaviours and security issues. While it is …


Cocoa: Context-Conditional Adaptation For Recognizing Unseen Classes In Unseen Domains, Puneet Mangla, Shivam Chandhok, Vineeth N. Balasubramanian, Fahad Shahbaz Khan Feb 2022

Cocoa: Context-Conditional Adaptation For Recognizing Unseen Classes In Unseen Domains, Puneet Mangla, Shivam Chandhok, Vineeth N. Balasubramanian, Fahad Shahbaz Khan

Computer Vision Faculty Publications

Recent progress towards designing models that can generalize to unseen domains (i.e domain generalization) or unseen classes (i.e zero-shot learning) has embarked interest towards building models that can tackle both domain-shift and semantic shift simultaneously (i.e zero-shot domain generalization). For models to generalize to unseen classes in unseen domains, it is crucial to learn feature representation that preserves class-level (domain-invariant) as well as domain-specific information. Motivated from the success of generative zero-shot approaches, we propose a feature generative framework integrated with a COntext COnditional Adaptive (COCOA) Batch-Normalization layer to seamlessly integrate class-level semantic and domain-specific information. The generated visual features …


Transformnet: Self-Supervised Representation Learning Through Predicting Geometric Transformations, Hashim Sayed, Muhammad Ali Feb 2022

Transformnet: Self-Supervised Representation Learning Through Predicting Geometric Transformations, Hashim Sayed, Muhammad Ali

Student Publications

Deep neural networks need a big amount of training data, while in the real world there is a scarcity of data available for training purposes. To resolve this issue unsupervised methods are used for training with limited data. In this report, we describe the unsupervised semantic feature learning approach for recognition of the geometric transformation applied to the input data. The basic concept of our approach is that if someone is unaware of the objects in the images, he/she would not be able to quantitatively predict the geometric transformation that was applied to them. This self supervised scheme is based …


Cross-Modal Food Retrieval: Learning A Joint Embedding Of Food Images And Recipes With Semantic Consistency And Attention Mechanism, Hao Wang, Doyen Sahoo, Chenghao Liu, Ke Shu, Palakorn Achananuparp, Ee-Peng Lim, Steven C. H. Hoi Jan 2022

Cross-Modal Food Retrieval: Learning A Joint Embedding Of Food Images And Recipes With Semantic Consistency And Attention Mechanism, Hao Wang, Doyen Sahoo, Chenghao Liu, Ke Shu, Palakorn Achananuparp, Ee-Peng Lim, Steven C. H. Hoi

Research Collection School Of Computing and Information Systems

Food retrieval is an important task to perform analysis of food-related information, where we are interested in retrieving relevant information about the queried food item such as ingredients, cooking instructions, etc. In this paper, we investigate cross-modal retrieval between food images and cooking recipes. The goal is to learn an embedding of images and recipes in a common feature space, such that the corresponding image-recipe embeddings lie close to one another. Two major challenges in addressing this problem are 1) large intra-variance and small inter-variance across cross-modal food data; and 2) difficulties in obtaining discriminative recipe representations. To address these …


Multi-Modal Transformers Excel At Class-Agnostic Object Detection, Muhammad Maaz, Hanoona Bangalath Rasheed, Salman Hameed Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Ming-Hsuan Yang Nov 2021

Multi-Modal Transformers Excel At Class-Agnostic Object Detection, Muhammad Maaz, Hanoona Bangalath Rasheed, Salman Hameed Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Ming-Hsuan Yang

Computer Vision Faculty Publications

What constitutes an object? This has been a longstanding question in computer vision. Towards this goal, numerous learning-free and learning-based approaches have been developed to score objectness. However, they generally do not scale well across new domains and for unseen objects. In this paper, we advocate that existing methods lack a top-down supervision signal governed by human-understandable semantics. To bridge this gap, we explore recent Multi-modal Vision Transformers (MViT) that have been trained with aligned image-text pairs. Our extensive experiments across various domains and novel objects show the state-of-the-art performance of MViTs to localize generic objects in images. Based on …


Human Parsing Based Texture Transfer From Single Image To 3d Human Via Cross-View Consistency, Fang Zhao, Shengcai Liao, Kaihao Zhang, Ling Shao Dec 2020

Human Parsing Based Texture Transfer From Single Image To 3d Human Via Cross-View Consistency, Fang Zhao, Shengcai Liao, Kaihao Zhang, Ling Shao

Machine Learning Faculty Publications

This paper proposes a human parsing based texture transfer model via cross-view consistency learning to generate the texture of 3D human body from a single image. We use the semantic parsing of human body as input for providing both the shape and pose information to reduce the appearance variation of human image and preserve the spatial distribution of semantic parts. Meanwhile, in order to improve the prediction for textures of invisible parts, we explicitly enforce the consistency across different views of the same subject by exchanging the textures predicted by two views to render images during training. The perceptual loss …


Automatic Learning Of Document Section Structure For Ontology-Based Semantic Search, Deya Banisakher Jul 2020

Automatic Learning Of Document Section Structure For Ontology-Based Semantic Search, Deya Banisakher

FIU Electronic Theses and Dissertations

Modeling natural human behavior in understanding written language is crucial for developing true artificial intelligence. For people, words convey certain semantic concepts. While documents represent an abstract concept---they are collections of text organized in some logical structure, that is, sentences, paragraphs, sections, and so on. Similar to words, these document structures, are used to convey a logical flow of semantic concepts. Machines however, only view words as spans of characters and documents as mere collections of free-text, missing any underlying meanings behind words and the logical structure of those documents.

Automatic semantic concept detection is the process by which the …


Deepfacade: A Deep Learning Approach To Facade Parsing, Hantang Liu, Jialiang Zhang, Jianke Zhu, Steven C. H. Hoi Aug 2017

Deepfacade: A Deep Learning Approach To Facade Parsing, Hantang Liu, Jialiang Zhang, Jianke Zhu, Steven C. H. Hoi

Research Collection School Of Computing and Information Systems

The parsing of building facades is a key component to the problem of 3D street scenes reconstruction, which is long desired in computer vision. In this paper, we propose a deep learning based method for segmenting a facade into semantic categories. Man-made structures often present the characteristic of symmetry. Based on this observation, we propose a symmetric regularizer for training the neural network. Our proposed method can make use of both the power of deep neural networks and the structure of man-made architectures. We also propose a method to refine the segmentation results using bounding boxes generated by the Region …


Formresnet: Formatted Residual Learning For Image Restoration, Jianbo Jiao, Wei-Chih Tu, Shengfeng He Aug 2017

Formresnet: Formatted Residual Learning For Image Restoration, Jianbo Jiao, Wei-Chih Tu, Shengfeng He

Research Collection School Of Computing and Information Systems

In this paper, we propose a deep CNN to tackle the image restoration problem by learning the structured residual. Previous deep learning based methods directly learn the mapping from corrupted images to clean images, and may suffer from the gradient exploding/vanishing problems of deep neural networks. We propose to address the image restoration problem by learning the structured details and recovering the latent clean image together, from the shared information between the corrupted image and the latent image. In addition, instead of learning the pure difference (corruption), we propose to add a 'residual formatting layer' to format the residual to …


Deshadownet: A Multi-Context Embedding Deep Network For Shadow Removal, Liangqiong Qu, Jiandong Tian, Shengfeng He, Yandong Tang, Rynson W. H. Lau Jul 2017

Deshadownet: A Multi-Context Embedding Deep Network For Shadow Removal, Liangqiong Qu, Jiandong Tian, Shengfeng He, Yandong Tang, Rynson W. H. Lau

Research Collection School Of Computing and Information Systems

Shadow removal is a challenging task as it requires the detection/annotation of shadows as well as semantic understanding of the scene. In this paper, we propose an automatic and end-to-end deep neural network (DeshadowNet) to tackle these problems in a unified manner. DeshadowNet is designed with a multi-context architecture, where the output shadow matte is predicted by embedding information from three different perspectives. The first global network extracts shadow features from a global view. Two levels of features are derived from the global network and transferred to two parallel networks. While one extracts the appearance of the input image, the …


Computing The Grounded Semantics In All The Subgraphs Of An Argumentation Framework: An Empirical Evaluation, Pierpaolo Dondio Sep 2013

Computing The Grounded Semantics In All The Subgraphs Of An Argumentation Framework: An Empirical Evaluation, Pierpaolo Dondio

Articles

Given an argumentation framework – with a finite set of arguments and the attack relation identifying the graph – we study how the grounded labelling of a generic argument a varies in all the subgraphs of . Since this is an intractable problem of above-polynomial complexity, we present two non-naïve algorithms to find the set of all the subgraphs where the grounded semantic assigns to argument a specific label . We report the results of a series of empirical tests over graphs of increasing complexity. The value of researching the above problem is two-fold. First, knowing how an argument behaves …


Proceedings Of The 4th Acl-Sigsem Workshop On Prepositions At Acl-2007., Fintan Costello, John D. Kelleher, Martin Volk Jan 2007

Proceedings Of The 4th Acl-Sigsem Workshop On Prepositions At Acl-2007., Fintan Costello, John D. Kelleher, Martin Volk

Conference papers

This volume contains the papers presented at the Fourth ACL-SIGSEM Workshop on Prepositions. This workshop is endorsed by the ACL Special Interest Group on Semantics (ACL-SIGSEM), and is hosted in conjunction with ACL 2007, taking place on 28th June, 2007 in Prague, the Czech Republic.