Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 18 of 18

Full-Text Articles in Physical Sciences and Mathematics

Learning Hierarchical Metrical Structure Beyond Measures, Junyan Jiang, Daniel Chin, Yixiao Zhang, Gus Xia Sep 2022

Machine Learning Faculty Publications

Music contains hierarchical structures beyond beats and measures. While hierarchical structure annotations are helpful for music information retrieval and computer musicology, such annotations are scarce in current digital music databases. In this paper, we explore a data-driven approach to automatically extract hierarchical metrical structures from scores. We propose a new model with a Temporal Convolutional Network-Conditional Random Field (TCN-CRF) architecture. Given a symbolic music score, our model takes in an arbitrary number of voices in a beat-quantized form, and predicts a 4-level hierarchical metrical structure from downbeat-level to section-level. We also annotate a dataset using RWC-POP MIDI files to facilitate …


Self-Distilled Vision Transformer For Domain Generalization, Maryam Sultana, Muzammal Naseer, Muhammad Haris Khan, Salman Khan, Fahad Shahbaz Khan Jul 2022

Computer Vision Faculty Publications

In the recent past, several domain generalization (DG) methods have been proposed, showing encouraging performance; however, almost all of them build on convolutional neural networks (CNNs). There is little to no progress on studying the DG performance of vision transformers (ViTs), which are challenging the supremacy of CNNs on standard benchmarks, often built on the i.i.d. assumption. This renders the real-world deployment of ViTs doubtful. In this paper, we attempt to explore ViTs towards addressing the DG problem. Similar to CNNs, ViTs also struggle in out-of-distribution scenarios and the main culprit is overfitting to source domains. Inspired by the modular architecture of …


Learning To Generalize Dispatching Rules On The Job Shop Scheduling, Zangir Iklassov, Dmitrii Medvedev, Ruben Solozabal, Martin Takac Jun 2022

Machine Learning Faculty Publications

This paper introduces a Reinforcement Learning approach to better generalize heuristic dispatching rules on the Job-shop Scheduling Problem (JSP). Current models on the JSP do not focus on generalization, although, as we show in this work, this is key to learning better heuristics on the problem. A well-known technique to improve generalization is to learn on increasingly complex instances using Curriculum Learning (CL). However, as many works in the literature indicate, this technique might suffer from catastrophic forgetting when transferring the learned skills between different problem sizes. To address this issue, we introduce a novel Adversarial Curriculum Learning (ACL) strategy, …


Learning To Control Under Time-Varying Environment, Yuzhen Han, Ruben Solozabal, Jing Dong, Xingyu Zhou, Martin Takac, Bin Gu Jun 2022

Machine Learning Faculty Publications

This paper investigates the problem of regret minimization in linear time-varying (LTV) dynamical systems. Due to the simultaneous presence of uncertainty and non-stationarity, designing online control algorithms for unknown LTV systems remains a challenging task. At the cost of NP-hard offline planning, prior works have introduced online convex optimization algorithms, although they suffer from a nonparametric rate of regret. In this paper, we propose the first computationally tractable online algorithm with regret guarantees that avoids offline planning over the state linear feedback policies. Our algorithm is based on the optimism in the face of uncertainty (OFU) principle in which we optimistically …


Offline Reinforcement Learning With Causal Structured World Models, Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, Yang Yu Jun 2022

Machine Learning Faculty Publications

Model-based methods have recently shown promise for offline reinforcement learning (RL), aiming to learn good policies from historical data without interacting with the environment. Previous model-based offline RL methods learn fully connected nets as world-models to map the states and actions to the next-step states. However, it is sensible that a world-model should adhere to the underlying causal effect such that it will support learning an effective policy generalizing well in unseen states. In this paper, we first provide theoretical results that causal world-models can outperform plain world-models for offline RL by incorporating the causal structure into the generalization error …


Visual Attention Methods In Deep Learning: An In-Depth Survey, Mohammed Hassanin, Anwar Saeed, Ibrahim Radwan, Fahad Shahbaz Khan, Ajmal Mian Apr 2022

Computer Vision Faculty Publications

Inspired by the human cognitive system, attention is a mechanism that imitates the human cognitive awareness about specific information, amplifying critical details to focus more on the essential aspects of data. Deep learning has employed attention to boost performance for many applications. Interestingly, the same attention design can suit processing different data modalities and can easily be incorporated into large networks. Furthermore, multiple complementary attention mechanisms can be incorporated in one network. Hence, attention techniques have become extremely attractive. However, the literature lacks a comprehensive survey specific to attention techniques to guide researchers in employing attention in their deep models. …


Mucot: Multilingual Contrastive Training For Question-Answering In Low-Resource Languages, Gokul Karthik Kumar, Abhishek Singh Gehlot, Sahal Shaji Mullappilly, Karthik Nandakumar Apr 2022

Computer Vision Faculty Publications

Accuracy of English-language Question Answering (QA) systems has improved significantly in recent years with the advent of Transformer-based models (e.g., BERT). These models are pre-trained in a self-supervised fashion with a large English text corpus and further fine-tuned with a massive English QA dataset (e.g., SQuAD). However, QA datasets on such a scale are not available for most of the other languages. Multi-lingual BERT-based models (mBERT) are often used to transfer knowledge from high-resource languages to low-resource languages. Since these models are pre-trained with huge text corpora containing multiple languages, they typically learn language-agnostic embeddings for tokens from different languages. …


Multimodal Multi-Head Convolutional Attention With Various Kernel Sizes For Medical Image Super-Resolution, Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Andreea-Iuliana Miron, Olivian Savencu, Nicolae Verga, Nicolae-Cătălin Ristea, Fahad Shahbaz Khan Apr 2022

Computer Vision Faculty Publications

Super-resolving medical images can help physicians in providing more accurate diagnostics. In many situations, computed tomography (CT) or magnetic resonance imaging (MRI) techniques output several scans (modes) during a single investigation, which can jointly be used (in a multimodal fashion) to further boost the quality of super-resolution results. To this end, we propose a novel multimodal multi-head convolutional attention module to super-resolve CT and MRI scans. Our attention module uses the convolution operation to perform joint spatial-channel attention on multiple concatenated input tensors, where the kernel (receptive field) size controls the reduction rate of the spatial attention and the number …


Energy-Based Latent Aligner For Incremental Learning, K.J. Joseph, Salman Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Vineeth N. Balasubramanian Mar 2022

Computer Vision Faculty Publications

Deep learning models tend to forget their earlier knowledge while incrementally learning new tasks. This behavior emerges because the parameter updates optimized for the new tasks may not align well with the updates suitable for older tasks. The resulting latent representation mismatch causes forgetting. In this work, we propose ELI: Energy-based Latent Aligner for Incremental Learning, which first learns an energy manifold for the latent representations such that previous task latents will have low energy and the current task latents have high energy values. This learned manifold is used to counter the representational shift that happens during incremental learning. The …


Robustness Analysis Of Classification Using Recurrent Neural Networks With Perturbed Sequential Input, Guangyi Liu, Arash Amini, Martin Takac, Nader Motee Mar 2022

Machine Learning Faculty Publications

For a given stable recurrent neural network (RNN) that is trained to perform a classification task using sequential inputs, we quantify explicit robustness bounds as a function of trainable weight matrices. The sequential inputs can be perturbed in various ways, e.g., streaming images can be deformed due to robot motion or an imperfect camera lens. Using the notion of the Voronoi diagram and Lipschitz properties of stable RNNs, we provide a thorough analysis and characterize the maximum allowable perturbations while guaranteeing the full accuracy of the classification task. We illustrate and validate our theoretical results using a map dataset with clouds …


An Ensemble Approach For Patient Prognosis Of Head And Neck Tumor Using Multimodal Data, Numan Saeed, Roba Al Majzoub, Ikboljon Sobirov, Mohammad Yaqub Feb 2022

Computer Vision Faculty Publications

Accurate prognosis of a tumor can help doctors provide a proper course of treatment and, therefore, save the lives of many. Traditional machine learning algorithms have been eminently useful in crafting prognostic models in the last few decades. Recently, deep learning algorithms have shown significant improvement when developing diagnosis and prognosis solutions to different healthcare problems. However, most of these solutions rely solely on either imaging or clinical data. Utilizing patient tabular data such as demographics and patient medical history alongside imaging data in a multimodal approach to solve a prognosis task has started to gain more interest recently and …


Subomiembed: Self-Supervised Representation Learning Of Multi-Omics Data For Cancer Type Classification, Sayed Hashim, Muhammad Ali, Karthik Nandakumar, Mohammad Yaqub Feb 2022

Computer Vision Faculty Publications

For personalized medicine, crucial intrinsic information is present in high-dimensional omics data, which is difficult to capture due to the large number of molecular features and small number of available samples. Different types of omics data show various aspects of samples. Integration and analysis of multi-omics data give us a broad view of tumours, which can improve clinical decision making. Omics data, mainly DNA methylation and gene expression profiles, are usually high-dimensional data with a lot of molecular features. In recent years, variational autoencoders (VAE) [13] have been extensively used in embedding image and text data into …


Hyperparameter Optimization For Covid-19 Chest X-Ray Classification, Ibraheem Hamdi, Muhammad Ridzuan, Mohammad Yaqub Jan 2022

Computer Vision Faculty Publications

Despite the introduction of vaccines, Coronavirus disease (COVID-19) remains a worldwide dilemma, continuously developing new variants such as Delta and the recent Omicron. The current standard for testing is through polymerase chain reaction (PCR). However, PCR tests can be expensive, slow, and/or inaccessible to many people. X-rays, on the other hand, have been readily used since the early 20th century and are relatively cheap, quick to obtain, and typically covered by health insurance. With a careful selection of model, hyperparameters, and augmentations, we show that it is possible to develop models with 83% accuracy in binary classification and 64% in multi-class …


Optimal Transport For Causal Discovery, Ruibo Tu, Kun Zhang, Hedvig Kjellström, Cheng Zhang Jan 2022

Machine Learning Faculty Publications

To determine causal relationships between two variables, approaches based on Functional Causal Models (FCMs) have been proposed by properly restricting model classes; however, the performance is sensitive to the model assumptions, which makes it difficult to use. In this paper, we provide a novel dynamical-system view of FCMs and propose a new framework for identifying causal direction in the bivariate case. We first show the connection between FCMs and optimal transport, and then study optimal transport under the constraints of FCMs. Furthermore, by exploiting the dynamical interpretation of optimal transport under the FCM constraints, we determine the corresponding underlying dynamical …


Transformers In Vision: A Survey, Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, Mubarak Shah Jan 2022

Computer Vision Faculty Publications

Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences, as compared to recurrent networks, e.g., long short-term memory (LSTM). Different from convolutional networks, Transformers require minimal inductive biases for their design and are naturally suited as set-functions. Furthermore, the straightforward design of Transformers allows processing multiple modalities (e.g., images, videos, text and speech) using similar processing blocks and demonstrates excellent scalability to very large capacity networks and huge …


Automatic Segmentation Of Head And Neck Tumor: How Powerful Transformers Are?, Ikboljon Sobirov, Otabek Nazarov, Hussain Alasmawi, Mohammad Yaqub Jan 2022

Computer Vision Faculty Publications

Cancer is one of the leading causes of death worldwide, and head and neck (H&N) cancer is amongst the most prevalent types. Positron emission tomography and computed tomography are used to detect and segment the tumor region. Clinically, tumor segmentation is extensively time-consuming and prone to error. Machine learning, and deep learning in particular, can help automate this process, yielding results as accurate as the results of a clinician. In this research study, we develop a vision transformer-based method to automatically delineate H&N tumors, and compare its results to leading convolutional neural network (CNN)-based models. We use multi-modal data …


Is Contrastive Learning Suitable For Left Ventricular Segmentation In Echocardiographic Images?, Mohamed Saeed, Rand Muhtaseb, Mohammad Yaqub Jan 2022

Computer Vision Faculty Publications

Contrastive learning has proven useful in many applications where access to labelled data is limited. The lack of annotated data is particularly problematic in medical image segmentation as it is difficult to have clinical experts manually annotate large volumes of data. One such task is the segmentation of cardiac structures in ultrasound images of the heart. In this paper, we examine whether or not contrastive pretraining is helpful for the segmentation of the left ventricle in echocardiography images. Furthermore, we study the effect of this on two segmentation networks, DeepLabV3, as well as the commonly used segmentation network, UNet. Our …


Challenges In Covid-19 Chest X-Ray Classification: Problematic Data Or Ineffective Approaches?, Muhammad Ridzuan, Ameera Ali Bawazir, Ivo Gollini Navarrete, Ibrahim Almakky, Mohammad Yaqub Jan 2022

Computer Vision Faculty Publications

The value of quick, accurate, and confident diagnoses cannot be undermined to mitigate the effects of COVID-19 infection, particularly for severe cases. Enormous effort has been put towards developing deep learning methods to classify and detect COVID-19 infections from chest radiography images. However, recently some questions have been raised surrounding the clinical viability and effectiveness of such methods. In this work, we carry out extensive experiments on a large COVID-19 chest X-ray dataset to investigate the challenges faced with creating reliable solutions from both the data and machine learning perspectives. Accordingly, we offer an in-depth discussion of the challenges faced …