Full-Text Articles in Artificial Intelligence and Robotics

Bare-Bones Based Salp Swarm Algorithm For Text Document Clustering, Mohammed Azmi Al-Betar, Ammar Kamal Abasi, Ghazi Al-Naymat, Kamran Arshad, Sharif Naser Makhadmeh Sep 2023

Machine Learning Faculty Publications

Text Document Clustering (TDC) is a challenging optimization problem in unsupervised machine learning and text mining. The Salp Swarm Algorithm (SSA) has been found to be effective in solving complex optimization problems. However, the SSA’s exploitation phase requires improvement to solve the TDC problem effectively. In this paper, we propose a new approach, known as the Bare-Bones Salp Swarm Algorithm (BBSSA), which leverages Gaussian search equations, inverse hyperbolic cosine control strategies, and greedy selection techniques to create new individuals and guide the population towards solving the TDC problem. We evaluated the performance of the BBSSA on six benchmark datasets from …


A Study On Feature Selection Using Multi-Domain Feature Extraction For Automated K-Complex Detection, Yabing Li, Xinglong Dong, Kun Song, Xiangyun Bai, Hongye Li, Fakhreddine Karray Sep 2023

Machine Learning Faculty Publications

Background: K-complex detection plays a significant role in the field of sleep research. However, manual annotation of electroencephalography (EEG) recordings by expert visual inspection is time-consuming and subjective. Therefore, automatic detection methods based on classical machine learning algorithms are needed. However, due to the complexity of EEG signals, current feature extraction methods often produce features with low relevance to k-complex detection, which leads to a substantial loss in detection performance. Hence, finding compact yet effective integrated feature vectors becomes a crucial task in k-complex detection. Method: In this paper, we first extract multi-domain features based …


Reinforcement Learning Approach To Stochastic Vehicle Routing Problem With Correlated Demands, Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac Aug 2023

Machine Learning Faculty Publications

We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL). Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental demonstration of the theoretical premise that non-i.i.d. stochastic demands provide opportunities for improved routing solutions. Our approach bridges the gap in the application of RL to VRPSD and consists of a parameterized stochastic policy optimized using a policy gradient algorithm to generate a sequence of actions that form the solution. Our model outperforms previous state-of-the-art metaheuristics and demonstrates robustness to changes in the …


A Multi-Layer Information Dissemination Model And Interference Optimization Strategy For Communication Networks In Disaster Areas, Yuexia Zhang, Yang Hong, Mohsen Guizani, Sheng Wu, Peiying Zhang, Ruiqi Liu Aug 2023

Machine Learning Faculty Publications

The communication network in disaster areas (CNDA) can disseminate key disaster information in time and provide basic information support for decision-making and rescue. Therefore, it is of great significance to study the information dissemination mechanism of CNDA. However, a CNDA is vulnerable to interference, which affects information dissemination and rescue. To solve this problem, this paper establishes a multi-layer information dissemination model of CNDA (MMND), which models the CNDA from the perspective of the degree distribution of nodes. The information dissemination process and equilibrium state in CNDA are analyzed by an improved dynamic dissemination method. Then, the effects of the …


Arabic Dysarthric Speech Recognition Using Adversarial And Signal-Based Augmentation, Massa Baali, Ibrahim Almakky, Shady Shehata, Fakhri Karray Aug 2023

Machine Learning Faculty Publications

Despite major advancements in Automatic Speech Recognition (ASR), state-of-the-art ASR systems struggle to deal with impaired speech, even in high-resource languages. In Arabic, this challenge is amplified, with added complexities in collecting data from dysarthric speakers. In this paper, we aim to improve the performance of Arabic dysarthric automatic speech recognition through a multi-stage augmentation approach. To this effect, we first propose a signal-based approach to generate dysarthric Arabic speech from healthy Arabic speech by modifying its speed and tempo. We also propose a second-stage Parallel Wave Generative (PWG) adversarial model that is trained on an English dysarthric …


FOOCTTS: Generating Arabic Speech With Acoustic Environment For Football Commentator, Massa Baali, Ahmed Ali Aug 2023

Machine Learning Faculty Publications

This paper presents FOOCTTS, an automatic pipeline for a football commentator that generates speech with background crowd noise. The application takes text from the user and applies text pre-processing such as vowelization, followed by the commentator's speech synthesizer. Our pipeline includes Arabic automatic speech recognition for data labeling, CTC segmentation, transcription vowelization to match speech, and fine-tuning the TTS. Our system is capable of generating speech with its acoustic environment from as little as 15 minutes of football commentator recordings. Our prototype is generalizable and can be easily applied to different domains and languages.


S2CD: Self-Heuristic Speaker Content Disentanglement For Any-To-Any Voice Conversion, Pengfei Wei, Xiang Yin, Chunfeng Wang, Zhonghao Li, Xinghua Qu, Zhiqiang Xu, Zejun Ma Aug 2023

Machine Learning Faculty Publications

In this paper, we propose a Self-heuristic Speaker Content Disentanglement (S2CD) model for any-to-any voice conversion without using any external resources, e.g., speaker labels or vectors, linguistic models, and transcriptions. S2CD is built on the disentanglement sequential variational autoencoder (DSVAE), but improves the DSVAE structure at the model architecture level from three perspectives. Specifically, we develop different structures for the speaker and content encoders based on their underlying static/dynamic properties. We further propose a generative graph, modelled by S2CD, so that S2CD closely mimics the multi-speaker speech generation process. Finally, we propose a self-heuristic way to introduce bias …


Linear Classifier: An Often-Forgotten Baseline For Text Classification, Yu Chen Lin, Si An Chen, Jie Jyun Liu, Chih Jen Lin Jul 2023

Machine Learning Faculty Publications

Large-scale pre-trained language models such as BERT are popular solutions for text classification. Due to the superior performance of these advanced methods, people nowadays often directly train them for a few epochs and deploy the obtained model. In this opinion paper, we point out that this practice yields satisfactory results only some of the time. We argue for the importance of running a simple baseline, such as linear classifiers on bag-of-words features, alongside advanced methods. First, for many text data, linear methods show competitive performance, high efficiency, and robustness. Second, advanced models such as BERT may only achieve the best results if properly …


Adversarial Alignment For Source Free Object Detection, Qiaosong Chu, Shuyan Li, Guangyi Chen, Kai Li, Xiu Li Jun 2023

Machine Learning Faculty Publications

Source-free object detection (SFOD) aims to transfer a detector pre-trained on a label-rich source domain to an unlabeled target domain without seeing source data. While most existing SFOD methods generate pseudo labels via a source-pretrained model to guide training, these pseudo labels are usually highly noisy due to heavy domain discrepancy. In order to obtain better pseudo supervision, we divide the target domain into source-similar and source-dissimilar parts and align them in the feature space by adversarial learning. Specifically, we design a detection variance-based criterion to divide the target domain. This criterion is motivated by a finding that larger detection …


Corruption-Tolerant Algorithms For Generalized Linear Models, Bhaskar Mukhoty, Debojyoti Dey, Purushottam Kar Jun 2023

Machine Learning Faculty Publications

This paper presents SVAM (Sequential Variance-Altered MLE), a unified framework for learning generalized linear models under adversarial label corruption in training data. SVAM extends to tasks such as least squares regression, logistic regression, and gamma regression, whereas many existing works on learning with label corruptions focus only on least squares regression. SVAM is based on a novel variance reduction technique that may be of independent interest and works by iteratively solving weighted MLEs over variance-altered versions of the GLM objective. SVAM offers provable model recovery guarantees superior to the state-of-the-art for robust regression even when a constant fraction of training …


Stability-Based Generalization Analysis For Mixtures Of Pointwise And Pairwise Learning, Jiahuan Wang, Jun Chen, Hong Chen, Bin Gu, Weifu Li, Xin Tang Jun 2023

Machine Learning Faculty Publications

Recently, some mixture algorithms of pointwise and pairwise learning (PPL) have been formulated by employing the hybrid error metric of “pointwise loss + pairwise loss” and have shown empirical effectiveness on feature selection, ranking and recommendation tasks. However, to the best of our knowledge, the learning theory foundation of PPL has not been touched in the existing works. In this paper, we try to fill this theoretical gap by investigating the generalization properties of PPL. After extending the definitions of algorithmic stability to the PPL setting, we establish the high-probability generalization bounds for uniformly stable PPL algorithms. Moreover, explicit convergence …


Joint Flood Risks In The Grand River Watershed, Poornima Unnikrishnan, Kumaraswamy Ponnambalam, Nirupama Agrawal, Fakhri Karray Jun 2023

Machine Learning Faculty Publications

According to the World Meteorological Organization, since 2000, there has been an increase in global flood-related disasters by 134 percent compared to the previous decades. Efficient flood risk management strategies necessitate a holistic approach to evaluating flood vulnerabilities and risks. Catastrophic losses can occur when the peak flow values in the rivers in a basin coincide. Therefore, estimating the joint flood risks in a region is vital, especially when frequent occurrences of extreme events are experienced. This study focuses on estimating the joint flood risks due to river flow extremes in the Grand River watershed in Canada. For this purpose, …


On The Accelerated Noise-Tolerant Power Method, Zhiqiang Xu Apr 2023

Machine Learning Faculty Publications

We revisit the acceleration of the noise-tolerant power method, for which, despite previous studies, the results remain unsatisfactory: they are either wrong or suboptimal, and they lack generality. In this work, we present a simple yet general and optimal analysis via noise-corrupted Chebyshev polynomials, which allows a larger iteration rank p than the target rank k, requires fewer noise conditions in a new form, and achieves the optimal iteration complexity (Equation presented) for some q satisfying k ≤ q ≤ p in a certain regime of the momentum parameter. Interestingly, it shows dynamic dependence of the noise tolerance on the …
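For orientation, the baseline that this abstract builds on can be sketched as follows. This is a minimal illustration of the plain, noiseless, rank-one power method only; it does not include the momentum (acceleration), the noise model, or the rank-p block iteration analyzed in the paper. The function name `power_method`, the fixed starting vector, and the iteration count are assumptions made for the example.

```python
import math


def power_method(matrix, num_iters=200):
    """Plain power iteration for a square matrix given as a list of rows.

    Illustrative baseline only: no noise, no momentum, rank one.
    Returns (estimated dominant eigenvalue, unit-norm eigenvector estimate).
    """
    n = len(matrix)
    # Deterministic unit-norm start (an assumption; random starts are common).
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(num_iters):
        # Multiply by the matrix: w = A v
        w = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("iterate collapsed to the zero vector")
        # Renormalize so the iterate stays on the unit sphere
        vec = [x / norm for x in w]
    # Rayleigh quotient v^T A v estimates the dominant eigenvalue
    av = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
    eigenvalue = sum(vec[i] * av[i] for i in range(n))
    return eigenvalue, vec
```

Calling `power_method([[2.0, 0.0], [0.0, 1.0]])` drives the iterate toward the first coordinate axis, since the component along the smaller eigenvalue shrinks by a factor of two at every step.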


Towards Carbon Neutrality: Prediction Of Wave Energy Based On Improved GRU In Maritime Transportation, Zhihan Lv, Nana Wang, Ranran Lou, Yajun Tian, Mohsen Guizani Feb 2023

Machine Learning Faculty Publications

Efficient use of renewable energy is one of the critical measures to achieve carbon neutrality. Countries have introduced policies to put carbon neutrality on the agenda to achieve relatively zero emissions of greenhouse gases and to cope with the crisis brought about by global warming. This work analyzes the wave energy with high energy density and wide distribution based on understanding of various renewable energy sources. This study provides a wave energy prediction model for energy harvesting. At the same time, the Gated Recurrent Unit network (GRU), Bayesian optimization algorithm, and attention mechanism are introduced to improve the model's performance. …


Channel-Resilient Deep-Learning-Driven Device Fingerprinting Through Multiple Data Streams, Nora Basha, Bechir Hamdaoui, Kathiravetpillai Sivanesan, Mohsen Guizani Jan 2023

Machine Learning Faculty Publications

Enabling accurate and automated identification of wireless devices is critical for allowing network access monitoring and ensuring data authentication for large-scale IoT networks. RF fingerprinting has emerged as a solution for device identification by leveraging the transmitters' inevitable hardware impairments that occur during manufacturing. Although deep learning is proven efficient in classifying devices based on hardware impairments, the performance of deep learning models suffers greatly from variations of the wireless channel conditions, across time and space. To the best of our knowledge, we are the first to propose leveraging MIMO capabilities to mitigate the channel effect and provide a channel-resilient …


Differentially Private Stochastic Convex Optimization In (Non)-Euclidean Space Revisited, Jinyan Su, Changhong Zhao, Di Wang Jan 2023

Machine Learning Faculty Publications

In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) in Euclidean and general ℓ_p^d spaces. Specifically, we focus on three settings that are still far from well understood: (1) DP-SCO over a constrained and bounded (convex) set in Euclidean space; (2) unconstrained DP-SCO in ℓ_p^d space; (3) DP-SCO with heavy-tailed data over a constrained and bounded set in ℓ_p^d space. For problem (1), for both convex and strongly convex loss functions, we propose methods whose outputs could achieve (expected) excess population risks that are only dependent on the Gaussian width of the constraint set, rather …


A Hybrid Artificial Intelligence Model For Detecting Keratoconus, Zaid Abdi Alkareem Alyasseri, Ali H. Al-Timemy, Ammar Kamal Abasi, Alexandru Lavric, Husam Jasim Mohammed, Hidenori Takahashi, Jose Arthur Milhomens Filho, Mauro Campos, Rossen M. Hazarbassanov, Siamak Yousefi Dec 2022

Machine Learning Faculty Publications

Machine learning models have recently shown great promise in the diagnosis of several ophthalmic disorders, including keratoconus (KCN). Keratoconus, a noninflammatory ectatic corneal disorder characterized by progressive cornea thinning, is challenging to detect as signs may be subtle. Several machine learning models have been proposed to detect KCN; however, most of the models are supervised and thus require large, well-annotated datasets. This paper proposes a new unsupervised model to detect KCN, based on an adapted flower pollination algorithm (FPA) and the k-means algorithm. We will evaluate the proposed models using corneal data collected from 5430 eyes at different stages of KCN severity …


Impact Of Digital Twins And Metaverse On Cities: History, Current Situation, And Application Perspectives, Zhihan Lv, Wen Long Shang, Mohsen Guizani Dec 2022

Machine Learning Faculty Publications

To promote the expansion and adoption of Digital Twins (DTs) in Smart Cities (SCs), a detailed review of the impact of DTs and digitalization on cities is made to assess the progression of cities and standardization of their management mode. Combined with the technical elements of DTs, the coupling effect of DTs technology and urban construction and the internal logic of DTs technology embedded in urban construction are discussed. Relevant literature covering the full range of DTs technologies and their applications is collected, evaluated, and collated, relevant studies are concatenated, and relevant accepted conclusions are summarized by modules. First, the …


A Damped Newton Method Achieves Global O(1/k²) And Local Quadratic Convergence Rate, Slavomír Hanzely, Dmitry Kamzolov, Dmitry Pasechnyuk, Alexander Gasnikov, Peter Richtárik, Martin Takáč Dec 2022

Machine Learning Faculty Publications

In this paper, we present the first stepsize schedule for Newton method resulting in fast global and local convergence guarantees. In particular, a) we prove an O(1/k²) global rate, which matches the state-of-the-art global rate of cubically regularized Newton method of Polyak and Nesterov (2006) and of regularized Newton method of Mishchenko (2021) and Doikov and Nesterov (2021), b) we prove a local quadratic rate, which matches the best-known local rate of second-order methods, and c) our stepsize formula is simple, explicit, and does not require solving any subproblem. Our convergence proofs hold under affine-invariance assumptions closely related to …


AMP: Automatically Finding Model Parallel Strategies With Heterogeneity Awareness, Dacheng Li, Hongyi Wang, Eric Xing, Hao Zhang Dec 2022

Machine Learning Faculty Publications

Scaling up model sizes can lead to fundamentally new capabilities in many machine learning (ML) tasks. However, training big models requires strong distributed system expertise to carefully design model-parallel execution strategies that suit the model architectures and cluster setups. In this paper, we develop AMP, a framework that automatically derives such strategies. AMP identifies a valid space of model parallelism strategies and efficiently searches the space for high-performing strategies by leveraging a cost model designed to capture the heterogeneity of the model and cluster specifications. Unlike existing methods, AMP is specifically tailored to support complex models composed of uneven layers …


AutoMS: Automatic Model Selection For Novelty Detection With Error Rate Control, Yifan Zhang, Haiyan Jiang, Haojie Ren, Changliang Zou, Dejing Dou Dec 2022

Machine Learning Faculty Publications

Given an unsupervised novelty detection task on a new dataset, how can we automatically select a “best” detection model while simultaneously controlling the error rate of the best model? For novelty detection analysis, numerous detectors have been proposed to detect outliers on a new unseen dataset based on a score function trained on available clean data. However, due to the absence of labeled anomalous data for model evaluation and comparison, there is a lack of systematic approaches that are able to select the “best” model/detector (i.e., the algorithm as well as its hyperparameters) and achieve certain error rate control simultaneously. …


Efficient (Soft) Q-Learning For Text Generation With Limited Good Data, Han Guo, Bowen Tan, Zhengzhong Liu, Eric P. Xing, Zhiting Hu Dec 2022

Machine Learning Faculty Publications

Maximum likelihood estimation (MLE) is the predominant algorithm for training text generation models. This paradigm relies on direct supervision examples, which is not applicable to many emerging applications, such as generating adversarial attacks or generating prompts to control language models. Reinforcement learning (RL) on the other hand offers a more flexible solution by allowing users to plug in arbitrary task metrics as reward. Yet previous RL algorithms for text generation, such as policy gradient (on-policy RL) and Q-learning (off-policy RL), are often notoriously inefficient or unstable to train due to the large sequence space and the sparse reward received only …


Factored Adaptation For Non-Stationary Reinforcement Learning, Fan Feng, Biwei Huang, Kun Zhang, Sara Magliacane Dec 2022

Machine Learning Faculty Publications

Dealing with non-stationarity in environments (e.g., in the transition dynamics) and objectives (e.g., in the reward functions) is a challenging problem that is crucial in real-world applications of reinforcement learning (RL). While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors and causal graphs across different environments. In particular, we propose Factored Adaptation for Non-Stationary RL (FANS-RL), a factored adaptation approach that jointly learns both the causal structure in terms of a factored MDP, and a factored representation of …


Independence Testing-Based Approach To Causal Discovery Under Measurement Error And Linear Non-Gaussian Models, Haoyue Dai, Peter Spirtes, Kun Zhang Dec 2022

Machine Learning Faculty Publications

Causal discovery aims to recover causal structures generating the observational data. Despite its success in certain problems, in many real-world scenarios the observed variables are not the target variables of interest, but the imperfect measures of the target variables. Causal discovery under measurement error aims to recover the causal graph among unobserved target variables from observations made with measurement error. We consider a specific formulation of the problem, where the unobserved target variables follow a linear non-Gaussian acyclic model, and the measurement process follows the random measurement error model. Existing methods on this formulation rely on non-scalable over-complete independent component …


On PAC Learning Halfspaces In Non-Interactive Local Privacy Model With Public Unlabeled Data, Jinyan Su, Jinhui Xu, Di Wang Dec 2022

Machine Learning Faculty Publications

In this paper, we study the problem of PAC learning halfspaces in the non-interactive local differential privacy model (NLDP). To breach the barrier of exponential sample complexity, previous results studied a relaxed setting where the server has access to some additional public but unlabeled data. We continue in this direction. Specifically, we consider the problem under the standard setting instead of the large margin setting studied before. Under different mild assumptions on the underlying data distribution, we propose two approaches that are based on the Massart noise model and self-supervised learning and show that it is possible to achieve sample …


Rare Gems: Finding Lottery Tickets At Initialization, Kartik Sreenivasan, Jy Yong Sohn, Liu Yang, Matthew Grinde, Alliot Nagle, Hongyi Wang, Eric Xing, Kangwook Lee, Dimitris Papailiopoulos Dec 2022

Machine Learning Faculty Publications

Large neural networks can be pruned to a small fraction of their original size, with little loss in accuracy, by following a time-consuming “train, prune, re-train” approach. Frankle & Carbin [9] conjecture that we can avoid this by training lottery tickets, i.e., special sparse subnetworks found at initialization, that can be trained to high accuracy. However, a subsequent line of work [11, 41] presents concrete evidence that current algorithms for finding trainable networks at initialization fail simple baseline comparisons, e.g., against training random sparse subnetworks. Finding lottery tickets that train to better accuracy compared to simple baselines remains an open …


Unpaired Image-To-Image Translation With Density Changing Regularization, Shaoan Xie, Qirong Ho, Kun Zhang Dec 2022

Machine Learning Faculty Publications

Unpaired image-to-image translation aims to translate an input image to another domain such that the output image looks like an image from that domain while important semantic information is preserved. Inferring the optimal mapping with unpaired data is impossible without making any assumptions. In this paper, we make a density changing assumption where image patches of high probability density should be mapped to patches of high probability density in another domain. Then we propose an efficient way to enforce this assumption: we train the flows as density estimators and penalize the variance of density changes. Despite its simplicity, our method …


Zeroth-Order Hard-Thresholding: Gradient Error Vs. Expansivity, William De Vazelhes, Hualin Zhang, Huimin Wu, Xiao Tong Yuan, Bin Gu Nov 2022

Machine Learning Faculty Publications

ℓ0 constrained optimization is prevalent in machine learning, particularly for high-dimensional problems, because it is a fundamental approach to achieve sparse learning. Hard-thresholding gradient descent is a dominant technique to solve this problem. However, first-order gradients of the objective function may be either unavailable or expensive to calculate in a lot of real-world problems, where zeroth-order (ZO) gradients could be a good surrogate. Unfortunately, whether ZO gradients can work with the hard-thresholding operator is still an unsolved problem. To solve this puzzle, in this paper, we focus on the ℓ0 constrained black-box stochastic optimization problems, and propose a new stochastic …
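The two ingredients named in this abstract, a hard-thresholding operator and a zeroth-order gradient surrogate, can be sketched as follows. This is a generic illustration under stated assumptions, not the algorithm proposed in the paper: the Gaussian-direction forward-difference estimator, the step size, and the names `hard_threshold`, `zo_gradient` and `zo_hard_threshold_step` are all choices made for the example.

```python
import random


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    if k <= 0:
        return [0.0] * len(x)
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def zo_gradient(f, x, mu=1e-4, num_dirs=20, seed=0):
    """Estimate the gradient of f at x from function values only.

    Averages forward finite differences along random Gaussian directions
    (a generic estimator chosen for illustration).
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    fx = f(x)
    for _ in range(num_dirs):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        shifted = [x[i] + mu * u[i] for i in range(d)]
        slope = (f(shifted) - fx) / mu  # directional derivative estimate
        for i in range(d):
            grad[i] += slope * u[i] / num_dirs
    return grad


def zo_hard_threshold_step(f, x, k, step=0.1, mu=1e-4, num_dirs=20, seed=0):
    """One gradient-free descent step followed by projection onto k-sparse vectors."""
    g = zo_gradient(f, x, mu=mu, num_dirs=num_dirs, seed=seed)
    moved = [x[i] - step * g[i] for i in range(len(x))]
    return hard_threshold(moved, k)
```

For instance, `hard_threshold([3.0, -1.0, 0.5, -4.0], 2)` keeps only the entries 3.0 and -4.0, and every output of `zo_hard_threshold_step` has at most k nonzero entries regardless of how noisy the gradient estimate is.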


Zeroth-Order Negative Curvature Finding: Escaping Saddle Points Without Gradients, Hualin Zhang, Huan Xiong, Bin Gu Nov 2022

Machine Learning Faculty Publications

We consider escaping saddle points of nonconvex problems where only the function evaluations can be accessed. Although a variety of works have been proposed, the majority of them require either second or first-order information, and only a few of them have exploited zeroth-order methods, particularly the technique of negative curvature finding with zeroth-order methods which has been proven to be the most efficient method for escaping saddle points. To fill this gap, in this paper, we propose two zeroth-order negative curvature finding frameworks that can replace Hessian-vector product computations without increasing the iteration complexity. We apply the proposed frameworks to …


Hyperfast Second-Order Local Solvers For Efficient Statistically Preconditioned Distributed Optimization, Pavel Dvurechensky, Dmitry Kamzolov, Aleksandr Lukashevich, Soomin Lee, Erik Ordentlich, César A. Uribe, Alexander Gasnikov Oct 2022

Machine Learning Faculty Publications

Statistical preconditioning enables fast methods for distributed large-scale empirical risk minimization problems. In this approach, multiple worker nodes compute gradients in parallel, which are then used by the central node to update the parameter by solving an auxiliary (preconditioned) smaller-scale optimization problem. The recently proposed Statistically Preconditioned Accelerated Gradient (SPAG) method [1] has complexity bounds superior to other such algorithms but requires an exact solution for computationally intensive auxiliary optimization problems at every iteration. In this paper, we propose an Inexact SPAG (InSPAG) and explicitly characterize the accuracy by which the corresponding auxiliary subproblem needs to be solved to guarantee …