Computer Sciences Commons

Research Collection School Of Computing and Information Systems

2022

Representation learning

Articles 1 - 6 of 6

Full-Text Articles in Computer Sciences

CVFNet: Real-Time 3D Object Detection By Learning Cross View Features, Jiaqi Gu, Zhiyu Xiang, Pan Zhao, Tingming Bai, Lingxuan Wang, Xijun Zhao, Zhiyuan Zhang Oct 2022

Research Collection School Of Computing and Information Systems

In recent years, 3D object detection from LiDAR point clouds has made great progress thanks to the development of deep learning technologies. Although voxel- or point-based methods are popular in 3D object detection, they usually involve time-consuming operations such as 3D convolutions on voxels or ball queries among points, making the resulting network unsuitable for time-critical applications. On the other hand, 2D view-based methods feature high computational efficiency but usually obtain inferior performance to the voxel- or point-based methods. In this work, we present a real-time view-based single-stage 3D object detector, namely CVFNet, to fulfill this …
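The cross-view idea above can be pictured with a small sketch: features computed in the range view are scattered into a bird's-eye-view (BEV) grid using each pixel's 3D coordinates, so that detection can run on cheap 2D convolutions. This is an illustrative assumption of the mechanism, not the authors' code; all shapes, ranges, and module names here are made up.

```python
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    """Hypothetical range-view -> BEV feature transfer (illustrative only)."""
    def __init__(self, channels=64, bev_size=128, x_range=(0.0, 51.2), y_range=(-25.6, 25.6)):
        super().__init__()
        self.bev_size, self.x_range, self.y_range = bev_size, x_range, y_range
        self.fuse = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, rv_feats, xyz):
        # rv_feats: (B, C, H, W) features computed in the range view
        # xyz:      (B, 3, H, W) per-pixel 3D coordinates of the LiDAR points
        B, C, H, W = rv_feats.shape
        S = self.bev_size
        gx = ((xyz[:, 0] - self.x_range[0]) / (self.x_range[1] - self.x_range[0]) * S).long().clamp(0, S - 1)
        gy = ((xyz[:, 1] - self.y_range[0]) / (self.y_range[1] - self.y_range[0]) * S).long().clamp(0, S - 1)
        idx = (gy * S + gx).view(B, 1, -1).expand(-1, C, -1)      # (B, C, H*W) BEV cell per pixel
        bev = torch.zeros(B, C, S * S, device=rv_feats.device)
        # max-pool range-view features into BEV cells (zeros act as a floor here;
        # a real implementation would handle empty cells more carefully)
        bev.scatter_reduce_(2, idx, rv_feats.reshape(B, C, -1), reduce="amax")
        return self.fuse(bev.view(B, C, S, S))                    # cheap 2D convs on the fused BEV map

rv = torch.randn(2, 64, 64, 512)                 # assumed range-view feature map
xyz = torch.rand(2, 3, 64, 512) * 50.0 - 25.0    # assumed point coordinates in metres
print(CrossViewFusion()(rv, xyz).shape)          # torch.Size([2, 64, 128, 128])
```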


Self-Supervised Video Representation Learning By Uncovering Spatio-Temporal Statistics, Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Wei Liu, Yun-Hui Liu Jul 2022

Research Collection School Of Computing and Information Systems

This paper proposes a novel pretext task to address the self-supervised video representation learning problem. Specifically, given an unlabeled video clip, we compute a series of spatio-temporal statistical summaries, such as the spatial location and dominant direction of the largest motion, the spatial location and dominant color of the largest color diversity along the temporal axis, etc. Then a neural network is built and trained to yield the statistical summaries given the video frames as inputs. In order to alleviate the learning difficulty, we employ several spatial partitioning patterns to encode rough spatial locations instead of exact spatial Cartesian coordinates. …
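One way to picture such a pretext label is the sketch below: it computes a single coarse statistic of the kind the abstract describes, namely which grid block of a clip contains the largest motion. The grid size and the motion measure (mean absolute frame difference) are assumptions for illustration, not the paper's exact recipe.

```python
import torch

def largest_motion_block(clip, grid=4):
    """Coarse pretext label: index of the grid block with the largest motion."""
    # clip: (T, C, H, W) video tensor with values in [0, 1]
    T, C, H, W = clip.shape
    motion = (clip[1:] - clip[:-1]).abs().mean(dim=(0, 1))      # (H, W) mean |frame difference|
    blocks = motion.reshape(grid, H // grid, grid, W // grid)   # coarse spatial partitioning
    per_block = blocks.mean(dim=(1, 3)).flatten()               # motion magnitude per block
    return per_block.argmax().item()                            # label in [0, grid * grid)

clip = torch.rand(16, 3, 64, 64)                                # assumed toy clip
print(largest_motion_block(clip))                               # the network is trained to predict such labels
```

A backbone trained to predict coarse labels like this from raw frames must become sensitive to motion and appearance, which is the point of the pretext task.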


A Simple Data Mixing Prior For Improving Self-Supervised Learning, Sucheng Ren, Huiyu Wang, Zhengqi Gao, Shengfeng He, Alan Yuille, Yuyin Zhou, Cihang Xie Jun 2022

Research Collection School Of Computing and Information Systems

Data mixing (e.g., Mixup, CutMix, ResizeMix) is an essential component for advancing recognition models. In this paper, we focus on studying its effectiveness in the self-supervised setting. Noticing that mixed images sharing the same source images are intrinsically related to each other, we propose SDMP, short for Simple Data Mixing Prior, to capture this straightforward yet essential prior, and position such mixed images as additional positive pairs to facilitate self-supervised representation learning. Our experiments verify that the proposed SDMP enables data mixing to help a set of self-supervised learning frameworks (e.g., MoCo) achieve better accuracy and out-of-distribution …
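The prior can be sketched as a soft-label contrastive loss: a Mixup image built from sources (i, j) with ratio lam is treated as a positive of both sources, weighted lam and 1 - lam. This is an assumption-level illustration of the idea, not the official SDMP implementation; the encoder and all hyperparameters below are stand-ins.

```python
import torch
import torch.nn.functional as F

def sdmp_style_loss(encoder, x, lam=0.7, temp=0.2):
    B = x.size(0)
    perm = torch.randperm(B)
    mixed = lam * x + (1 - lam) * x[perm]          # Mixup: each mixed image has two sources
    z_src = F.normalize(encoder(x), dim=1)
    z_mix = F.normalize(encoder(mixed), dim=1)
    logits = z_mix @ z_src.t() / temp              # mixed-to-source similarities
    eye = torch.eye(B)
    target = lam * eye + (1 - lam) * eye[perm]     # soft positives: both sources, mix-ratio weighted
    return -(target * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

enc = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))  # stand-in encoder
print(sdmp_style_loss(enc, torch.randn(8, 3, 32, 32)).item())
```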


Co-Advise: Cross Inductive Bias Distillation, Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao Jun 2022

Research Collection School Of Computing and Information Systems

Vision transformers have a more relaxed inductive bias and consequently do not work well with insufficient data. Knowledge distillation is thus introduced to assist the training of transformers. Unlike previous works, where only heavy convolution-based teachers are provided, in this paper we delve into the influence of models' inductive biases in knowledge distillation (e.g., convolution and involution). Our key observation is that teacher accuracy is not the dominant factor behind student accuracy; the teacher's inductive bias matters more. We demonstrate that lightweight teachers with different architectural inductive biases can be used to co-advise the student transformer with …
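A minimal sketch of the co-advising setup, under the assumption that it reduces to averaging per-teacher distillation terms (the paper's full method is not reproduced here):

```python
import torch
import torch.nn.functional as F

def co_advise_loss(student_logits, teacher_logits_list, labels, T=4.0, alpha=0.5):
    ce = F.cross_entropy(student_logits, labels)        # supervised term
    kd = sum(
        F.kl_div(F.log_softmax(student_logits / T, dim=1),
                 F.softmax(t / T, dim=1), reduction="batchmean") * T * T
        for t in teacher_logits_list
    ) / len(teacher_logits_list)                        # average advice over all teachers
    return alpha * ce + (1 - alpha) * kd

student = torch.randn(8, 10)
teachers = [torch.randn(8, 10), torch.randn(8, 10)]     # e.g. a conv-based and an involution-based teacher
print(co_advise_loss(student, teachers, torch.randint(0, 10, (8,))).item())
```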


Structure-Aware Visualization Retrieval, Haotian Li, Yong Wang, Aoyu Wu, Huan Wei, Huamin Qu May 2022

Research Collection School Of Computing and Information Systems

With the wide usage of data visualizations, a huge number of Scalable Vector Graphic (SVG)-based visualizations have been created and shared online. Accordingly, there has been increasing interest in exploring how to retrieve perceptually similar visualizations from a large corpus, since this can benefit various downstream applications such as visualization recommendation. Existing methods mainly focus on the visual appearance of visualizations by regarding them as bitmap images. However, the structural information intrinsically existing in SVG-based visualizations is ignored. Such structural information can delineate the spatial and hierarchical relationships among visual elements, and characterize visualizations thoroughly from a new perspective. …
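The structural information the abstract refers to is directly readable from the SVG markup: an SVG is an element tree, so parent-child relations among visual marks can be extracted and fed to, say, a graph neural network. A small sketch with an assumed toy SVG:

```python
import xml.etree.ElementTree as ET

svg = """<svg xmlns="http://www.w3.org/2000/svg">
  <g id="bars">
    <rect x="0" y="10" width="5" height="20"/>
    <rect x="8" y="5" width="5" height="25"/>
  </g>
  <text x="0" y="40">title</text>
</svg>"""

def svg_structure(svg_text):
    root = ET.fromstring(svg_text)
    nodes, edges = [], []
    def walk(el, parent):
        idx = len(nodes)
        nodes.append(el.tag.split('}')[-1])     # strip XML namespace for readability
        if parent is not None:
            edges.append((parent, idx))         # hierarchical parent-child edge
        for child in el:
            walk(child, idx)
    walk(root, None)
    return nodes, edges

print(svg_structure(svg))
# (['svg', 'g', 'rect', 'rect', 'text'], [(0, 1), (1, 2), (1, 3), (0, 4)])
```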


CoST: Contrastive Learning Of Disentangled Seasonal-Trend Representations For Time Series Forecasting, Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi Apr 2022

Research Collection School Of Computing and Information Systems

Deep learning has been actively studied for time series forecasting, and the mainstream paradigm is based on the end-to-end training of neural network architectures, ranging from classical LSTM/RNNs to more recent TCNs and Transformers. Motivated by the recent success of representation learning in computer vision and natural language processing, we argue that a more promising paradigm for time series forecasting is to first learn disentangled feature representations, followed by a simple regression fine-tuning step – we justify such a paradigm from a causal perspective. Following this principle, we propose a new time series representation learning framework for long sequence time …
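The disentangling idea can be pictured as follows: split a series into a trend component (here a simple moving average, an assumption for illustration) and a seasonal remainder, then inspect the remainder in the frequency domain, which is where a seasonal branch would operate. This is not the CoST architecture, only a sketch of the decomposition it learns representations for.

```python
import torch
import torch.nn.functional as F

def seasonal_trend_split(x, kernel=25):
    # x: (B, T) univariate series; kernel: assumed moving-average window
    pad = kernel // 2
    trend = F.avg_pool1d(F.pad(x.unsqueeze(1), (pad, pad), mode="replicate"),
                         kernel, stride=1).squeeze(1)   # smooth trend component
    seasonal = x - trend                                # remainder carries the seasonality
    spectrum = torch.fft.rfft(seasonal, dim=1).abs()    # frequency-domain view for a seasonal branch
    return trend, seasonal, spectrum

t = torch.linspace(0, 8 * torch.pi, 200)
x = (0.05 * t + torch.sin(t)).unsqueeze(0)              # toy series: linear trend + sinusoid
trend, seasonal, spectrum = seasonal_trend_split(x)
print(trend.shape, seasonal.shape, spectrum.shape)      # (1, 200), (1, 200), (1, 101)
```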