Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Cvfnet: Real-Time 3d Object Detection By Learning Cross View Features, Jiaqi Gu, Zhiyu Xiang, Pan Zhao, Tingming Bai, Lingxuan Wang, Xijun Zhao, Zhiyuan Zhang Oct 2022

Cvfnet: Real-Time 3d Object Detection By Learning Cross View Features, Jiaqi Gu, Zhiyu Xiang, Pan Zhao, Tingming Bai, Lingxuan Wang, Xijun Zhao, Zhiyuan Zhang

Research Collection School Of Computing and Information Systems

In recent years 3D object detection from LiDAR point clouds has made great progress thanks to the development of deep learning technologies. Although voxel or point based methods are popular in 3D object detection, they usually involve time-consuming operations such as 3D convolutions on voxels or ball query among points, making the resulting network inappropriate for time critical applications. On the other hand, 2D view-based methods feature high computing efficiency while usually obtaining inferior performance than the voxel or point based methods. In this work, we present a real-time view-based single stage 3D object detector, namely CVFNet to fulfill this …


Self-Supervised Video Representation Learning By Uncovering Spatio-Temporal Statistics, Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Wei Liu, Yun-Hui Liu Jul 2022

Self-Supervised Video Representation Learning By Uncovering Spatio-Temporal Statistics, Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Wei Liu, Yun-Hui Liu

Research Collection School Of Computing and Information Systems

This paper proposes a novel pretext task to address the self-supervised video representation learning problem. Specifically, given an unlabeled video clip, we compute a series of spatio-temporal statistical summaries, such as the spatial location and dominant direction of the largest motion, the spatial location and dominant color of the largest color diversity along the temporal axis, etc. Then a neural network is built and trained to yield the statistical summaries given the video frames as inputs. In order to alleviate the learning difficulty, we employ several spatial partitioning patterns to encode rough spatial locations instead of exact spatial Cartesian coordinates. …


Deep Learning For Anomaly Detection, Guansong Pang, Charu Aggarwal, Chunhua Shen, Nicu Sebe Jun 2022

Deep Learning For Anomaly Detection, Guansong Pang, Charu Aggarwal, Chunhua Shen, Nicu Sebe

Research Collection School Of Computing and Information Systems

A nomaly detection aims at identifying data points which are rare or significantly different from the majority of data points. Many techniques are explored to build highly efficient and effective anomaly detection systems, but they are confronted with many difficulties when dealing with complex data, such as failing to capture intricate feature interactions or extract good feature representations. Deep-learning techniques have shown very promising performance in tackling different types of complex data in a broad range of tasks/problems, including anomaly detection. To address this new trend, we organized this Special Issue on Deep Learning for Anomaly Detection to cover the …


Simple Or Complex? Together For A More Accurate Just-In-Time Defect Predictor, Xin Zhou, Donggyun Han, David Lo May 2022

Simple Or Complex? Together For A More Accurate Just-In-Time Defect Predictor, Xin Zhou, Donggyun Han, David Lo

Research Collection School Of Computing and Information Systems

Just-In-Time (JIT) defect prediction aims to automatically predict whether a commit is defective or not, and has been widely studied in recent years. In general, most studies can be classified into two categories: 1) simple models using traditional machine learning classifiers with hand-crafted features, and 2) complex models using deep learning techniques to automatically extract features. Hand-crafted features used by simple models are based on expert knowledge but may not fully represent the semantic meaning of the commits. On the other hand, deep learning-based features used by complex models represent the semantic meaning of commits but may not reflect useful …


Action-Centric Relation Transformer Network For Video Question Answering, Jipeng Zhang, Jie Shao, Rui Cao, Lianli Gao, Xing Xu, Heng Tao Shen Jan 2022

Action-Centric Relation Transformer Network For Video Question Answering, Jipeng Zhang, Jie Shao, Rui Cao, Lianli Gao, Xing Xu, Heng Tao Shen

Research Collection School Of Computing and Information Systems

Video question answering (VideoQA) has emerged as a popular research topic in recent years. Enormous efforts have been devoted to developing more effective fusion strategies and better intra-modal feature preparation. To explore these issues further, we identify two key problems. (1) Current works take almost no account of introducing action of interest in video representation. Additionally, there exists insufficient labeling data on where the action of interest is in many datasets. However, questions in VideoQA are usually action-centric. (2) Frame-to-frame relations, which can provide useful temporal attributes (e.g., state transition, action counting), lack relevant research. Based on these observations, we …