Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Graphics and Human Computer Interfaces

PDF

Singapore Management University

Neural networks

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Comai: Enabling Lightweight, Collaborative Intelligence By Retrofitting Vision Dnns, Kasthuri Jayarajah, Dhanuja Wanniarachchige, Tarek Abdelzaher, Archan Misra Apr 2022

Comai: Enabling Lightweight, Collaborative Intelligence By Retrofitting Vision Dnns, Kasthuri Jayarajah, Dhanuja Wanniarachchige, Tarek Abdelzaher, Archan Misra

Research Collection School Of Computing and Information Systems

While Deep Neural Network (DNN) models have transformed machine vision capabilities, their extremely high computational complexity and model sizes present a formidable deployment roadblock for AIoT applications. We show that the complexity-vs-accuracy-vs-communication tradeoffs for such DNN models can be significantly addressed via a novel, lightweight form of “collaborative machine intelligence” that requires only runtime changes to the inference process. In our proposed approach, called ComAI, the DNN pipelines of different vision sensors share intermediate processing state with one another, effectively providing hints about objects located within their mutually-overlapping Field-of-Views (FoVs). CoMAI uses two novel techniques: (a) a secondary shallow ML …


Deep Learning For Video-Grounded Dialogue Systems, Hung Le Jan 2022

Deep Learning For Video-Grounded Dialogue Systems, Hung Le

Dissertations and Theses Collection (Open Access)

In recent years, we have witnessed significant progress in building systems with artificial intelligence. However, despite advancements in machine learning and deep learning, we are still far from achieving autonomous agents that can perceive multi-dimensional information from the surrounding world and converse with humans in natural language. Towards this goal, this thesis is dedicated to building intelligent systems in the task of video-grounded dialogues. Specifically, in a video-grounded dialogue, a system is required to hold a multi-turn conversation with humans about the content of a video. Given an input video, a dialogue history, and a question about the video, the …


Neighbourhood Structure Preserving Cross-Modal Embedding For Video Hyperlinking, Yanbin Hao, Chong-Wah Ngo, Benoit Huet Jan 2020

Neighbourhood Structure Preserving Cross-Modal Embedding For Video Hyperlinking, Yanbin Hao, Chong-Wah Ngo, Benoit Huet

Research Collection School Of Computing and Information Systems

Video hyperlinking is a task aiming to enhance the accessibility of large archives, by establishing links between fragments of videos. The links model the aboutness between fragments for efficient traversal of video content. This paper addresses the problem of link construction from the perspective of cross-modal embedding. To this end, a generalized multi-modal auto-encoder is proposed.& x00A0;The encoder learns two embeddings from visual and speech modalities, respectively, whereas each of the embeddings performs self-modal and cross-modal translation of modalities. Furthermore, to preserve the neighbourhood structure of fragments, which is important for video hyperlinking, the auto-encoder is devised to model data …


Rotation Invariant Convolutions For 3d Point Clouds Deep Learning, Zhiyuan Zhang, Binh-Son Hua, David W. Rosen, Sai-Kit Yeung Sep 2019

Rotation Invariant Convolutions For 3d Point Clouds Deep Learning, Zhiyuan Zhang, Binh-Son Hua, David W. Rosen, Sai-Kit Yeung

Research Collection School Of Computing and Information Systems

Recent progresses in 3D deep learning has shown that it is possible to design special convolution operators to consume point cloud data. However, a typical drawback is that rotation invariance is often not guaranteed, resulting in networks that generalizes poorly to arbitrary rotations. In this paper, we introduce a novel convolution operator for point clouds that achieves rotation invariance. Our core idea is to use low-level rotation invariant geometric features such as distances and angles to design a convolution operator for point cloud learning. The well-known point ordering problem is also addressed by a binning approach seamlessly built into the …


Deep Learning On Lie Groups For Skeleton-Based Action Recognition, Zhiwu Huang, C. Wan, T. Probst, Gool L. Van Jul 2017

Deep Learning On Lie Groups For Skeleton-Based Action Recognition, Zhiwu Huang, C. Wan, T. Probst, Gool L. Van

Research Collection School Of Computing and Information Systems

In recent years, skeleton-based action recognition has become a popular 3D classification problem. State-of-the-art methods typically first represent each motion sequence as a high-dimensional trajectory on a Lie group with an additional dynamic time warping, and then shallowly learn favorable Lie group features. In this paper we incorporate the Lie group structure into a deep network architecture to learn more appropriate Lie group features for 3D action recognition. Within the network structure, we design rotation mapping layers to transform the input Lie group features into desirable ones, which are aligned better in the temporal domain. To reduce the high feature …