Graphics and Human Computer Interfaces | Open Access Articles

CATNet: Cross-Modal Fusion For Audio-Visual Speech Recognition, Xingmei Wang, Jianchen Mi, Boquan Li, Yixu Zhao, Jiaxiang Meng

Research Collection School Of Computing and Information Systems

Automatic speech recognition (ASR) is a typical pattern recognition technology that converts human speech into text. With the aid of advanced deep learning models, the performance of speech recognition has improved significantly. In particular, emerging Audio–Visual Speech Recognition (AVSR) methods achieve satisfactory performance by combining audio-modal and visual-modal information. However, complex environments, especially noisy ones, limit the effectiveness of existing methods. To address the noise problem, we propose a novel cross-modal audio–visual speech recognition model named CATNet. First, we devise a cross-modal bidirectional fusion model to analyze the close relationship between audio and visual modalities. Second, …

Full-Text Articles in Graphics and Human Computer Interfaces

Tracking People Across Ultra Populated Indoor Spaces By Matching Unreliable Wi-Fi Signals With Disconnected Video Feeds, Quang Hai Truong, Dheryta Jaisinghani, Shubham Jain, Arunesh Sinha, Jeong Gil Ko, Rajesh Krishna Balan

Research Collection School Of Computing and Information Systems

Efficient Unsupervised Video Hashing With Contextual Modeling And Structural Controlling, Jingru Duan, Yanbin Hao, Bin Zhu, Lechao Cheng, Pengyuan Zhou, Xiang Wang

Research Collection School Of Computing and Information Systems

Glance To Count: Learning To Rank With Anchors For Weakly-Supervised Crowd Counting, Zheng Xiong, Liangyu Chai, Wenxi Liu, Yongtuo Liu, Sucheng Ren, Shengfeng He

Research Collection School Of Computing and Information Systems

Constructing Holistic Spatio-Temporal Scene Graph For Video Semantic Role Labeling, Yu Zhao, Hao Fei, Yixin Cao, Bobo Li, Meishan Zhang, Jianguo Wei, Min Zhang, Tat-Seng Chua

Research Collection School Of Computing and Information Systems

NPF-200: A Multi-Modal Eye Fixation Dataset And Method For Non-Photorealistic Videos, Ziyu Yang, Sucheng Ren, Zongwei Wu, Nanxuan Zhao, Junle Wang, Jing Qin, Shengfeng He

Research Collection School Of Computing and Information Systems

MATK: The Meme Analytical Tool Kit, Ming Shan Hee, Aditi Kumaresan, Nguyen Khoi Hoang, Nirmalendu Prakash, Rui Cao, Roy Ka-Wei Lee

Research Collection School Of Computing and Information Systems

Underwater Image Translation Via Multi-Scale Generative Adversarial Network, Dongmei Yang, Tianzi Zhang, Boquan Li, Menghao Li, Weijing Chen, Xiaoqing Li, Xingmei Wang

Research Collection School Of Computing and Information Systems

AdaVis: Adaptive And Explainable Visualization Recommendation For Tabular Data, Songheng Zhang, Yong Wang, Haotian Li, Huamin Qu

Research Collection School Of Computing and Information Systems

GNNLens: A Visual Analytics Approach For Prediction Error Diagnosis Of Graph Neural Networks, Zhihua Jin, Yong Wang, Qianwen Wang, Yao Ming, Tengfei Ma, Huamin Qu

Research Collection School Of Computing and Information Systems

ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian)

Library Philosophy and Practice (e-journal)

DAOT: Domain-Agnostically Aligned Optimal Transport For Domain-Adaptive Crowd Counting, Huilin Zhu, Jingling Yuan, Xian Zhong, Zhengwei Yang, Zheng Wang, Shengfeng He

Research Collection School Of Computing and Information Systems

Equivariance And Invariance Inductive Bias For Learning From Insufficient Data, Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang

Research Collection School Of Computing and Information Systems

A Large-Scale Benchmark For Food Image Segmentation, Xiongwei Wu, Xin Fu, Ying Liu, Ee-Peng Lim, Steven C. H. Hoi, Qianru Sun

Research Collection School Of Computing and Information Systems

Delving Deep Into Many-To-Many Attention For Few-Shot Video Object Segmentation, Haoxin Chen, Hanjie Wu, Nanxuan Zhao, Sucheng Ren, Shengfeng He

Research Collection School Of Computing and Information Systems

Characterizing Students’ Engineering Design Strategies Using Energy3D, Jasmine Singh, Viranga Perera, Alejandra Magana, Brittany Newell

Discovery Undergraduate Interdisciplinary Research Internship

ESPADE: An Efficient And Semantically Secure Shortest Path Discovery For Outsourced Location-Based Services, Bharath K. Samanthula, Divyadharshini Karthikeyan, Boxiang Dong, K. Anitha Kumari

Department of Computer Science Faculty Scholarship and Creative Works

Storage Management Strategy In Mobile Phones For Photo Crowdsensing, En Wang, Zhengdao Qu, Xinyao Liang, Xiangyu Meng, Yongjian Yang, Dawei Li, Weibin Meng

Department of Computer Science Faculty Scholarship and Creative Works

Gender And Racial Diversity In Commercial Brands' Advertising Images On Social Media, Jisun An, Haewoon Kwak

Research Collection School Of Computing and Information Systems

ANCR—An Adaptive Network Coding Routing Scheme For WSNs With Different-Success-Rate Links, Xiang Ji, Anwen Wang, Chunyu Li, Chun Ma, Yao Peng, Dajin Wang, Qingyi Hua, Feng Chen, Dingyi Fang

Department of Computer Science Faculty Scholarship and Creative Works

SPICA: Stereographic Projection For Interactive Crystallographic Analysis, Xingzhong Li

Nebraska Center for Materials and Nanoscience: Faculty Publications

An Immersive Telepresence System Using RGB-D Sensors And Head-Mounted Display, Xinzhong Lu, Ju Shen, Saverio Perugini, Jianjun Yang

Computer Science Faculty Publications

Automatic Video Self Modeling For Voice Disorder, Ju Shen, Changpeng Ti, Anusha Raghunathan, Sen-Ching S. Cheung, Rita Patel

Computer Science Faculty Publications

Compression Of Video Tracking And Bandwidth Balancing Routing In Wireless Multimedia Sensor Networks, Yin Wang, Jianjun Yang, Ju Shen, Bryson Payne, Juan Guo, Kun Hua

Computer Science Faculty Publications

Leading Undergraduate Students To Big Data Generation, Jianjun Yang, Ju Shen

Computer Science Faculty Publications

Person Identification From Streaming Surveillance Video Using Mid-Level Features From Joint Action-Pose Distribution, Binu M. Nair, Vijayan K. Asari

Electrical and Computer Engineering Faculty Publications

Hole Detection And Shape-Free Representation And Double Landmarks Based Geographic Routing In Wireless Sensor Networks, Jianjun Yang, Zongming Fei, Ju Shen

Computer Science Faculty Publications

Seeing Human Weight From A Single RGB-D Image, Tam Nguyen, Jiashi Feng, Shuicheng Yan

Computer Science Faculty Publications

Structure Preserving Large Imagery Reconstruction, Ju Shen, Jianjun Yang, Sami Taha Abu Sneineh, Bryson Payne, Markus Hitz

Computer Science Faculty Publications

Automatic Objects Removal For Scene Completion, Jianjun Yang, Yin Wang, Honggang Wang, Kun Hua, Wei Wang, Ju Shen

Computer Science Faculty Publications