Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Research Collection School Of Computing and Information Systems

Series

2024

Deep learning

Full-Text Articles in Physical Sciences and Mathematics

Dronlomaly: Runtime Log-Based Anomaly Detector For DJI Drones, Wei Minn, Naing Tun Yan, Lwin Khin Shar, Lingxiao Jiang (Apr 2024)

We present an automated tool for real-time detection of anomalous behaviors while a DJI drone is executing a flight mission. The tool takes sensor data logged by the drone at fixed time intervals and performs anomaly detection using a Bi-LSTM model. The model is trained on baseline flight logs from a successful mission carried out physically or via a simulator. The tool has two modules: the first sends the log data to the remote controller station, and the second runs as a service in the remote controller station, powered by a Bi-LSTM model, which receives the …


CATNet: Cross-Modal Fusion For Audio-Visual Speech Recognition, Xingmei Wang, Jianchen Mi, Boquan Li, Yixu Zhao, Jiaxiang Meng (Feb 2024)

Automatic speech recognition (ASR) is a typical pattern recognition technology that converts human speech into text. Advanced deep learning models have significantly improved speech recognition performance. In particular, emerging Audio–Visual Speech Recognition (AVSR) methods achieve satisfactory performance by combining audio and visual information. However, complex environments, especially noisy ones, limit the effectiveness of existing methods. To address the noise problem, we propose a novel cross-modal audio–visual speech recognition model named CATNet. First, we devise a cross-modal bidirectional fusion model to analyze the close relationship between audio and visual modalities. Second, …