Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Articles 1 - 2 of 2

Full-Text Articles in Computer Engineering

Short-Term Bus Passenger Flow Prediction Based On Convolutional Long-Short-Term Memory Network, Jing Chen, Zhaochong Zhang, Linkai Wang, Mai An, Wei Wang Feb 2024

Journal of System Simulation

Abstract: Traditional short-time passenger flow prediction methods do not consider the similarity of temporal characteristics between inter-temporal passenger flows. To address this problem, a short-time passenger flow prediction model, k-CNN-LSTM, is proposed that combines an improved k-means clustering algorithm with a CNN and an LSTM. K-means is used to cluster the inter-temporal time-series data, the k-value is determined using the gap statistic, and a traffic flow matrix model is constructed. A CNN-LSTM network is then used to process the short-time passenger flows, which have both spatial and temporal characteristics. The model is tested, and its parameters are tuned, on a real dataset. …
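
The abstract mentions choosing the k-value for k-means with the gap statistic. The sketch below illustrates that general idea only and is not the authors' improved k-means or their implementation. It assumes each point is a fixed-length vector (for example, one day's passenger counts per time slot) and uses a uniform reference distribution over the data's bounding box. The names `kmeans`, `dispersion`, `gap_statistic`, and `choose_k` are hypothetical.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means (requires k <= len(points)); returns one label per point."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each non-empty cluster's center to its mean.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def dispersion(points, labels):
    """Within-cluster sum of squared distances to each cluster mean (W_k)."""
    total = 0.0
    for j in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == j]
        mean = [sum(c) / len(members) for c in zip(*members)]
        total += sum(sq_dist(p, mean) for p in members)
    return total


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (gaps, errs) for k = 1..k_max using uniform reference data."""
    rng = random.Random(seed)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    tiny = 1e-12  # guards against log(0) when a clustering is perfect
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(max(dispersion(points, kmeans(points, k, rng)), tiny))
        ref_logs = []
        for _ in range(n_refs):
            ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                   for _p in points]
            ref_logs.append(math.log(max(dispersion(ref, kmeans(ref, k, rng)), tiny)))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((x - mean_ref) ** 2 for x in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)               # Gap(k)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))  # s_k
    return gaps, errs


def choose_k(gaps, errs):
    """Smallest k with Gap(k) >= Gap(k+1) - s_(k+1); falls back to the largest k."""
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - errs[i + 1]:
            return i + 1
    return len(gaps)
```

A typical call would be `choose_k(*gap_statistic(daily_profiles, k_max=8))`, where `daily_profiles` is a hypothetical list of per-day flow vectors. Because this k-means runs once per k from a random start, results can vary with the seed.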


Multimodal Fusion For Audio-Image And Video Action Recognition, Muhammad B. Shaikh, Douglas Chai, Syed M. S. Islam, Naveed Akhtar Jan 2024

Research outputs 2022 to 2026

Multimodal Human Action Recognition (MHAR) is an important research topic in the fields of computer vision and event recognition. In this work, we address the problem of MHAR by developing a novel audio-image and video fusion-based deep learning framework that we call Multimodal Audio-Image and Video Action Recognizer (MAiVAR). We extract temporal information from image representations of audio signals and spatial information from the video modality using Convolutional Neural Network (CNN)-based feature extractors, and we fuse these features to recognize the corresponding action classes. We apply a high-level weights assignment algorithm to improve audio-visual interaction and convergence. This proposed fusion-based framework utilizes …
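
The abstract describes fusing audio-derived and video-derived features to predict an action class, but the excerpt does not give the fusion details. As a rough illustration of multimodal fusion in general, and not MAiVAR's feature-level fusion or its weights assignment algorithm, the sketch below combines per-class scores from two hypothetical classifiers with a single fixed weight. The function names and the `audio_weight` parameter are assumptions made for this example.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(audio_logits, video_logits, audio_weight=0.5):
    """Weighted average of the two modalities' class probabilities."""
    if len(audio_logits) != len(video_logits):
        raise ValueError("both modalities must score the same set of classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    p_audio = softmax(audio_logits)
    p_video = softmax(video_logits)
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(p_audio, p_video)]


def predict(audio_logits, video_logits, audio_weight=0.5):
    """Index of the class with the highest fused probability."""
    fused = fuse_scores(audio_logits, video_logits, audio_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

Setting `audio_weight` to 0 or 1 reduces the prediction to a single modality, which gives a simple baseline for checking whether fusion helps on a given dataset.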