Physical Sciences and Mathematics | Open Access Articles

Multimodal Fusion For Audio-Image And Video Action Recognition, Muhammad B. Shaikh, Douglas Chai, Syed M. S. Islam, Naveed Akhtar Jan 2024

Research outputs 2022 to 2026

Multimodal Human Action Recognition (MHAR) is an important research topic in computer vision and event recognition fields. In this work, we address the problem of MHAR by developing a novel audio-image and video fusion-based deep learning framework that we call Multimodal Audio-Image and Video Action Recognizer (MAiVAR). We extract temporal information using image representations of audio signals and spatial information from video modality with the help of Convolutional Neural Network (CNN)-based feature extractors and fuse these features to recognize respective action classes. We apply a high-level weights assignment algorithm for improving audio-visual interaction and convergence. This proposed fusion-based framework utilizes …

Multi-View Human Action Recognition Based On Deep Neural Network, Zhao Ying, Lu Yao, Zhang Jian, Qidi Liang, Long Wei Jun 2021

Journal of System Simulation

Abstract: A novel deep neural network named CNN+CA (Convolutional Neural Network plus Context Attention) model is constructed and a new recognition algorithm based on sequence matching is presented to improve the recognition accuracy of MVHAR (Multi-view Human Action Recognition). A CNN (Convolutional Neural Network) is designed to automatically learn multi-view fusion features; the CA (Context Attention) module is introduced to selectively focus on the parts of the features that are relevant for the recognition task; the proposed recognition algorithm based on sequence matching is used to realize MVHAR. The experimental results on the IXMAS dataset and the i3DPost dataset …

An Adversarial Framework For Open-Set Human Action Recognition Using Skeleton Data, Özge Öztimur Karadağ Jan 2021

Turkish Journal of Electrical Engineering and Computer Sciences

Human action recognition is a fundamental problem which is applied in various domains, and it is widely studied in the literature. The majority of studies model action recognition as a closed-set problem. However, in real-life applications it usually arises as an open-set problem where a set of actions are not available during training but are introduced to the system during testing. In this study, we propose an open-set action recognition system, human action recognition and novel action detection system (HARNAD), which consists of two stages and uses only 3D skeleton information. In the first stage, HARNAD recognizes a given action and in the second …

Human Action Recognition Method Based On Key Frames, Xiangbin Shi, Shuanpeng Liu, Deyuan Zhang Aug 2020

Journal of System Simulation

Abstract: More and more researchers have begun to study human action recognition based on depth information and skeleton information since the Kinect was released. A method of human action recognition based on the skeleton features of key frames is proposed in order to improve the accuracy and timeliness of human action recognition and to reduce the computational complexity. The clustered data was obtained by using the K-means clustering algorithm, and then the key frames were extracted by using the clustered data. Two features for human action recognition were extracted, one is the feature of the position of human joint, …
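
The clustering step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how key frames are derived from the clustered data, so the rule used here (pick the frame nearest each cluster centroid) is an assumption, as are the function names `kmeans` and `key_frames`, the evenly spaced initialization, and the toy feature vectors. It is not the authors' implementation.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centroids start at evenly spaced points (assumption)."""
    n = len(points)
    centroids = [list(points[(i * n) // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for each point.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: mean of each cluster (an empty cluster keeps its centroid).
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def key_frames(frames, k):
    """Hypothetical selection rule: the frame closest to each cluster centroid."""
    centroids, labels = kmeans(frames, k)
    picked = set()
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            picked.add(min(members, key=lambda i: dist(frames[i], centroid)))
    return sorted(picked)


# Toy per-frame skeleton features (made up): two clearly separated poses.
frames = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0], [5.2, 5.0]]
print(key_frames(frames, 2))
```

Reducing a sequence to a few representative frames is what would lower the cost of later feature extraction, which matches the abstract's stated goal of reducing computational complexity.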

Fusing Local And Global Features For Human Action Recognition, Tang Chao, Miaohui Zhang, Li Wei, Cao Feng, Xiaofeng Wang, Xiaohong Tong Jan 2019

Journal of System Simulation

Abstract: Recognizing human actions according to video features is an important research topic in a wide scope of applications. In this paper, we propose a robust human motion detection method that combines the Canny operator with a combination of local and global optical flow methods. Meanwhile, this paper presents a simple but efficient action recognition algorithm using fused visual features. The mixed features fuse two action descriptors, namely centre distance-based space time interest point and curvature function-based Fourier descriptors. The frame-based human action classifier is developed using the random forest algorithm. Experimental results show that the proposed method is accurate, efficient and …

Human Action Recognition Based On Depth Image, Tang Chao, Miaohui Zhang, Li Wei, Cao Feng, Xiaofeng Wang, Xiaohong Tong Jan 2019

Journal of System Simulation

Abstract: Because of the complexity and non-rigidity of human actions, traditional human action recognition based on RGB video data is a very challenging research topic. To address some deficiencies of existing recognition methods based on RGB video data, a novel human action recognition method is proposed based on depth image data. In this new method, the block mean feature in the depth difference motion historical image is fused with the Gabor feature as mixed features and then a rotation forest algorithm is used to model. The experimental results show that the proposed method is simple, fast and efficient compared …

Human Action Classification Based On Sequential Bag-Of-Words Model, Hong Liu, Qiaoduo Zhang, Qianru Sun Dec 2014

Research Collection School Of Computing and Information Systems

Recently, approaches utilizing spatial-temporal features have achieved great success in human action classification. However, they typically rely on the bag-of-words (BoW) model and ignore the spatial and temporal structure information of visual words, which causes ambiguities among similar actions. In this paper, we present a novel approach called sequential BoWs for efficient human action classification. It captures temporal sequential structure by segmenting the entire action into sub-actions. Each sub-action has a tiny movement within a narrow range of action. Then the sequential BoWs are created, in which each sub-action is assigned a certain weight and salience to highlight the distinguishing sections. …
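
As a rough illustration of why segmenting an action into sub-actions preserves temporal order, the sketch below builds one normalized histogram per segment and concatenates them. Equal-length segmentation, the optional `weights` argument, and the name `sequential_bow` are assumptions made for this example; the abstract does not specify how sub-actions are segmented or weighted.

```python
def sequential_bow(words, vocab_size, n_segments, weights=None):
    """Concatenate per-segment visual-word histograms (hypothetical sketch).

    words      : visual-word ids in temporal order
    vocab_size : number of distinct visual words
    n_segments : number of equal-length sub-actions (assumed segmentation)
    weights    : optional per-segment weights, default 1.0 each
    """
    weights = weights or [1.0] * n_segments
    n = len(words)
    descriptor = []
    for s in range(n_segments):
        segment = words[s * n // n_segments:(s + 1) * n // n_segments]
        hist = [0.0] * vocab_size
        for w in segment:
            hist[w] += 1.0
        total = sum(hist) or 1.0  # avoid dividing by zero on an empty segment
        descriptor.extend(weights[s] * h / total for h in hist)
    return descriptor


# Two toy sequences with the same words in opposite temporal order.
a = [0, 0, 1, 1]
b = [1, 1, 0, 0]
print(sequential_bow(a, 2, 1), sequential_bow(b, 2, 1))  # one segment: identical
print(sequential_bow(a, 2, 2), sequential_bow(b, 2, 2))  # two segments: different
```

With a single segment the descriptor reduces to an ordinary bag-of-words histogram, so the two sequences are indistinguishable; with two segments their descriptors differ.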

Action Classification By Exploring Directional Co-Occurrence Of Weighted STIPs, Mengyuan Liu, Hong Liu, Qianru Sun Oct 2014

Research Collection School Of Computing and Information Systems

Human action recognition is challenging mainly due to intra-variety, inter-ambiguity and cluttered backgrounds in real videos. The bag-of-visual-words model utilizes spatio-temporal interest points (STIPs) and represents an action by the distribution of points, which ignores the visual context among points. To add more contextual information, we propose a method that encodes the spatio-temporal distribution of weighted pairwise points. First, STIPs are extracted from an action sequence and clustered into visual words. Then, each word is weighted in both temporal and spatial domains to capture the relationships with other words. Finally, the directional relationships between co-occurrence pairwise words are used to encode visual contexts. We …

Learning Directional Co-Occurrence For Human Action Classification, Hong Liu, Mengyuan Liu, Qianru Sun May 2014

Research Collection School Of Computing and Information Systems

Spatio-temporal interest point (STIP) based methods have shown promising results for human action classification. However, state-of-the-art works typically utilize bag-of-visual words (BoVW), which focuses on the statistical distribution of features but ignores their inherent structural relationships. To solve this problem, a descriptor, namely directional pair-wise feature (DPF), is proposed to encode the mutual direction information between pairwise words, aiming at adding more spatial discriminant to BoVW. Firstly, STIP features are extracted and classified into a set of labeled words. Then in each frame, the DPF is constructed for every pair of words with different labels, according to their assigned directional …

Learning Spatio-Temporal Co-Occurrence Correlograms For Efficient Human Action Classification, Qianru Sun, Hong Liu Sep 2013

Research Collection School Of Computing and Information Systems

Spatio-temporal interest point (STIP) based features show great promise in human action analysis with high efficiency and robustness. However, they typically focus on bag-of-visual words (BoVW), which omits any correlation among words and shows limited discrimination in real-world videos. In this paper, we propose a novel approach to add the spatio-temporal co-occurrence relationships of visual words to BoVW for a richer representation. Rather than assigning a particular scale on videos, we adopt the normalized Google-like distance (NGLD) to measure the words' co-occurrence semantics, which grasps the videos' structure information in a statistical way. All pairwise distances in spatial and temporal …

Human Action Recognition Via Fused Kinematic Structure And Surface Representation, Salah R. Althloothi Aug 2013

Electronic Theses and Dissertations

Human action recognition from visual data has remained a challenging problem in the field of computer vision and pattern recognition. This dissertation introduces a new methodology for human action recognition using motion features extracted from kinematic structure, and shape features extracted from a surface representation of the human body. Motion features are used to provide sufficient information about human movement, whereas shape features are used to describe the structure of the silhouette. These features are fused at the kernel level using the Multikernel Learning (MKL) technique to enhance the overall performance of human action recognition. In fact, there are advantages in using multiple types …

Action Disambiguation Analysis Using Normalized Google-Like Distance Correlogram, Qianru Sun, Hong Liu Nov 2012

Research Collection School Of Computing and Information Systems

Classifying realistic human actions in video remains challenging because of intra-variability and inter-ambiguity in action classes. Recently, Spatial-Temporal Interest Point (STIP) based local features have shown great promise in complex action analysis. However, these methods are limited in that they typically rely on the Bag-of-Words (BoW) algorithm, which can hardly resolve ambiguity between actions because it ignores the spatial-temporal occurrence relations of visual words. In this paper, we propose a new model to capture this contextual relationship in terms of pairwise features’ co-occurrence. Normalized Google-Like Distance (NGLD) is proposed to numerically measure this co-occurrence, due to its effectiveness in semantic correlation analysis. …
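
The truncated abstract does not give the NGLD formula. As a hedged illustration, the sketch below computes the standard normalized Google distance that the name suggests NGLD is modeled on, applied to visual-word counts. Treating each "context" as a set of word ids, and the names `normalized_google_distance` and `word_distance`, are assumptions for this example rather than the paper's definition.

```python
from math import log


def normalized_google_distance(fx, fy, fxy, n):
    """Standard NGD from counts: fx, fy occurrences, fxy co-occurrences, n contexts."""
    if fxy == 0:
        return float("inf")  # the two words never co-occur
    lx, ly, lxy = log(fx), log(fy), log(fxy)
    denom = log(n) - min(lx, ly)
    if denom == 0:
        return 0.0  # both words occur in every context
    return (max(lx, ly) - lxy) / denom


def word_distance(contexts, x, y):
    """Count occurrences over contexts (sets of visual-word ids; an assumed setup)."""
    fx = sum(1 for c in contexts if x in c)
    fy = sum(1 for c in contexts if y in c)
    fxy = sum(1 for c in contexts if x in c and y in c)
    return normalized_google_distance(fx, fy, fxy, len(contexts))


# Toy contexts (made up): words 1 and 2 often appear together, 1 and 3 never do.
contexts = [{1, 2}, {1, 2}, {1}, {3}]
print(word_distance(contexts, 1, 2))
print(word_distance(contexts, 1, 3))
```

Words that frequently occur together get a small distance and words that never co-occur get an infinite one, which is the kind of pairwise co-occurrence signal a plain BoW histogram discards.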
