Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

City University of New York (CUNY)

Theses/Dissertations

2024

Computer Vision

Full-Text Articles in Physical Sciences and Mathematics

Context In Computer Vision: A Taxonomy, Multi-Stage Integration, And A General Framework, Xuan Wang Jun 2024

Dissertations, Theses, and Capstone Projects

Contextual information has been widely used in many computer vision tasks, such as object detection, video action detection, and image classification. Recognizing a single object or action out of context can sometimes be very challenging, and contextual information may greatly improve the understanding of a scene or an event. However, existing approaches design task-specific context mechanisms for each detection task.

In this research, we first present a comprehensive survey of context understanding in computer vision, with a taxonomy that describes context by type and level. We then propose MultiCLU, a new multi-stage context learning and utilization framework, …


Deep Learning-Based Human Action Understanding In Videos, Elahe Vahdani Feb 2024

Dissertations, Theses, and Capstone Projects

The understanding of human actions in videos holds immense potential for technological advancement and societal betterment. This thesis explores fundamental aspects of this field, including action recognition in trimmed clips and action localization in untrimmed videos. Trimmed videos contain only one action instance, with moments before or after the action excluded from the video. However, the majority of videos captured in unconstrained environments, often referred to as untrimmed videos, are naturally unsegmented. Untrimmed videos are typically lengthy and may encompass multiple action instances, along with the moments preceding or following each action, as well as transitions between actions. In the …