Open Access. Powered by Scholars. Published by Universities.®

Graphics and Human Computer Interfaces Commons

2,052 Full-Text Articles; 3,135 Authors; 681,850 Downloads; 161 Institutions

All Articles in Graphics and Human Computer Interfaces

2,052 full-text articles. Page 3 of 84.

Self-Supervised Pseudo Multi-Class Pre-Training For Unsupervised Anomaly Detection And Segmentation In Medical Images, Yu TIAN, Fengbei LIU, Guansong PANG, Yuanhong CHEN, Yuyuan LIU, Johan W. VERJANS, Rajvinder SINGH, Gustavo CARNEIRO 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

Unsupervised anomaly detection (UAD) methods are trained with normal (or healthy) images only, but during testing, they are able to classify normal and abnormal (or disease) images. UAD is an important medical image analysis (MIA) method to be applied in disease screening problems because the training sets available for those problems usually contain only normal images. However, the exclusive reliance on normal images may result in the learning of ineffective low-dimensional image representations that are not sensitive enough to detect and segment unseen abnormal lesions of varying size, appearance, and shape. Pre-training UAD methods with self-supervised learning, based on computer …


Graph Contrastive Learning With Stable And Scalable Spectral Encoding, Deyu BO, Yuan FANG, Yang LIU, Chuan SHI 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

Graph contrastive learning (GCL) aims to learn representations by capturing the agreements between different graph views. Traditional GCL methods generate views in the spatial domain, but it has been recently discovered that the spectral domain also plays a vital role in complementing spatial views. However, existing spectral-based graph views either ignore the eigenvectors that encode valuable positional information, or suffer from high complexity when trying to address the instability of spectral features. To tackle these challenges, we first design an informative, stable, and scalable spectral encoder, termed EigenMLP, to learn effective representations from the spectral features. Theoretically, EigenMLP is invariant …


Video Sentiment Analysis For Child Safety, Yee Sen TAN, Nicole Anne Huiying TEO, Ezekiel En Zhe GHE, Jolie Zhi Yi FONG, Zhaoxia WANG 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

The proliferation of online video content underscores the critical need for effective sentiment analysis, particularly in safeguarding children from potentially harmful material. This research addresses this concern by presenting a multimodal analysis method for assessing video sentiment, categorizing it as either positive (child-friendly) or negative (potentially harmful). This method leverages three key components: text analysis, facial expression analysis, and audio analysis, including music mood analysis, resulting in a comprehensive sentiment assessment. Our evaluation results validate the effectiveness of this approach, making significant contributions to the field of video sentiment analysis and bolstering child safety measures. This research serves as a …


MRIM: Lightweight Saliency-Based Mixed-Resolution Imaging For Low-Power Pervasive Vision, Jiyan WU, Vithurson SUBASHARAN, Minh Anh Tuan TRAN, Kasun Pramuditha GAMLATH, Archan MISRA 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

While many pervasive computing applications increasingly utilize real-time context extracted from a vision sensing infrastructure, the high energy overhead of DNN-based vision sensing pipelines remains a challenge for sustainable in-the-wild deployment. One common approach to reducing such energy overheads is the capture and transmission of lower-resolution images to an edge node (where the DNN inferencing task is executed), but this results in an accuracy-vs-energy tradeoff, as the DNN inference accuracy typically degrades with a drop in resolution. In this work, we introduce MRIM, a simple but effective framework to tackle this tradeoff. Under MRIM, the vision sensor platform first executes …
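The abstract is cut off before it describes MRIM's pipeline, so the following is not the authors' method. It is only a minimal sketch of the general mixed-resolution idea named in the title, assuming a grayscale image stored as a list of rows and a salient region supplied as a hypothetical `(top, left, bottom, right)` box. Pixels inside the box keep full resolution; everything else is coarsened by averaging `factor` x `factor` blocks.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major (illustrative type)


def mixed_resolution(image: Image,
                     salient_box: Tuple[int, int, int, int],
                     factor: int) -> Image:
    """Return a copy of `image` that is coarse outside `salient_box`.

    `salient_box` is (top, left, bottom, right) with bottom/right exclusive.
    Outside the box, every pixel is replaced by the integer mean of the
    factor x factor block it belongs to. The box and factor are hypothetical
    inputs; a real system would obtain them from a saliency step.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    top, left, bottom, right = salient_box
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for by in range(0, height, factor):
        for bx in range(0, width, factor):
            ys = range(by, min(by + factor, height))
            xs = range(bx, min(bx + factor, width))
            block = [image[y][x] for y in ys for x in xs]
            mean = sum(block) // len(block)
            for y in ys:
                for x in xs:
                    inside = top <= y < bottom and left <= x < right
                    if not inside:
                        out[y][x] = mean
    return out
```

This sketch keeps the output at the original pixel dimensions to make the effect easy to inspect; an actual sensor would transmit the coarse background at a smaller size to save energy.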


Performative Mixing For Immersive Audio, Brian A. Elizondo 2023 Louisiana State University and Agricultural and Mechanical College

LSU Doctoral Dissertations

Immersive multichannel audio can be produced with specialized setups of loudspeakers, often surrounding the audience. These setups can feature as few as four loudspeakers or more than 300. Performative mixing in these environments requires a bespoke solution offering intuitive gestural control. Beyond the usual faders for gain control, advancements in multichannel sound demand interfaces capable of quickly positioning sounds between channels. The Quad Cartesian Positioner is such a solution in the form of a Eurorack module for surround mixing for use in live or studio performances.

Diffusion/mixing methods for live multichannel immersive music often rely on the repurposing of hardware …


The Propagation And Execution Of Malware In Images, Piper Hall 2023 Christopher Newport University

Cybersecurity Undergraduate Research Showcase

Malware has become increasingly prolific and severe in its consequences as information systems mature and users become more reliant on computing in their daily lives. As cybercrime becomes more complex in its strategies, an often-overlooked manner of propagation is through images. In recent years, several high-profile vulnerabilities in image libraries have opened the door for threat actors to steal money and information from unsuspecting users. This paper will explore the mechanisms by which these exploits function and how they can be avoided.


User Feedback On Celebratory Technology Model For Reducing Stigma, Evelyn Lawrie, Daniel Dinh, Sav Avalos, Jack de Bruyn, Spencer Au, Christian Lopez, Ray Tan, Cyrus Fa'amafoe 2023 Chapman University

Student Scholar Symposium Abstracts and Posters

Social stigma is a complex manifestation that affects humanity, particularly individuals with disabilities and other marginalized groups, including those with physical, cognitive, and emotional conditions. Society often judges these individuals' interactions with the world, and many technologies designed to assist those with disabilities attempt to change their daily interactions and behaviors. Nonetheless, when the emphasis is placed on validating disabled identities, there is a potential for it to be seen as "inspiration porn." This approach might inadvertently reduce inclusivity and do little to challenge negative stereotypes; it can also lead to the objectification of individuals with disabilities. Therefore, this project …


Vrmovian - An Immersive Data Annotation Tool For Visual Analysis Of Human Interactions In VR, Isaac Browen 2023 Chapman University

Student Scholar Symposium Abstracts and Posters

Understanding human behavior in virtual reality (VR) is a key component for developing intelligent systems to enhance human-focused VR experiences. The ability to annotate human motion data proves to be a very useful way to analyze and understand human behavior. However, due to the complexity and multi-dimensionality of human activity data, it is necessary to develop software that can display the data in a comprehensible way and can support intuitive data annotation for developing machine learning models able to recognize and assist human motions in VR (e.g., remote physical therapy). Although past research has been done to improve VR data …


Human-Centered Technologies For Inclusive Collection And Analysis Of Public-Generated Data, Mahmood Jasim 2023 University of Massachusetts Amherst

Doctoral Dissertations

The meteoric rise in the popularity of public engagement platforms such as social media, customer review websites, and public input solicitation efforts strives to establish an inclusive environment for the public to share their thoughts, ideas, opinions, and experiences. Many decisions made at a personal, local, or national scale are often fueled by data generated by the public. As such, inclusive collection, analysis, sensemaking, and utilization of public-generated data are crucial to support the exercise of successful decision-making processes. However, people often struggle to engage, participate, and share their opinions due to inaccessibility, the rigidity of traditional public engagement methods, …


Enhancing Search Engine Results: A Comparative Study Of Graph And Timeline Visualizations For Semantic And Temporal Relationship Discovery, Muhammad Shahiq Qureshi 2023 University of Denver

Electronic Theses and Dissertations

In today’s digital age, search engines have become indispensable tools for finding information among the corpus of billions of webpages. The standard that most search engines follow is to display search results in a list-based format arranged according to a ranking algorithm. Although this format is good for presenting the most relevant results to users, it fails to represent the underlying relations between different results. These relations, among others, can generally be of either a temporal or semantic nature. A user who wants to explore the results that are connected by those relations would have to make a manual effort …


Constructing Holistic Spatio-Temporal Scene Graph For Video Semantic Role Labeling, Yu ZHAO, Hao FEI, Yixin CAO, Bobo LI, Meishan ZHANG, Jianguo WEI, Min ZHANG, Tat-Seng CHUA 2023 Tianjin University

Research Collection School Of Computing and Information Systems

As one of the core video semantic understanding tasks, Video Semantic Role Labeling (VidSRL) aims to detect the salient events from given videos by recognizing the predicate-argument event structures and the interrelationships between events. While recent endeavors have put forth methods for VidSRL, they are mostly subject to two key drawbacks: the lack of fine-grained spatial scene perception and the insufficient modeling of video temporality. Towards this end, this work explores a novel holistic spatio-temporal scene graph (namely HostSG) representation based on the existing dynamic scene graph structures, which well model both the fine-grained spatial semantics and temporal …


Pro-Cap: Leveraging A Frozen Vision-Language Model For Hateful Meme Detection, Rui CAO, Ming Shan HEE, Adriel KUEK, Wen Haw CHONG, Roy Ka-Wei LEE, Jing JIANG 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

Hateful meme detection is a challenging multimodal task that requires comprehension of both vision and language, as well as cross-modal interactions. Recent studies have tried to fine-tune pre-trained vision-language models (PVLMs) for this task. However, with increasing model sizes, it becomes important to leverage powerful PVLMs more efficiently, rather than simply fine-tuning them. Recently, researchers have attempted to convert meme images into textual captions and prompt language models for predictions. This approach has shown good performance but suffers from non-informative image captions. Considering the two factors mentioned above, we propose a probing-based captioning approach to leverage PVLMs in a zero-shot …


Turn-It-Up: Rendering Resistance For Knobs In Virtual Reality Through Undetectable Pseudo-Haptics, Martin FEICK, Andre ZENNER, Oscar ARIZA, Anthony TANG, Cihan BIYIKLI, Antonio KRÜGER 2023 Universität des Saarlandes

Research Collection School Of Computing and Information Systems

Rendering haptic feedback for interactions with virtual objects is an essential part of effective virtual reality experiences. In this work, we explore providing haptic feedback for rotational manipulations, e.g., through knobs. We propose the use of a Pseudo-Haptic technique alongside a physical proxy knob to simulate various physical resistances. In a psychophysical experiment with 20 participants, we found that designers can introduce unnoticeable offsets between real and virtual rotations of the knob, and we report the corresponding detection thresholds. Based on these, we present the Pseudo-Haptic Resistance technique to convey physical resistance while applying only unnoticeable pseudo-haptic manipulation. Additionally, we …
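The offset between real and virtual knob rotation can be pictured as a gain applied to the physical angle. The sketch below is an illustration under stated assumptions, not the paper's Pseudo-Haptic Resistance technique: the function name, the linear gain model, and the default 0.9 to 1.1 band are all hypothetical placeholders, since the listing does not give the reported detection thresholds.

```python
def virtual_knob_angle(real_angle_deg: float,
                       gain: float,
                       min_gain: float = 0.9,
                       max_gain: float = 1.1) -> float:
    """Map a physical knob rotation to the rotation displayed in VR.

    A gain below 1 makes the virtual knob turn less than the hand does,
    which pseudo-haptic approaches commonly use to suggest resistance.
    The gain is clamped to [min_gain, max_gain], a band standing in for
    detection thresholds. The default band is a placeholder, not a value
    taken from the study.
    """
    if min_gain > max_gain:
        raise ValueError("min_gain must not exceed max_gain")
    clamped_gain = max(min_gain, min(max_gain, gain))
    return real_angle_deg * clamped_gain
```

For example, requesting a gain of 0.5 with the placeholder band yields the same result as a gain of 0.9, because the request is clamped to the lower bound before it is applied.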


NPF-200: A Multi-Modal Eye Fixation Dataset And Method For Non-Photorealistic Videos, Ziyu YANG, Sucheng REN, Zongwei WU, Nanxuan ZHAO, Junle WANG, Jing QIN, Shengfeng HE 2023 South China University of Technology

Research Collection School Of Computing and Information Systems

Non-photorealistic videos are in demand with the wave of the metaverse, but lack sufficient research studies. This work aims to take a step forward to understand how humans perceive non-photorealistic videos with eye fixation (i.e., saliency detection), which is critical for enhancing media production, artistic design, and game user experience. To fill in the gap of missing a suitable dataset for this research line, we present NPF-200, the first large-scale multi-modal dataset of purely non-photorealistic videos with eye fixations. Our dataset has three characteristics: 1) it contains soundtracks that are essential according to vision and psychological studies; 2) it …


Disentangling Multi-View Representations Beyond Inductive Bias, Guanzhou KE, Yang YU, Guoqing CHAO, Xiaoli WANG, Chenyang XU, Shengfeng HE 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

Multi-view (or -modality) representation learning aims to understand the relationships between different view representations. Existing methods disentangle multi-view representations into consistent and view-specific representations by introducing strong inductive biases, which can limit their generalization ability. In this paper, we propose a novel multi-view representation disentangling method that aims to go beyond inductive biases, ensuring both interpretability and generalizability of the resulting representations. Our method is based on the observation that discovering multi-view consistency in advance can determine the disentangling information boundary, leading to a decoupled learning objective. We also found that the consistency can be easily extracted by maximizing the …


MATK: The Meme Analytical Tool Kit, Ming Shan HEE, Aditi KUMARESAN, Nguyen Khoi HOANG, Nirmalendu PRAKASH, Rui CAO, Roy Ka-Wei LEE 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

The rise of social media platforms has brought about a new digital culture called memes. Memes, which combine visuals and text, can strongly influence public opinions on social and cultural issues. As a result, people have become interested in categorizing memes, leading to the development of various datasets and multimodal models that show promising results in this field. However, there is currently a lack of a single library that allows for the reproduction, evaluation, and comparison of these models using fair benchmarks and settings. To fill this gap, we introduce the Meme Analytical Tool Kit (MATK), an open-source toolkit specifically …


VoxelHap: A Toolkit For Constructing Proxies Providing Tactile And Kinesthetic Haptic Feedback In Virtual Reality, M. FEICK, C. BIYIKLI, K. GANI, A. WITTIG, Anthony TANG, A. KRÜGER 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

Experiencing virtual environments is often limited to abstract interactions with objects. Physical proxies allow users to feel virtual objects, but are often inaccessible. We present the VoxelHap toolkit which enables users to construct highly functional proxy objects using Voxels and Plates. Voxels are blocks with special functionalities that form the core of each physical proxy. Plates increase a proxy’s haptic resolution, such as its shape, texture or weight. Beyond providing physical capabilities to realize haptic sensations, VoxelHap utilizes VR illusion techniques to expand its haptic resolution. We evaluated the capabilities of the VoxelHap toolkit through the construction of a range …


Revisiting Disentanglement And Fusion On Modality And Context In Conversational Multimodal Emotion Recognition, Bobo LI, Hao FEI, Lizi LIAO, Yu ZHAO, Chong TENG, Tat-Seng CHUA, Donghong JI, Fei LI 2023 Singapore Management University

Research Collection School Of Computing and Information Systems

Enabling machines to understand human emotions in multimodal contexts under dialogue scenarios, the task known as multimodal emotion analysis in conversation (MM-ERC), has been a hot research topic. MM-ERC has received consistent attention in recent years, and a diverse range of methods has been proposed for securing better task performance. Most existing works treat MM-ERC as a standard multimodal classification problem and perform multimodal feature disentanglement and fusion for maximizing feature utility. Yet after revisiting the characteristics of MM-ERC, we argue that both the feature multimodality and conversational contextualization should be properly modeled simultaneously during the feature disentanglement …


Improving Human-Automation Collaboration In Motion Planning, Torin J. Adamson 2023 University of New Mexico

Computer Science ETDs

Human-automation collaboration is becoming a part of everyday life as AI helps us drive, make decisions, and solve a variety of other tasks. However, safe and effective collaboration systems depend on factors in trust, communication, and more. Existing studies to explore these are typically carried out in laboratory settings, providing robust data under tight environmental control. However, human behavior evolves over time, driven by external factors that cannot be fully captured in single participation sessions. These factors form the "human context", contextualizing the behavioral data for a more complete understanding. In this thesis, video game adaptations upon conventional subject studies …


Evocative And Provocative Image-Making In The Age Of Generative AI, Julian Kilker 2023 University of Nevada, Las Vegas

Tradition-Innovations in Arts, Design, and Media Higher Education

Editorial for inaugural AI-focused special issue of Tradition-Innovations in Arts, Design, and Media Higher Education, published under the auspices of the Alliance for the Arts in Research Universities (a2ru). Discusses three articles by five authors in this issue: (1) Choreographing Shadows: Interdisciplinary collaboration to orchestrate ethical image-making by Mark Burchick and Diana Pasulka; (2) Giving Up Control: Hybrid AI-augmented workflows for image-making by Joshua Vermillion; and (3) Hands are Hard: Unlearning how we talk about machine learning in the arts by Adam Hyland and Oscar Keyes.

Editing this special issue explored several key questions: What does “innovation” mean when …

