Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 21 of 21

Full-Text Articles in Physical Sciences and Mathematics

On The Feasibility Of Simple Transformer For Dynamic Graph Modeling, Yuxia Wu, Yuan Fang, Lizi Liao May 2024

Research Collection School Of Computing and Information Systems

Dynamic graph modeling is crucial for understanding complex structures in web graphs, spanning applications in social networks, recommender systems, and more. Most existing methods primarily emphasize structural dependencies and their temporal changes. However, these approaches often overlook detailed temporal aspects or struggle with long-term dependencies. Furthermore, many solutions overly complicate the process by emphasizing intricate module designs to capture dynamic evolutions. In this work, we harness the strength of the Transformer’s self-attention mechanism, known for adeptly handling long-range dependencies in sequence modeling. Our approach offers a simple Transformer model, called SimpleDyG, tailored for dynamic graph modeling without complex modifications. We …


Enhancing Visual Grounding In Vision-Language Pre-Training With Position-Guided Text Prompts, Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan May 2024

Research Collection School Of Computing and Information Systems

Vision-Language Pre-Training (VLP) has demonstrated remarkable potential in aligning image and text pairs, paving the way for a wide range of cross-modal learning tasks. Nevertheless, we have observed that VLP models often fall short in terms of visual grounding and localization capabilities, which are crucial for many downstream tasks, such as visual reasoning. In response, we introduce a novel Position-guided Text Prompt (PTP) paradigm to bolster the visual grounding abilities of cross-modal models trained with VLP. In the VLP phase, PTP divides an image into N × N blocks and employs a widely-used object detector to identify objects …
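The grid division mentioned in this abstract can be illustrated with a minimal sketch. It only shows how a point, such as the center of a detected bounding box, could be assigned to one of the N × N blocks; the function name block_index and the row-major numbering are assumptions made for illustration, and the truncated abstract does not say how PTP itself uses the blocks.

```python
def block_index(cx: float, cy: float, width: int, height: int, n: int) -> int:
    """Return the row-major index of the n x n grid block containing (cx, cy).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if not (0 <= cx < width and 0 <= cy < height):
        raise ValueError("point lies outside the image")
    col = int(cx * n / width)   # column of the block, 0 .. n-1
    row = int(cy * n / height)  # row of the block, 0 .. n-1
    return row * n + col


# A 300 x 300 image split into 3 x 3 blocks: the image center falls in block 4.
center_block = block_index(150, 150, 300, 300, 3)
```

With row-major numbering, blocks are counted left to right and then top to bottom, so the top-left block is 0 and the bottom-right block is n * n - 1.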


Learning Adversarial Semantic Embeddings For Zero-Shot Recognition In Open Worlds, Tianqi Li, Guansong Pang, Xiao Bai, Jin Zheng, Lei Zhou, Xin Ning May 2024

Research Collection School Of Computing and Information Systems

Zero-Shot Learning (ZSL) focuses on classifying samples of unseen classes with only their side semantic information presented during training. It cannot handle real-life, open-world scenarios where there are test samples of unknown classes for which neither samples (e.g., images) nor their side semantic information is known during training. Open-Set Recognition (OSR) is dedicated to addressing the unknown class issue, but existing OSR methods are not designed to model the semantic information of the unseen classes. To tackle this combined ZSL and OSR problem, we consider the case of “Zero-Shot Open-Set Recognition” (ZS-OSR), where a model is trained under the ZSL …


Diffusion-Based Negative Sampling On Graphs For Link Prediction, Yuan Fang, Yuan Fang May 2024

Research Collection School Of Computing and Information Systems

Link prediction is a fundamental task for graph analysis with important applications on the Web, such as social network analysis and recommendation systems. Modern graph link prediction methods often employ a contrastive approach to learn robust node representations, where negative sampling is pivotal. Typical negative sampling methods aim to retrieve hard examples based on either predefined heuristics or automatic adversarial approaches, which might be inflexible or difficult to control. Furthermore, in the context of link prediction, most previous methods sample negative nodes from existing substructures of the graph, missing out on potentially more optimal samples in the latent space. …


Environmental, Social, And Governance (ESG) And Artificial Intelligence In Finance: State-Of-The-Art And Research Takeaways, Tristan Lim Apr 2024

Research Collection School Of Computing and Information Systems

The rapidly growing research landscape in finance, encompassing environmental, social, and governance (ESG) topics and associated Artificial Intelligence (AI) applications, presents challenges for both new researchers and seasoned practitioners. This study aims to systematically map the research area, identify knowledge gaps, and examine potential research areas for researchers and practitioners. The investigation focuses on three primary research questions: the main research themes concerning ESG and AI in finance, the evolution of research intensity and interest in these areas, and the application and evolution of AI techniques specifically in research studies within the ESG and AI in finance domain. Eight archetypical …


Exploring The Potential Of ChatGPT In Automated Code Refinement: An Empirical Study, Qi Guo, Shangqing Liu, Junming Cao, Xiaohong Li, Xin Peng, Xiaofei Xie, Bihuan Chen Apr 2024

Research Collection School Of Computing and Information Systems

Code review is an essential activity for ensuring the quality and maintainability of software projects. However, it is a time-consuming and often error-prone task that can significantly impact the development process. Recently, ChatGPT, a cutting-edge language model, has demonstrated impressive performance in various natural language processing tasks, suggesting its potential to automate code review processes. However, it is still unclear how well ChatGPT performs in code review tasks. To fill this gap, in this paper, we conduct the first empirical study to understand the capabilities of ChatGPT in code review tasks, specifically focusing on automated code refinement based on given …


Conditional Neural Heuristic For Multiobjective Vehicle Routing Problems, Mingfeng Fan, Yaoxin Wu, Zhiguang Cao, Wen Song, Guillaume Sartoretti, Huan Liu, Guohua Wu Mar 2024

Research Collection School Of Computing and Information Systems

Existing neural heuristics for multiobjective vehicle routing problems (MOVRPs) are primarily conditioned on instance context and fail to appropriately exploit preference and problem size, which holds back their performance. To thoroughly unleash the potential, we propose a novel conditional neural heuristic (CNH) that fully leverages the instance context, preference, and size with an encoder–decoder structured policy network. Particularly, in our CNH, we design a dual-attention-based encoder to relate preferences and instance contexts, so as to better capture their joint effect on approximating the exact Pareto front (PF). We also design a size-aware decoder based on the sinusoidal encoding to explicitly …
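For readers unfamiliar with sinusoidal encoding, the sketch below shows the standard Transformer-style formulation applied to a single scalar such as a problem size. It is an assumption that the decoder in this work uses this exact form; the function name sinusoidal_encoding, the base of 10000, and the interleaved sine/cosine layout are illustrative choices, not details taken from the abstract.

```python
import math


def sinusoidal_encoding(value: float, dim: int) -> list:
    """Encode a scalar as `dim` interleaved sine/cosine features.

    Standard Transformer-style encoding, shown for illustration only.
    """
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    encoding = []
    for i in range(dim // 2):
        # Frequencies decay geometrically from 1 toward 1/10000.
        freq = 1.0 / (10000 ** (2 * i / dim))
        encoding.append(math.sin(value * freq))
        encoding.append(math.cos(value * freq))
    return encoding


# Encode a hypothetical problem size of 100 nodes as an 8-dimensional vector.
size_features = sinusoidal_encoding(100, 8)
```

Because every feature is a sine or cosine, the encoding stays bounded in [-1, 1] regardless of how large the encoded size is.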


Knowledge Generation For Zero-Shot Knowledge-Based VQA, Rui Cao, Jing Jiang Mar 2024

Research Collection School Of Computing and Information Systems

Previous solutions to knowledge-based visual question answering (K-VQA) retrieve knowledge from external knowledge bases and use supervised learning to train the K-VQA model. Recently, pre-trained LLMs have been used as both a knowledge source and a zero-shot QA model for K-VQA and have demonstrated promising results. However, these recent methods do not explicitly show the knowledge needed to answer the questions and thus lack interpretability. Inspired by recent work on knowledge generation from LLMs for text-based QA, in this work we propose and test a similar knowledge-generation-based K-VQA method, which first generates knowledge from an LLM and then incorporates the generated …


T-SciQ: Teaching Multimodal Chain-Of-Thought Reasoning Via Large Language Model Signals For Science Question Answering, Lei Wang, Yi Hu, Jiabang He, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen Mar 2024

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal scenarios, such as the science question answering task, by fine-tuning multimodal models with high-quality human-annotated CoT rationales. However, collecting high-quality CoT rationales is usually time-consuming and costly. Moreover, the annotated rationales are often inaccurate because essential external information is missing. To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with …


Monocular BEV Perception Of Road Scenes Via Front-To-Top View Projection, Wenxi Liu, Qi Li, Weixiang Yang, Jiaxin Cai, Yuanhong Yu, Yuexin Ma, Shengfeng He, Jia Pan Mar 2024

Research Collection School Of Computing and Information Systems

HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited due to expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To push the limits of the technology, we present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view given a front-view monocular image only. We propose a front-to-top view projection (FTVP) module, which takes the constraint of cycle consistency between views into account and makes full use of their correlation to strengthen …


Temporal Implicit Multimodal Networks For Investment And Risk Management, Meng Kiat Gary Ang, Ee-Peng Lim Mar 2024

Research Collection School Of Computing and Information Systems

Many deep learning works on financial time-series forecasting focus on predicting future prices/returns of individual assets with numerical price-related information for trading, and hence propose models designed for univariate, single-task, and/or unimodal settings. Forecasting for investment and risk management involves multiple tasks in multivariate settings: forecasts of expected returns and risks of assets in portfolios, and correlations between these assets. As different sources/types of time-series influence future returns, risks, and correlations of assets in different ways, it is also important to capture time-series from different modalities. Hence, this article addresses financial time-series forecasting for investment and risk management in a …


Public Acceptance Of Using Artificial Intelligence-Assisted Weight Management Apps In High-Income Southeast Asian Adults With Overweight And Obesity: A Cross-Sectional Study, Han Shi Jocelyn Chew, Palakorn Achananuparp, Palakorn Achananuparp, Nicholas W. S. Chew, Yip Han Chin, Yujia Gao, Bok Yan Jimmy So, Asim Shabbir, Ee-Peng Lim, Kee Yuan Ngiam Feb 2024

Research Collection School Of Computing and Information Systems

Introduction: With an increase in interest to incorporate artificial intelligence (AI) into weight management programs, we aimed to examine user perceptions of AI-based mobile apps for weight management in adults with overweight and obesity. Methods: 280 participants were recruited between May and November 2022. Participants completed a questionnaire on sociodemographic profiles, Unified Theory of Acceptance and Use of Technology 2 (UTAUT2), and Self-Regulation of Eating Behavior Questionnaire. Structural equation modeling was performed using R. Model fit was tested using maximum-likelihood generalized unweighted least squares. Associations between influencing factors were analyzed using correlation and linear regression. Results: 271 participant responses were …


GLOP: Learning Global Partition And Local Construction For Solving Large-Scale Routing Problems In Real-Time, Haoran Ye, Jiarui Wang, Helan Liang, Zhiguang Cao, Yong Li, Fanzhang Li Feb 2024

Research Collection School Of Computing and Information Systems

Recent end-to-end neural solvers have shown promise for small-scale routing problems but suffer from limited real-time scaling-up performance. This paper proposes GLOP (Global and Local Optimization Policies), a unified hierarchical framework that efficiently scales toward large-scale routing problems. GLOP partitions large routing problems into Travelling Salesman Problems (TSPs) and TSPs into Shortest Hamiltonian Path Problems. For the first time, we hybridize non-autoregressive neural heuristics for coarse-grained problem partitions and autoregressive neural heuristics for fine-grained route constructions, leveraging the scalability of the former and the meticulousness of the latter. Experimental results show that GLOP achieves competitive and state-of-the-art real-time performance …


Delving Into Multimodal Prompting For Fine-Grained Visual Classification, Xin Jiang, Hao Tang, Junyao Gao, Xiaoyu Du, Shengfeng He, Zechao Li Feb 2024

Research Collection School Of Computing and Information Systems

Fine-grained visual classification (FGVC) involves categorizing fine subdivisions within a broader category, which poses challenges due to subtle inter-class discrepancies and large intra-class variations. However, prevailing approaches primarily focus on uni-modal visual concepts. Recent advancements in pre-trained vision-language models have demonstrated remarkable performance in various high-level vision tasks, yet the applicability of such models to FGVC tasks remains uncertain. In this paper, we aim to fully exploit the capabilities of cross-modal description to tackle FGVC tasks and propose a novel multimodal prompting solution, denoted as MP-FGVC, based on the contrastive language-image pre-training (CLIP) model. Our MP-FGVC comprises a multimodal prompts …


Handling Long And Richly Constrained Tasks Through Constrained Hierarchical Reinforcement Learning, Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham Feb 2024

Research Collection School Of Computing and Information Systems

Safety in goal-directed Reinforcement Learning (RL) settings has typically been handled through constraints over trajectories, and such approaches have demonstrated good performance primarily in short-horizon tasks. In this paper, we are specifically interested in solving temporally extended decision-making problems in the presence of complex safety constraints, such as robots cleaning different areas in a house while avoiding slippery and unsafe areas (e.g., stairs) and retaining enough charge to move to a charging dock. Our key contribution is a (safety) Constrained Search with Hierarchical Reinforcement Learning (CoSHRL) mechanism that combines an upper level constrained search agent (which …


Leveraging LLMs And Generative Models For Interactive Known-Item Video Search, Zhixin Ma, Jiaxin Wu, Chong-Wah Ngo Feb 2024

Research Collection School Of Computing and Information Systems

While embedding techniques such as CLIP have considerably boosted search performance, user strategies in interactive video search still largely operate on a trial-and-error basis. Users are often required to manually adjust their queries and carefully inspect the search results, a process that relies heavily on the users’ capability and proficiency. Recent advancements in large language models (LLMs) and generative models offer promising avenues for enhancing interactivity in video retrieval and reducing the personal bias in query interpretation, particularly in the known-item search. Specifically, LLMs can expand and diversify the semantics of the queries while avoiding grammar mistakes or the language barrier. In …


Mitigating Fine-Grained Hallucination By Fine-Tuning Large Vision-Language Models With Caption Rewrites, Lei Wang, Jiabang He, Shenshen Li, Ning Liu, Ee-Peng Lim Feb 2024

Research Collection School Of Computing and Information Systems

Large language models (LLMs) have shown remarkable performance in natural language processing (NLP) tasks. To comprehend and execute diverse human instructions over image data, instruction-tuned large vision-language models (LVLMs) have been introduced. However, LVLMs may suffer from different types of object hallucinations, yet they are currently evaluated only for coarse-grained object hallucinations (i.e., generated objects non-existent in the input image). Fine-grained object attributes and behaviors that do not exist in the image may still be generated but are not measured by the current evaluation methods. In this paper, we thus focus on reducing fine-grained hallucinations of LVLMs. We propose ReCaption, a framework that consists …


Conversational Localization: Indoor Human Localization Through Intelligent Conversation, Sheshadri Smitha, Kotaro Hara Jan 2024

Research Collection School Of Computing and Information Systems

We propose a novel sensorless approach to indoor localization by leveraging natural language conversations with users, which we call conversational localization. To show the feasibility of conversational localization, we develop a proof-of-concept system that guides users to describe their surroundings in a chat and estimates their position based on the information they provide. We devised a modular architecture for our system with four modules. First, we construct an entity database with available image-based floor maps. Second, we enable the dynamic identification and scoring of information provided by users through our utterance processing module. Then, we implement a conversational agent that …


Active Discovering New Slots For Task-Oriented Conversation, Yuxia Wu, Tianhao Dai, Zhedong Zheng, Lizi Liao Jan 2024

Research Collection School Of Computing and Information Systems

Existing task-oriented conversational systems heavily rely on domain ontologies with pre-defined slots and candidate values. In practical settings, these prerequisites are hard to meet because of emerging user requirements and ever-changing scenarios. To mitigate these issues for better interaction performance, there are efforts working towards detecting out-of-vocabulary values or discovering new slots under unsupervised or semi-supervised learning paradigms. However, overemphasizing conversation data patterns alone leads these methods to yield noisy and arbitrary slot results. To facilitate the pragmatic utility, real-world systems tend to provide a strictly limited human labeling quota, which offers an authoritative way …


Affinity Uncertainty-Based Hard Negative Mining In Graph Contrastive Learning, Chaoxi Niu, Guansong Pang, Ling Chen Jan 2024

Research Collection School Of Computing and Information Systems

Hard negative mining has been shown to be effective in enhancing self-supervised contrastive learning (CL) on diverse data types, including graph CL (GCL). The existing hardness-aware CL methods typically treat negative instances that are most similar to the anchor instance as hard negatives, which helps improve the CL performance, especially on image data. However, on graph data this approach often fails to identify the hard negatives and instead leads to many false negatives. This is mainly because the learned graph representations are not sufficiently discriminative, owing to oversmooth representations and/or non-independent and identically distributed (non-i.i.d.) issues in graph data. To tackle this …
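The conventional similarity-based selection that this abstract describes, and then criticizes for graph data, can be sketched as follows. This illustrates the baseline heuristic only, not the affinity uncertainty-based method proposed in the paper; the function names and the choice of cosine similarity are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hardest_negatives(anchor, negatives, k):
    """Return indices of the k negatives most similar to the anchor.

    Baseline heuristic for illustration: similarity is used as a proxy
    for hardness, which is what the abstract says can produce false
    negatives on graph data.
    """
    ranked = sorted(
        range(len(negatives)),
        key=lambda i: cosine(anchor, negatives[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (made up): candidates 0 and 3 point most nearly along the anchor.
anchor = [1.0, 0.0]
negatives = [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0], [0.9, 0.5]]
hard = hardest_negatives(anchor, negatives, 2)
```

A candidate that is highly similar to the anchor may in fact share its class, so ranking by similarity alone cannot distinguish a genuinely hard negative from a false negative.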


Continual Learning, Fast And Slow, Quang Anh Pham, Chenghao Liu, Steven C. H. Hoi Jan 2024

Research Collection School Of Computing and Information Systems

According to the Complementary Learning Systems (CLS) theory (McClelland et al. 1995) in neuroscience, humans do effective continual learning through two complementary systems: a fast learning system centered on the hippocampus for rapid learning of specific, individual experiences; and a slow learning system located in the neocortex for the gradual acquisition of structured knowledge about the environment. Motivated by this theory, we propose DualNets (for Dual Networks), a general continual learning framework comprising a fast learning system for supervised learning of pattern-separated representation from specific tasks and a slow learning system for representation learning of task-agnostic general representation via …