Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Research Collection School Of Computing and Information Systems

Series

Articles 31 - 60 of 6415

Full-Text Articles in Physical Sciences and Mathematics

Harnessing The Advances Of MEDA To Optimize Multi-PUF For Enhancing IP Security Of Biochips, Chen Dong, Xiaodong Guo, Sihuang Lian, Yinan Yao, Zhenyi Chen, Yang Yang, Zhanghui Liu Mar 2024

Digital microfluidic biochips (DMFBs) have made significant strides in medical and biochemical applications in recent years. DMFBs based on the micro-electrode-dot-array (MEDA) architecture, as the next-generation DMFBs, aim to overcome drawbacks of conventional DMFBs, such as droplet size restriction, low accuracy, and poor sensing ability. Since the potential market value of MEDA biochips is vast, it is of paramount importance to explore approaches to protect the intellectual property (IP) of MEDA biochips during the development process. In this paper, we present bioMPUF, an IP authentication strategy based on multi-PUF for MEDA biochips, consisting of Delay …


Temporal Implicit Multimodal Networks For Investment And Risk Management, Meng Kiat Gary Ang, Ee-Peng Lim Mar 2024

Many deep learning works on financial time-series forecasting focus on predicting future prices/returns of individual assets with numerical price-related information for trading, and hence propose models designed for univariate, single-task, and/or unimodal settings. Forecasting for investment and risk management involves multiple tasks in multivariate settings: forecasts of expected returns and risks of assets in portfolios, and correlations between these assets. As different sources/types of time-series influence future returns, risks, and correlations of assets in different ways, it is also important to capture time-series from different modalities. Hence, this article addresses financial time-series forecasting for investment and risk management in a …


Community Similarity Based On User Profile Joins, Konstantinos Theocharidis, Hady Wirawan Lauw Mar 2024

Similarity joins on multidimensional data are crucial operators for recommendation purposes. The classic ε-join problem finds all pairs of points within ε distance to each other among two d-dimensional datasets. In this paper, we consider a novel and alternative version of ε-join named community similarity based on user profile joins (CSJ). The aim of the CSJ problem is, given two communities having a set of d-dimensional users, to find how similar the communities are by matching every single pair of users (a user can be matched with at most one other user) having an absolute difference of at most ε per …
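
As a rough illustration of the matching idea in this abstract, the Python sketch below pairs users from two communities when their profiles differ by at most a threshold in every dimension, with each user matched at most once. The function names, the greedy matching strategy, and the similarity score (matched fraction of the smaller community) are assumptions made for illustration. The abstract is truncated, so this should not be read as the CSJ algorithm from the paper.

```python
from typing import List, Sequence, Tuple

Profile = Sequence[float]


def within_eps(u: Profile, v: Profile, eps: float) -> bool:
    """Return True if u and v differ by at most eps in every dimension."""
    return len(u) == len(v) and all(abs(a - b) <= eps for a, b in zip(u, v))


def greedy_match(a: List[Profile], b: List[Profile], eps: float) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching (hypothetical; not guaranteed to be maximum)."""
    used_b = set()
    pairs = []
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            if j not in used_b and within_eps(u, v, eps):
                used_b.add(j)          # each user in b is matched at most once
                pairs.append((i, j))
                break                  # each user in a is matched at most once
    return pairs


def community_similarity(a: List[Profile], b: List[Profile], eps: float) -> float:
    """Assumed score: matched pairs divided by the size of the smaller community."""
    if not a or not b:
        return 0.0
    return len(greedy_match(a, b, eps)) / min(len(a), len(b))


community_a = [(1.0, 2.0), (5.0, 5.0)]
community_b = [(1.2, 2.1), (9.0, 9.0)]
print(greedy_match(community_a, community_b, eps=0.5))
print(community_similarity(community_a, community_b, eps=0.5))
```

A greedy pass is used only to keep the sketch short; a maximum bipartite matching would be needed to guarantee the largest possible number of matched pairs.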


Hypergraphs With Attention On Reviews For Explainable Recommendation, Theis E. Jendal, Trung Hoang Le, Hady Wirawan Lauw, Matteo Lissandrini, Peter Dolog, Katja Hose Mar 2024

Given a recommender system based on reviews, the challenges are how to effectively represent the review data and how to explain the produced recommendations. We propose a novel review-specific Hypergraph (HG) model, and further introduce a model-agnostic explainability module. The HG model captures high-order connections between users, items, aspects, and opinions while maintaining information about the review. The explainability module can use the HG model to explain a prediction generated by any model. We propose a path-restricted review-selection method biased by the user preference for item reviews and propose a novel explanation method based on a review graph. Experiments on …


T-SciQ: Teaching Multimodal Chain-Of-Thought Reasoning Via Large Language Model Signals For Science Question Answering, Lei Wang, Yi Hu, Jiabang He, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen Mar 2024

Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal scenarios, such as the science question answering task, by fine-tuning multimodal models with high-quality human-annotated CoT rationales. However, collecting high-quality CoT rationales is usually time-consuming and costly. Moreover, the annotated rationales are often inaccurate because essential external information is missing. To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with …


Non-Monotonic Generation Of Knowledge Paths For Context Understanding, Pei-Chi Lo, Ee-Peng Lim Mar 2024

Knowledge graphs can be used to enhance text search and access by augmenting textual content with relevant background knowledge. While many large knowledge graphs are available, using them to make semantic connections between entities mentioned in the textual content remains a difficult task. In this work, we therefore introduce contextual path generation (CPG), the task of generating knowledge paths, called contextual paths, that explain the semantic connections between entities mentioned in textual documents using a given knowledge graph. To perform the CPG task well, one has to address its three challenges, namely path relevance, incomplete knowledge graph, and …


StopGuess: A Framework For Public-Key Authenticated Encryption With Keyword Search, Tao Xiang, Zhongming Wang, Biwen Chen, Xiaoguo Li, Peng Wang, Fei Chen Mar 2024

Public key encryption with keyword search (PEKS) allows users to search on encrypted data without leaking the keyword information from the ciphertexts. But it does not preserve keyword privacy within the trapdoors, because an adversary (e.g., untrusted server) might launch inside keyword-guessing attacks (IKGA) to guess keywords from the trapdoors. In recent years, public key authenticated encryption with keyword search (PAEKS) has become a promising primitive to counter the IKGA. However, existing PAEKS schemes focus on the concrete construction of PAEKS, making them unable to support modular construction, intuitive proof, or flexible extension. In this paper, our proposal called “StopGuess” …


Screening Through A Broad Pool: Towards Better Diversity For Lexically Constrained Text Generation, Changsen Yuan, Heyan Huang, Yixin Cao, Qianwen Cao Mar 2024

Lexically constrained text generation (CTG) aims to generate text that contains given constrained keywords. However, the text diversity of existing models is still unsatisfactory. In this paper, we propose a lightweight dynamic refinement strategy that aims at increasing the randomness of inference to improve generation richness and diversity while maintaining a high level of fluidity and integrity. Our basic idea is to enlarge the number and length of candidate sentences in each iteration, and choose the best for subsequent refinement. On the one hand, different from previous works, which carefully insert one token between two words per action, we insert …


SigmaDiff: Semantics-Aware Deep Graph Matching For Pseudocode Diffing, Lian Gao, Yu Qu, Sheng Yu, Yue Duan, Heng Yin Mar 2024

Pseudocode diffing precisely locates similar parts and captures differences between the decompiled pseudocode of two given binaries. It is particularly useful in many security scenarios such as code plagiarism detection, lineage analysis, patch, vulnerability analysis, etc. However, existing pseudocode diffing and binary diffing tools suffer from low accuracy and poor scalability, since they either rely on manually-designed heuristics (e.g., Diaphora) or heavy computations like matrix factorization (e.g., DeepBinDiff). To address the limitations, in this paper, we propose a semantics-aware, deep neural network-based model called SIGMADIFF. SIGMADIFF first constructs IR (Intermediate Representation) level interprocedural program dependency graphs (IPDGs). Then it uses …


Fixing Your Own Smells: Adding A Mistake-Based Familiarization Step When Teaching Code Refactoring, Ivan Wei Han Tan, Christopher M. Poskitt Mar 2024

Programming problems can be solved in a multitude of functionally correct ways, but the quality of these solutions (e.g. readability, maintainability) can vary immensely. When code quality is poor, symptoms emerge in the form of 'code smells', which are specific negative characteristics (e.g. duplicate code) that can be resolved by applying refactoring patterns. Many undergraduate computing curricula train students on this software engineering practice, often doing so via exercises on unfamiliar instructor-provided code. Our observation, however, is that this makes it harder for novices to internalise refactoring as part of their own development practices. In this paper, we propose a …


DiTMoS: Delving Into Diverse Tiny-Model Selection On Microcontrollers, Xiao Ma, Shengfeng He, Hezhe Qiao, Dong Ma Mar 2024

Enabling efficient and accurate deep neural network (DNN) inference on microcontrollers is non-trivial due to the constrained on-chip resources. Current methodologies primarily focus on compressing larger models, yet at the expense of model accuracy. In this paper, we rethink the problem from the inverse perspective by constructing small/weak models directly and improving their accuracy. Thus, we introduce DiTMoS, a novel DNN training and inference framework with a selector-classifiers architecture, where the selector routes each input sample to the appropriate classifier for classification. DiTMoS is grounded on a key insight: a composition of weak models can exhibit high diversity and the …
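
The selector-classifiers idea described in this abstract, in which a selector routes each input to one of several small classifiers, can be sketched in a few lines of Python. Everything below (the function names, the toy selector rule, and the two toy classifiers) is hypothetical and only illustrates the routing pattern. It is not the DiTMoS training or inference procedure.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def route_and_classify(
    x: Features,
    selector: Callable[[Features], int],
    classifiers: List[Classifier],
) -> str:
    """Ask the selector which classifier to use, then run only that classifier."""
    index = selector(x)
    if not 0 <= index < len(classifiers):
        raise IndexError(f"selector returned invalid classifier index {index}")
    return classifiers[index](x)


# Toy stand-ins (assumptions): a real system would use small trained networks.
def toy_selector(x: Features) -> int:
    return 0 if sum(x) < 0 else 1


def negative_expert(x: Features) -> str:
    return "very_low" if min(x) < -1.0 else "low"


def positive_expert(x: Features) -> str:
    return "very_high" if max(x) > 1.0 else "high"


TOY_CLASSIFIERS = [negative_expert, positive_expert]

print(route_and_classify([-2.0, 0.5], toy_selector, TOY_CLASSIFIERS))
print(route_and_classify([0.2, 0.3], toy_selector, TOY_CLASSIFIERS))
```

Only the selected classifier runs for each input, which is the property that makes this pattern attractive on resource-constrained devices.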


T-Pickseer: Visual Analysis Of Taxi Pick-Up Point Selection Behavior, Shuxian Gu, Yemo Dai, Zezheng Feng, Yong Wang, Haipeng Zeng Mar 2024

Taxi drivers often spend much time navigating the streets looking for passengers, which leads to high vacancy rates and wasted resources. Empty taxi cruising remains a big concern for taxi companies. Analyzing the pick-up point selection behavior can solve this problem effectively, providing suggestions for taxi management and dispatch. Many studies have been devoted to analyzing and recommending hotspot regions of pick-up points, which can make it easier for drivers to pick up passengers. However, the selection of pick-up points is complex and affected by multiple factors, such as convenience and traffic management. Most existing approaches cannot produce satisfactory …


Monocular BEV Perception Of Road Scenes Via Front-To-Top View Projection, Wenxi Liu, Qi Li, Weixiang Yang, Jiaxin Cai, Yuanhong Yu, Yuexin Ma, Shengfeng He, Jia Pan Mar 2024

HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited due to expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To push the limits of the technology, we present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view given a front-view monocular image only. We propose a front-to-top view projection (FTVP) module, which takes the constraint of cycle consistency between views into account and makes full use of their correlation to strengthen …


Conditional Neural Heuristic For Multiobjective Vehicle Routing Problems, Mingfeng Fan, Yaoxin Wu, Zhiguang Cao, Wen Song, Guillaume Sartoretti, Huan Liu, Guohua Wu Mar 2024

Existing neural heuristics for multiobjective vehicle routing problems (MOVRPs) are primarily conditioned on instance context and fail to appropriately exploit preference and problem size, thus holding back performance. To thoroughly unleash the potential, we propose a novel conditional neural heuristic (CNH) that fully leverages the instance context, preference, and size with an encoder–decoder structured policy network. Particularly, in our CNH, we design a dual-attention-based encoder to relate preferences and instance contexts, so as to better capture their joint effect on approximating the exact Pareto front (PF). We also design a size-aware decoder based on the sinusoidal encoding to explicitly …


On The Effects Of Information Asymmetry In Digital Currency Trading, Kwansoo Kim, Robert John Kauffman Mar 2024

We report on two studies that examine how social sentiment influences information asymmetry in digital currency markets. We also assess whether cryptocurrency can be an investment vehicle, as opposed to only an instrument for asset speculation. Using a dataset on transactions from an exchange in South Korea and sentiment from Korean social media in 2018, we conducted a study of different trading behavior under two cryptocurrency trading market microstructures: a bid-ask spread dealer's market and a continuous trading buy-sell, immediate trade execution market. Our results highlight the impacts of positive and negative trader social sentiment valences on the effects of …


TranSiam: Aggregating Multi-Modal Visual Features With Locality For Medical Image Segmentation, Xuejian Li, Shiqiang Ma, Junhai Xu, Jijun Tang, Shengfeng He, Fei Guo Mar 2024

Automatic segmentation of medical images plays an important role in the diagnosis of diseases. On single-modal data, convolutional neural networks have demonstrated satisfactory performance. However, multi-modal data encompasses more information than single-modal data. Multi-modal data can be effectively used to improve the segmentation accuracy of regions of interest by analyzing both spatial and temporal information. In this study, we propose a dual-path segmentation model for multi-modal medical images, named TranSiam. Taking into account that there is a significant diversity between the different modalities, TranSiam employs two parallel CNNs to extract the features which are specific to …


Simulated Annealing With Reinforcement Learning For The Set Team Orienteering Problem With Time Windows, Vincent F. Yu, Nabila Y. Salsabila, Shih-W Lin, Aldy Gunawan Mar 2024

This research investigates the Set Team Orienteering Problem with Time Windows (STOPTW), a new variant of the well-known Team Orienteering Problem with Time Windows and Set Orienteering Problem. In the STOPTW, customers are grouped into clusters. Each cluster is associated with a profit attainable when a customer in the cluster is visited within the customer's time window. A Mixed Integer Linear Programming model is formulated for the STOPTW to maximize total profit while adhering to time window constraints. Since the STOPTW is an NP-hard problem, a Simulated Annealing with Reinforcement Learning (SARL) algorithm is developed. The proposed SARL incorporates the core concepts …
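
For readers unfamiliar with simulated annealing, the Python sketch below shows the generic technique for a maximization problem: improving moves are always accepted, and worsening moves are accepted with a probability that shrinks as the temperature cools. The function name, parameters, cooling schedule, and the toy objective are all assumptions for illustration. The reinforcement learning component of SARL and the STOPTW constraints are not modeled here.

```python
import math
import random


def simulated_annealing(initial, objective, neighbor,
                        t0=10.0, cooling=0.95, steps=500, seed=0):
    """Generic simulated annealing for maximization (illustrative sketch only)."""
    rng = random.Random(seed)
    current, current_score = initial, objective(initial)
    best, best_score = current, current_score
    temp = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(delta / temp), where delta < 0.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp = max(temp * cooling, 1e-9)  # geometric cooling with a floor
    return best, best_score


# Toy problem (assumption): maximize -(x - 3)^2 over the integers.
def toy_objective(x):
    return -(x - 3) ** 2


def toy_neighbor(x, rng):
    return x + rng.choice((-1, 1))


best_x, best_value = simulated_annealing(10, toy_objective, toy_neighbor)
print(best_x, best_value)
```

The best solution seen so far is tracked separately from the current solution, so the returned score can never be worse than that of the starting point.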


Handling Long And Richly Constrained Tasks Through Constrained Hierarchical Reinforcement Learning, Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham Feb 2024

Safety in goal-directed Reinforcement Learning (RL) settings has typically been handled through constraints over trajectories, an approach that has demonstrated good performance primarily in short-horizon tasks. In this paper, we are specifically interested in solving temporally extended decision-making problems, such as robots cleaning different areas in a house while avoiding slippery and unsafe areas (e.g., stairs) and retaining enough charge to move to a charging dock, in the presence of complex safety constraints. Our key contribution is a (safety) Constrained Search with Hierarchical Reinforcement Learning (CoSHRL) mechanism that combines an upper level constrained search agent (which …


Delving Into Multimodal Prompting For Fine-Grained Visual Classification, Xin Jiang, Hao Tang, Junyao Gao, Xiaoyu Du, Shengfeng He, Zechao Li Feb 2024

Fine-grained visual classification (FGVC) involves categorizing fine subdivisions within a broader category, which poses challenges due to subtle inter-class discrepancies and large intra-class variations. However, prevailing approaches primarily focus on uni-modal visual concepts. Recent advancements in pre-trained vision-language models have demonstrated remarkable performance in various high-level vision tasks, yet the applicability of such models to FGVC tasks remains uncertain. In this paper, we aim to fully exploit the capabilities of cross-modal description to tackle FGVC tasks and propose a novel multimodal prompting solution, denoted as MP-FGVC, based on the contrastive language-image pre-training (CLIP) model. Our MP-FGVC comprises a multimodal prompts …


M3SA: Multimodal Sentiment Analysis Based On Multi-Scale Feature Extraction And Multi-Task Learning, Changkai Lin, Hongju Cheng, Qiang Rao, Yang Yang Feb 2024

Sentiment analysis plays an indispensable part in human-computer interaction. Multimodal sentiment analysis can overcome the shortcomings of unimodal sentiment analysis by fusing multimodal data. However, how to extract improved feature representations and how to execute effective modality fusion are two crucial problems in multimodal sentiment analysis. Traditional work uses simple sub-models for feature extraction; these ignore features of different scales and fuse different modalities of data equally, making it easier to incorporate extraneous information and degrade analysis accuracy. In this paper, we propose a Multimodal Sentiment Analysis model based on Multi-scale feature extraction and Multi-task learning (M3SA). …


Reverse Multi-Choice Dialogue Commonsense Inference With Graph-Of-Thought, Li Zheng, Hao Fei, Fei Li, Bobo Li, Lizi Liao, Donghong Ji, Chong Teng Feb 2024

With the proliferation of dialogic data across the Internet, the Dialogue Commonsense Multi-choice Question Answering (DC-MCQ) task has emerged as a response to the challenge of comprehending user queries and intentions. Although prevailing methodologies exhibit effectiveness in addressing single-choice questions, they encounter difficulties in handling multi-choice queries due to the heightened intricacy and informational density. In this paper, inspired by the human cognitive process of progressively excluding options, we propose a three-step Reverse Exclusion Graph-of-Thought (ReX-GoT) framework, including Option Exclusion, Error Analysis, and Combine Information. Specifically, our ReX-GoT mimics human reasoning by gradually excluding irrelevant options and learning the reasons …


Imitate The Good And Avoid The Bad: An Incremental Approach To Safe Reinforcement Learning, Minh Huy Hoang, Mai Anh Tien, Pradeep Varakantham Feb 2024

A popular framework for enforcing safe actions in Reinforcement Learning (RL) is Constrained RL, where trajectory-based constraints on expected cost (or other cost measures) are employed to enforce safety and, more importantly, these constraints are enforced while maximizing expected reward. Most recent approaches for solving Constrained RL convert the trajectory-based cost constraint into a surrogate problem that can be solved using minor modifications to RL methods. A key drawback of such approaches is an over- or underestimation of the cost constraint at each state. Therefore, we provide an approach that does not modify the trajectory-based cost constraint …
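
The Constrained RL setting described in this abstract is commonly written as a constrained optimization problem over policies. The formulation below is the standard textbook form under assumed notation, which the abstract itself does not give: policy \pi, trajectory \tau, reward r, cost c, discount factor \gamma, and cost budget d.

```latex
\max_{\pi} \;
  \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{subject to} \quad
  \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \le d
```

The constraint is defined over whole trajectories, which is why a surrogate that assigns cost on a per-state basis can over- or underestimate it.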


CATNet: Cross-Modal Fusion For Audio-Visual Speech Recognition, Xingmei Wang, Jianchen Mi, Boquan Li, Yixu Zhao, Jiaxiang Meng Feb 2024

Automatic speech recognition (ASR) is a typical pattern recognition technology that converts human speech into text. With the aid of advanced deep learning models, the performance of speech recognition has significantly improved. In particular, emerging Audio–Visual Speech Recognition (AVSR) methods achieve satisfactory performance by combining audio-modal and visual-modal information. However, various complex environments, especially noisy ones, limit the effectiveness of existing methods. To address the noise problem, in this paper we propose a novel cross-modal audio–visual speech recognition model, named CATNet. First, we devise a cross-modal bidirectional fusion model to analyze the close relationship between audio and visual modalities. Second, …


When Evolutionary Computation Meets Privacy, Bowen Zhao, Wei-Neng Chen, Xiaoguo Li, Ximeng Liu, Qingqi Pei, Jun Zhang Feb 2024

Recently, evolutionary computation (EC) has experienced significant advancements due to the integration of machine learning, distributed computing, and big data technologies. These developments have led to new research avenues in EC, such as distributed EC and surrogate-assisted EC. While these advancements have greatly enhanced the performance and applicability of EC, they have also raised concerns regarding privacy leakages, specifically the disclosure of optimal results and surrogate models. Consequently, combining evolutionary computation with privacy protection is increasingly necessary. However, a comprehensive exploration of privacy concerns in evolutionary computation is currently lacking, particularly in terms of identifying the object, …


HGPrompt: Bridging Homogeneous And Heterogeneous Graphs For Few-Shot Prompt Learning, Xingtong Yu, Yuan Fang, Zemin Liu, Xinming Zhang Feb 2024

Graph neural networks (GNNs) and heterogeneous graph neural networks (HGNNs) are prominent techniques for homogeneous and heterogeneous graph representation learning, yet their performance in an end-to-end supervised framework greatly depends on the availability of task-specific supervision. To reduce the labeling cost, pre-training on self-supervised pretext tasks has become a popular paradigm, but there is often a gap between the pre-trained model and downstream tasks, stemming from the divergence in their objectives. To bridge the gap, prompt learning has risen as a promising direction, especially in few-shot settings, without the need to fully fine-tune the pre-trained model. While there has been …


Simple Image-Level Classification Improves Open-Vocabulary Object Detection, Ruohuan Fang, Guansong Pang, Xiao Bai Feb 2024

Open-Vocabulary Object Detection (OVOD) aims to detect novel objects beyond a given set of base categories on which the detection model is trained. Recent OVOD methods focus on adapting the image-level pre-trained vision-language models (VLMs), such as CLIP, to a region-level object detection task via, e.g., region-level knowledge distillation, regional prompt learning, or region-text pre-training, to expand the detection vocabulary. These methods have demonstrated remarkable performance in recognizing regional visual concepts, but they are weak in exploiting the VLMs' powerful global scene understanding ability learned from the billion-scale image-level text descriptions. This limits their capability in detecting hard objects of …


Leveraging LLMs And Generative Models For Interactive Known-Item Video Search, Zhixin Ma, Jiaxin Wu, Chong-Wah Ngo Feb 2024

While embedding techniques such as CLIP have considerably boosted search performance, user strategies in interactive video search still largely operate on a trial-and-error basis. Users are often required to manually adjust their queries and carefully inspect the search results, a process that relies heavily on the users’ capability and proficiency. Recent advancements in large language models (LLMs) and generative models offer promising avenues for enhancing interactivity in video retrieval and reducing the personal bias in query interpretation, particularly in the known-item search. Specifically, LLMs can expand and diversify the semantics of the queries while avoiding grammar mistakes or the language barrier. In …


FoodMask: Real-Time Food Instance Counting, Segmentation And Recognition, Huu-Thanh Nguyen, Yu Cao, Chong-Wah Ngo, Wing-Kwong Chan Feb 2024

Food computing has long been studied and deployed in several applications. Understanding a food image at the instance level, including recognition, counting and segmentation, is essential to quantifying nutrition and calorie consumption. Nevertheless, existing techniques are limited to either category-specific instance detection, which does not reflect precisely the instance size at the pixel level, or category-agnostic instance segmentation, which is insufficient for dish recognition. This paper presents a compact and fast multi-task network, namely FoodMask, for clustering-based food instance counting, segmentation and recognition. The network learns a semantic space simultaneously encoding food category distribution and instance height on a per-pixel basis. …


Machine Learning For Refining Knowledge Graphs: A Survey, Budhitama Subagdja, D. Shanthoshigaa, Zhaoxia Wang, Ah-Hwee Tan Feb 2024

Knowledge graph (KG) refinement refers to the process of filling in missing information, removing redundancies, and resolving inconsistencies in knowledge graphs. With the growing popularity of KG in various domains, many techniques involving machine learning have been applied, but there is no survey dedicated to machine learning-based KG refinement yet. Based on a novel framework following the KG refinement process, this paper presents a survey of machine learning approaches to KG refinement according to the kind of operations in KG refinement, the training datasets, mode of learning, and process multiplicity. Furthermore, the survey aims to provide broad practical insights into …


GLOP: Learning Global Partition And Local Construction For Solving Large-Scale Routing Problems In Real-Time, Haoran Ye, Jiarui Wang, Helan Liang, Zhiguang Cao, Yong Li, Fanzhang Li Feb 2024

Recent end-to-end neural solvers have shown promise for small-scale routing problems but suffer from limited real-time scaling-up performance. This paper proposes GLOP (Global and Local Optimization Policies), a unified hierarchical framework that efficiently scales toward large-scale routing problems. GLOP partitions large routing problems into Travelling Salesman Problems (TSPs) and TSPs into Shortest Hamiltonian Path Problems. For the first time, we hybridize non-autoregressive neural heuristics for coarse-grained problem partitions and autoregressive neural heuristics for fine-grained route constructions, leveraging the scalability of the former and the meticulousness of the latter. Experimental results show that GLOP achieves competitive and state-of-the-art real-time performance …