
Articles 1 - 22 of 22

Full-Text Articles in Physical Sciences and Mathematics

A Black-Box Attack On Code Models Via Representation Nearest Neighbor Search, Jie Zhang, Wei Ma, Qiang Hu, Shangqing Liu, Xiaofei Xie, Yves Le Traon, Yang Liu Dec 2023

Research Collection School Of Computing and Information Systems

Existing methods for generating adversarial code examples face several challenges: limited availability of substitute variables, high verification costs for these substitutes, and the creation of adversarial samples with noticeable perturbations. To address these concerns, our proposed approach, RNNS, uses a search seed based on historical attacks to find potential adversarial substitutes. Rather than using the discrete substitutes directly, RNNS maps them to a continuous vector space with a pre-trained variable-name encoder. Based on this vector representation, RNNS predicts and selects better substitutes for attacks. We evaluated the performance of RNNS across six coding tasks encompassing three programming languages: Java, …
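
To make the search step above concrete, the sketch below ranks candidate substitute variable names by cosine similarity to a search seed in a shared embedding space. It is a minimal illustration assuming a generic encode(name) vector function; the function names and the toy encoder are placeholders, not the RNNS implementation.

```python
import numpy as np

def nearest_neighbor_substitutes(seed_vec, candidates, encode, k=5):
    """Rank candidate variable names by cosine similarity to a search seed.

    seed_vec   : vector summarising previously successful substitutes
    candidates : list of candidate variable names (strings)
    encode     : callable mapping a name to a fixed-size vector
                 (stand-in for the pre-trained variable-name encoder)
    """
    cand = np.stack([encode(c) for c in candidates])
    cand = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    seed = seed_vec / np.linalg.norm(seed_vec)
    sims = cand @ seed                       # cosine similarity to the seed
    top = np.argsort(-sims)[:k]              # k most similar candidates
    return [(candidates[i], float(sims[i])) for i in top]

def toy_encode(name):
    # Deterministic stand-in for a learned encoder: bytes -> fixed-length vector.
    raw = name.encode()[:16].ljust(16, b"\0")
    return np.frombuffer(raw, dtype=np.uint8).astype(float)

print(nearest_neighbor_substitutes(toy_encode("count"), ["cnt", "total", "idx"], toy_encode, k=2))
```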


Benchmarking Foundation Models With Language-Model-As-An-Examiner, Yushi Bai, Jiahao Ying, Yixin Cao, Xin Lv, Yuze He, Xiaozhi Wang, Jifan Yu, Kaisheng Zeng, Yijia Xiao, Haozhe Lyu, Jiayin Zhang, Juanzi Li, Lei Hou Dec 2023

Research Collection School Of Computing and Information Systems

Numerous benchmarks have been established to assess the performance of foundation models on open-ended question answering, which serves as a comprehensive test of a model’s ability to understand and generate language in a manner similar to humans. Most of these works focus on proposing new datasets; however, we see two main issues within previous benchmarking pipelines, namely testing leakage and evaluation automation. In this paper, we propose a novel benchmarking framework, Language-Model-as-an-Examiner, where the LM serves as a knowledgeable examiner that formulates questions based on its knowledge and evaluates responses in a reference-free manner. Our framework allows for effortless extensibility …
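
As a rough illustration of the examiner loop described above, the sketch below has one LM pose questions and grade answers reference-free. call_llm(prompt) -> str is a hypothetical helper standing in for any chat-completion client, and the prompts are illustrative rather than the paper's.

```python
def examine(model_under_test, call_llm, topic, n_questions=3):
    """Average 1-5 score of a model's answers, graded by an examiner LM."""
    scores = []
    for _ in range(n_questions):
        # 1) The examiner LM formulates a question from its own knowledge.
        question = call_llm(f"Ask one factual, open-ended question about {topic}.")
        # 2) The model under test answers.
        answer = model_under_test(question)
        # 3) The examiner grades the answer reference-free.
        verdict = call_llm(
            f"Question: {question}\nAnswer: {answer}\n"
            "Rate correctness and completeness from 1 to 5. Reply with the number only."
        )
        scores.append(int(verdict.strip()[0]))
    return sum(scores) / len(scores)
```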


Molca: Molecular Graph-Language Modeling With Cross-Modal Projector And Uni-Modal Adapter, Zhiyuan Liu, Sihang Li, Yanchen Luo, Hao Fei, Yixin Cao, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua Dec 2023

Research Collection School Of Computing and Information Systems

Language Models (LMs) have demonstrated impressive molecule understanding ability on various 1D text-related tasks. However, they inherently lack 2D graph perception — a critical ability of human professionals in comprehending molecules’ topological structures. To bridge this gap, we propose MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter. MolCA enables an LM (i.e., Galactica) to understand both text- and graph-based molecular content via the cross-modal projector. Specifically, the cross-modal projector is implemented as a Q-Former to connect a graph encoder’s representation space and an LM’s text space. Further, MolCA employs a uni-modal adapter (i.e., LoRA) for the LM’s efficient …


Llm-Adapters: An Adapter Family For Parameter-Efficient Fine-Tuning Of Large Language Models, Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, Roy Ka-Wei Lee Dec 2023

Research Collection School Of Computing and Information Systems

The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by fine-tuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca). Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is undoubtedly one of the most attractive topics, as it only requires fine-tuning a few external parameters instead of the entire LLM while achieving comparable or even better performance. To enable further research on PEFT methods for LLMs, this paper presents LLM-Adapters, an easy-to-use framework that integrates various adapters into LLMs and …
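
The snippet below sketches the general idea behind adapter-based PEFT: a small bottleneck module is added residually inside a frozen LLM and only its parameters are trained. It illustrates the technique in PyTorch and is not the API of the LLM-Adapters framework itself.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Generic bottleneck adapter: down-project, non-linearity, up-project,
    added residually. Only these few parameters are trained while the host
    LLM stays frozen (illustrative of adapter-based PEFT in general)."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Typical usage: freeze the backbone and train only the adapter parameters, e.g.
# for p in backbone.parameters(): p.requires_grad = False
```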


Examining The Inter-Consistency Of Large Language Models: An In-Depth Analysis Via Debate, Kai Xiong, Xiao Ding, Yixin Cao, Ting Liu, Bing Qin Dec 2023

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs) have shown impressive capabilities in various applications, but they still face various inconsistency issues. Existing works primarily focus on the inconsistency issues within a single LLM, while we complementarily explore the inter-consistency among multiple LLMs for collaboration. To examine whether LLMs can collaborate effectively to achieve a consensus for a shared goal, we focus on commonsense reasoning and introduce a formal debate framework (FORD) to conduct a three-stage debate among LLMs aligned with real-world scenarios: fair debate, mismatched debate, and roundtable debate. Through extensive experiments on various datasets, LLMs can effectively collaborate to reach a consensus …
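
A bare-bones sketch of such a debate loop is shown below: two LLM debaters exchange arguments over several rounds and a judge extracts the consensus. agent_a, agent_b, and judge are hypothetical prompt -> str callables, and the staging is simplified relative to FORD's three debate settings.

```python
def debate(question, agent_a, agent_b, judge, rounds=3):
    """Run a simple multi-round debate and return the judged consensus answer."""
    transcript = [f"Question: {question}"]
    for _ in range(rounds):
        for name, agent in (("A", agent_a), ("B", agent_b)):
            turn = agent(
                "\n".join(transcript)
                + f"\nAgent {name}, state your answer and briefly rebut the other side."
            )
            transcript.append(f"Agent {name}: {turn}")
    # A judge (possibly a third LLM, as in a roundtable setting) summarises the outcome.
    return judge("\n".join(transcript) + "\nWhat final answer do the debaters agree on?")
```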


Large Language Model Is Not A Good Few-Shot Information Extractor, But A Good Reranker For Hard Samples!, Yubo Ma, Yixin Cao, Yongchin Hong, Aixin Sun Dec 2023

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs) have made remarkable strides in various tasks. However, whether they are competitive few-shot solvers for information extraction (IE) tasks and surpass fine-tuned small Pre-trained Language Models (SLMs) remains an open problem. This paper aims to provide a thorough answer to this problem, and moreover, to explore an approach towards effective and economical IE systems that combine the strengths of LLMs and SLMs. Through extensive experiments on nine datasets across four IE tasks, we show that LLMs are not effective few-shot information extractors in general, given their unsatisfactory performance in most settings and the high latency and …
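
The sketch below captures the kind of combination explored here: keep the fine-tuned SLM's confident predictions and route only low-confidence (hard) samples to an LLM for reranking. slm_predict and llm_rerank are hypothetical stand-ins, and the confidence threshold is illustrative.

```python
def extract(samples, slm_predict, llm_rerank, confidence_threshold=0.9):
    """Route easy samples to the SLM and hard samples to an LLM reranker.

    slm_predict(text) -> (label, confidence, top_candidates)
    llm_rerank(text, candidates) -> label
    """
    results = []
    for text in samples:
        label, confidence, top_candidates = slm_predict(text)
        if confidence >= confidence_threshold:
            results.append((text, label))                 # easy sample: trust the SLM
        else:
            results.append((text, llm_rerank(text, top_candidates)))  # hard sample
    return results
```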


Disentangling Transformer Language Models As Superposed Topic Models, Jia Peng Lim, Hady Wirawan Lauw Dec 2023

Research Collection School Of Computing and Information Systems

Topic Modelling is an established research area where the quality of a given topic is measured using coherence metrics. Often, we infer topics from Neural Topic Models (NTM) by interpreting their decoder weights, consisting of top-activated words projected from individual neurons. Transformer-based Language Models (TLM) similarly contain decoder weights. However, due to their hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple …
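
As a simplified picture of reading topics off decoder weights, the sketch below projects each neuron's weight vector onto the vocabulary and keeps its top-activated words. It omits the paper's search and disentanglement procedure and assumes a generic (n_neurons, vocab_size) weight matrix.

```python
import numpy as np

def topics_from_decoder_weights(W_dec, id_to_token, top_k=10):
    """Interpret each neuron as a 'topic' by reading off its top-activated words.

    W_dec       : (n_neurons, vocab_size) array of decoder/unembedding weights
    id_to_token : list mapping vocabulary ids to word strings
    """
    topics = []
    for row in W_dec:
        top_ids = np.argsort(-row)[:top_k]        # highest-weight vocabulary entries
        topics.append([id_to_token[i] for i in top_ids])
    return topics
```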


A Comprehensive Evaluation Of Large Language Models On Legal Judgment Prediction, Ruihao Shui, Yixin Cao, Xiang Wang, Tat-Seng Chua Dec 2023

Research Collection School Of Computing and Information Systems

Large language models (LLMs) have demonstrated great potential for domain-specific applications, such as the law domain. However, recent disputes over GPT-4’s law evaluation raise questions concerning LLMs’ performance in real-world legal tasks. To systematically investigate their competency in the law, we design practical baseline solutions based on LLMs and test them on the task of legal judgment prediction. In our solutions, LLMs can work alone to answer open questions or coordinate with an information retrieval (IR) system to learn from similar cases or solve simplified multi-choice questions. We show that similar cases and multi-choice options, namely label candidates, included in prompts …
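
A minimal sketch of the retrieval-augmented, multi-choice prompting setup might look as follows, assuming hypothetical retrieve(query, k) and call_llm(prompt) helpers; the prompt wording is illustrative rather than the paper's.

```python
def predict_judgment(case_facts, label_candidates, retrieve, call_llm, k=3):
    """Prompt an LLM with retrieved similar cases and label candidates."""
    similar = retrieve(case_facts, k)                      # similar past cases from IR
    options = "\n".join(f"({chr(65 + i)}) {lbl}" for i, lbl in enumerate(label_candidates))
    prompt = (
        "You are predicting the judgment of a legal case.\n"
        "Similar cases:\n" + "\n---\n".join(similar) + "\n\n"
        "Facts of the current case:\n" + case_facts + "\n\n"
        "Choose the most likely judgment:\n" + options + "\nAnswer with one letter."
    )
    return call_llm(prompt).strip()
```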


Wsdms: Debunk Fake News Via Weakly Supervised Detection Of Misinforming Sentences With Contextualized Social Wisdom, Ruichao Yang, Wei Gao, Jing Ma, Hongzhan Lin, Zhiwei Yang Dec 2023

Research Collection School Of Computing and Information Systems

In recent years, we have witnessed an explosion of false and unconfirmed information (i.e., rumors) that goes viral on social media and shocks the public. Rumors can trigger versatile, mostly controversial stance expressions among social media users. Rumor verification and stance detection are different yet relevant tasks. Fake news debunking primarily focuses on determining the truthfulness of news articles, which oversimplifies the issue, as fake news often combines elements of both truth and falsehood. Thus, it becomes crucial to identify specific instances of misinformation within the articles. In this research, we investigate a novel task in the field of fake news …


Towards Llm-Based Fact Verification On News Claims With A Hierarchical Step-By-Step Prompting Method, Xuan Zhang, Wei Gao Nov 2023

Research Collection School Of Computing and Information Systems

While large pre-trained language models (LLMs) have shown their impressive capabilities in various NLP tasks, they are still underexplored in the misinformation domain. In this paper, we examine LLMs with in-context learning (ICL) for news claim verification, and find that with only 4-shot demonstration examples, the performance of several prompting methods can be comparable with previous supervised models. To further boost performance, we introduce a Hierarchical Step-by-Step (HiSS) prompting method which directs LLMs to separate a claim into several subclaims and then verify each of them via multiple question-answering steps progressively. Experiment results on two public misinformation datasets show that …
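
A condensed sketch of the hierarchical step-by-step idea is given below: the claim is split into sub-claims, each sub-claim is verified through a question-answering step, and the verdicts are aggregated. call_llm(prompt) -> str is a hypothetical helper, and the prompts are far simpler than those used in HiSS.

```python
def verify_claim(claim, call_llm, max_subclaims=5):
    """Decompose a claim into sub-claims and verify each one step by step."""
    subclaims = call_llm(
        f"Claim: {claim}\nSplit this claim into at most {max_subclaims} "
        "independent sub-claims, one per line."
    ).splitlines()
    verdicts = []
    for sub in filter(None, (s.strip() for s in subclaims)):
        question = call_llm(f"What question must be answered to verify: '{sub}'?")
        answer = call_llm(f"Answer concisely: {question}")
        verdicts.append(call_llm(
            f"Sub-claim: {sub}\nEvidence: {answer}\nIs the sub-claim supported? yes/no."
        ).strip().lower().startswith("yes"))
    return "supported" if verdicts and all(verdicts) else "refuted"
```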


Hallucination Detection: Robustly Discerning Reliable Answers In Large Language Models, Yuyuan Chen, Qiang Fu, Yichen Yuan, Zhihao Wen, Ge Fan, Dayiheng Liu, Dongmei Zhang, Zhixu Li, Yanghua Xiao Oct 2023

Research Collection School Of Computing and Information Systems

Large language models (LLMs) have gained widespread adoption in various natural language processing tasks, including question answering and dialogue systems. However, a major drawback of LLMs is the issue of hallucination, where they generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences. In this paper, we propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers. RelD is trained on the constructed RelQA, a bilingual question-answering dialogue dataset along with answers generated by LLMs and a comprehensive set of metrics. Our experimental results demonstrate that the proposed RelD successfully detects …


Are We Ready To Embrace Generative Ai For Software Q&A?, Bowen Xu, Thanh-Dat Nguyen, Thanh Le Cong, Thong Hoang, Jiakun Liu, Kisub Kim, Chen Gong, Changan Niu, Chenyu Wang, Xuan-Bach Dinh Le, David Lo Sep 2023

Research Collection School Of Computing and Information Systems

Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques. ChatGPT was banned by Stack Overflow only 6 days after its release. The main reason given by Stack Overflow officials is that the answers generated by ChatGPT are of low quality. To verify this, we conduct a comparative evaluation of human-written and ChatGPT-generated answers. Our methodology employs both automatic comparison and a manual study. Our results suggest that human-written and ChatGPT-generated answers are semantically similar; however, human-written answers outperform ChatGPT-generated ones consistently across multiple aspects, …
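
For the automatic comparison, semantic similarity between answer pairs can be estimated with sentence embeddings; the sketch below shows one such cosine-similarity check, with embed(text) standing in for whatever embedding model is used (a hypothetical name, not the study's exact pipeline).

```python
import numpy as np

def semantic_similarity(human_answer, chatgpt_answer, embed):
    """Cosine similarity between two answers, given any embed(text) -> 1-D vector."""
    a, b = embed(human_answer), embed(chatgpt_answer)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```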


K-St: A Formal Executable Semantics Of The Structured Text Language For Plcs, Kun Wang, Jingyi Wang, Christopher M. Poskitt, Xiangxiang Chen, Jun Sun, Peng Cheng Sep 2023

Research Collection School Of Computing and Information Systems

Programmable Logic Controllers (PLCs) are responsible for automating process control in many industrial systems (e.g. in manufacturing and public infrastructure), and thus it is critical to ensure that they operate correctly and safely. The majority of PLCs are programmed in languages such as Structured Text (ST). However, a lack of formal semantics makes it difficult to ascertain the correctness of their translators and compilers, which vary from vendor to vendor. In this work, we develop K-ST, a formal executable semantics for ST in the K framework. Defined with respect to the IEC 61131-3 standard and PLC vendor manuals, K-ST is a high-level …


Decompiling X86 Deep Neural Network Executables, Zhibo Liu, Yuanyuan Yuan, Shuai Wang, Xiaofei Xie, Lei Ma Aug 2023

Research Collection School Of Computing and Information Systems

Due to their widespread use on heterogeneous hardware devices, deep learning (DL) models are compiled into executables by DL compilers to fully leverage low-level hardware primitives. This approach allows DL computations to be undertaken at low cost across a variety of computing platforms, including CPUs, GPUs, and various hardware accelerators. We present BTD (Bin to DNN), a decompiler for deep neural network (DNN) executables. BTD takes DNN executables and outputs full model specifications, including types of DNN operators, network topology, dimensions, and parameters that are (nearly) identical to those of the input models. BTD delivers a practical framework to process …


Machine-Learning Approach To Automated Doubt Identification On Stack Overflow Comments To Guide Programming Learners, Tianhao Chen, Eng Lieh Ouh, Kar Way Tan, Siaw Ling Lo Jul 2023

Research Collection School Of Computing and Information Systems

Stack Overflow is a popular Q&A platform for developers to find solutions to programming problems. However, due to the varying quality of user-generated answers, there is a need for ways to help users find high-quality answers. While Stack Overflow's community-based approach can be effective, important technical aspects of the answer need to be captured, and users’ comments might contain doubts regarding these aspects. In this paper, we showed the feasibility of using a machine learning model to identify doubts and conducted a data analysis. We found that highly reputed users tend to raise more doubts; most answers have doubt in the …


Chatgpt, Can You Generate Solutions For My Coding Exercises? An Evaluation On Its Effectiveness In An Undergraduate Java Programming Course, Eng Lieh Ouh, Benjamin Gan, Kyong Jin Shim, Swavek Wlodkowski Jul 2023

Research Collection School Of Computing and Information Systems

In this study, we assess the efficacy of employing the ChatGPT language model to generate solutions for coding exercises within an undergraduate Java programming course. ChatGPT, a large-scale, deep learning-driven natural language processing model, is capable of producing programming code based on textual input. Our evaluation involves analyzing ChatGPT-generated solutions for 80 diverse programming exercises and comparing them to the correct solutions. Our findings indicate that ChatGPT accurately generates Java programming solutions, which are characterized by high readability and well-structured organization. Additionally, the model can produce alternative, memory-efficient solutions. However, as a natural language processing model, ChatGPT struggles with coding …


Binalign: Alignment Padding Based Compiler Provenance Recovery, Maliha Ismail, Yan Lin, Donggyun Han, Debin Gao Jul 2023

Research Collection School Of Computing and Information Systems

Compiler provenance is significant in investigating the source-level indicators of binary code, such as the development environment, source compiler, and optimization settings. Not only does compiler provenance analysis have important security applications in malware and vulnerability analysis, but it is also very challenging to extract useful artifacts from a binary when high-level language constructs are missing. Previous works applied machine-learning techniques to predict the source compiler of binaries. However, most of this work targets binaries compiled on the Linux operating system. We highlight the importance and need to explore Windows compilers and the complicated binaries compiled on the latest versions of these …


Safe Mdp Planning By Learning Temporal Patterns Of Undesirable Trajectories And Averting Negative Side Effects, Siow Meng Low, Akshat Kumar, Scott Sanner Jul 2023

Research Collection School Of Computing and Information Systems

In safe MDP planning, a cost function based on the current state and action is often used to specify safety aspects. In the real world, the state representation used may lack sufficient fidelity to specify such safety constraints. Operating based on an incomplete model can often produce unintended negative side effects (NSEs). To address these challenges, first, we associate safety signals with state-action trajectories (rather than just the immediate state-action pair). This makes our safety model highly general. We also assume categorical safety labels are given for different trajectories, rather than a numerical cost function, which is harder to specify by the …
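
One way such categorical trajectory labels could feed a planner is sketched below: a learned classifier scores whole state-action trajectories and candidate plans predicted to be unsafe are discarded. The classifier interface and labels are hypothetical stand-ins, not the paper's method.

```python
def filter_safe_plans(candidate_plans, trajectory_classifier, unsafe_labels=("unsafe",)):
    """Keep only candidate plans whose whole trajectory is predicted safe.

    candidate_plans       : list of trajectories, each [(state, action), ...]
    trajectory_classifier : callable trajectory -> categorical label, e.g. "safe"/"unsafe"
    """
    safe = []
    for plan in candidate_plans:
        label = trajectory_classifier(plan)
        if label not in unsafe_labels:
            safe.append(plan)
    return safe
```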


Plan-And-Solve Prompting: Improving Zero-Shot Chain-Of-Thought Reasoning By Large Language Models, Lei Wang, Wanyu Xu, Yihuai Lan, Zhiqiang Hu, Yunshi Lan, Roy Ka-Wei Lee, Ee-Peng Lim Jul 2023

Research Collection School Of Computing and Information Systems

Large language models (LLMs) have recently been shown to deliver impressive performance in various NLP tasks. To tackle multi-step reasoning tasks, few-shot chain-of-thought (CoT) prompting includes a few manually crafted step-by-step reasoning demonstrations which enable LLMs to explicitly generate reasoning steps and improve their reasoning task accuracy. To eliminate the manual effort, Zero-shot-CoT concatenates the target problem statement with “Let’s think step by step” as an input prompt to LLMs. Despite the success of Zero-shot-CoT, it still suffers from three pitfalls: calculation errors, missing-step errors, and semantic misunderstanding errors. To address the missing-step errors, we propose Plan-and-Solve (PS) prompting. It …
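
The core of zero-shot Plan-and-Solve prompting is a trigger sentence that asks the model to devise a plan before executing it. The sketch below paraphrases that trigger and uses a hypothetical call_llm(prompt) -> str client; the answer-extraction pass is also simplified relative to the paper.

```python
# Paraphrased Plan-and-Solve trigger: plan first, then carry out the plan step by step.
PS_TRIGGER = ("Let's first understand the problem and devise a plan to solve it. "
              "Then, let's carry out the plan and solve the problem step by step.")

def plan_and_solve(problem, call_llm):
    """Zero-shot PS prompting: one pass for reasoning, one pass to extract the answer."""
    reasoning = call_llm(f"Q: {problem}\nA: {PS_TRIGGER}")
    return call_llm(f"{reasoning}\nTherefore, the answer is")
```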


Dynamic Police Patrol Scheduling With Multi-Agent Reinforcement Learning, Songhan Wong, Waldy Joe, Hoong Chuin Lau Jun 2023

Research Collection School Of Computing and Information Systems

Effective police patrol scheduling is essential in projecting police presence and ensuring readiness in responding to unexpected events in urban environments. However, scheduling patrols can be a challenging task as it requires balancing two conflicting objectives, namely projecting presence (proactive patrol) and incident response (reactive patrol). This task is made even more challenging by the fact that patrol schedules do not remain static, as dynamic incidents can disrupt the existing schedules. In this paper, we propose a solution to this problem using Multi-Agent Reinforcement Learning (MARL) to address the Dynamic Bi-objective Police Patrol Dispatching and Rescheduling Problem …


Code Will Tell: Visual Identification Of Ponzi Schemes On Ethereum, Xiaolin Wen, Kim Siang Yeo, Yong Wang, Ling Cheng, Feida Zhu, Min Zhu Apr 2023

Research Collection School Of Computing and Information Systems

Ethereum has become a popular blockchain platform whose smart contracts attract many investors. Due to the decentralization and anonymity of Ethereum, Ponzi schemes have been easily deployed and have caused significant losses to investors. However, there are still no explainable and effective methods to help investors easily identify Ponzi schemes and validate whether a smart contract is actually a Ponzi scheme. To fill the research gap, we propose PonziLens, a novel visualization approach to help investors achieve early identification of Ponzi schemes by investigating the operation codes of smart contracts. Specifically, we conduct symbolic execution of opcode and extract the control flow …


Contrastive Learning Approach To Word-In-Context Task For Low-Resource Languages, Pei-Chi Lo, Yang-Yin Lee, Hsien-Hao Chen, Agus Trisnajaya Kwee, Ee-Peng Lim Feb 2023

Research Collection School Of Computing and Information Systems

The word-in-context (WiC) task aims to determine whether a target word’s occurrences in two sentences share the same sense. In this paper, we propose a Contrastive Learning WiC (CLWiC) framework to improve the learning of sentence/word representations and the classification of target word senses in a sentence pair when performing WiC on low-resource languages. In representation learning, CLWiC trains a pre-trained language model’s ability to cope with low-resource languages using both unsupervised and supervised contrastive learning. The WiC classifier learning further fine-tunes the language model with WiC classification loss under two classifier architecture options, SGBERT and WiSBERT, which use single-encoder …
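
As a toy illustration of the WiC decision itself, the snippet below compares the target word's contextual embeddings from the two sentences with cosine similarity. contextual_embed and the threshold are hypothetical placeholders; CLWiC's actual classifiers (SGBERT/WiSBERT) are trained rather than thresholded.

```python
import numpy as np

def same_sense(sentence1, sentence2, target_word, contextual_embed, threshold=0.7):
    """Toy WiC decision rule based on cosine similarity of contextual embeddings.

    contextual_embed(sentence, word) -> 1-D vector stands in for a (contrastively
    fine-tuned) encoder; the similarity threshold is purely illustrative.
    """
    v1 = contextual_embed(sentence1, target_word)
    v2 = contextual_embed(sentence2, target_word)
    sim = float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
    return sim >= threshold
```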