Programming Languages and Compilers Commons

987 Full-Text Articles · 1,451 Authors · 433,267 Downloads · 132 Institutions

All Articles in Programming Languages and Compilers

987 full-text articles. Page 1 of 38.

What Does One Billion Dollars Look Like?: Visualizing Extreme Wealth, William Mahoney Luckman (2024, The Graduate Center, City University of New York)

Dissertations, Theses, and Capstone Projects

The word “billion” is a mathematical abstraction related to “big,” but it is difficult to understand the vast difference in value between one million and one billion, and even harder to understand the vast difference in purchasing power between one billion dollars and the average U.S. yearly income. Perhaps most difficult to conceive of is what that purchasing power and huge mass of capital translates to in terms of power. This project blends design, text, facts, and figures into an interactive narrative website that helps the user better understand their position in relation to extreme wealth: https://whatdoesonebilliondollarslooklike.website/

The site incorporates …
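
As a rough, back-of-the-envelope illustration of the scale gap the site dramatizes (the $60,000 yearly income below is an assumed round figure for illustration, not a number taken from the project):

```python
# Rough arithmetic on the gap between $1,000,000,000 and a typical U.S. income.
# The $60,000 yearly income is an assumed round figure for illustration only.
billion = 1_000_000_000
million = 1_000_000
yearly_income = 60_000

print(f"A billion is {billion // million}x a million.")
print(f"At ${yearly_income:,}/year it takes {billion / yearly_income:,.0f} years to earn $1B.")
print(f"Spending $1,000 a day, $1B lasts about {billion / 1000 / 365:,.0f} years.")
```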


Choosing A Sophisticated, Robust, And Secure Programming Language, J. Simon Richard (2023, Cleveland State University)

The Downtown Review

This paper explores which programming languages maximize the quality and efficiency of software development projects requiring high levels of sophistication, security, and stability. Of the four languages discussed in this paper—C, C++, Java, and Rust—we conclude that Rust is the best for this application.


Μakka: Mutation Testing For Actor Concurrency In Akka Using Real-World Bugs, Mohsen Moradi Moghadam, Mehdi Bagherzadeh, Raffi Takvor Khatchadourian Ph.D., Hamid Bagheri (2023, Oakland University)

Publications and Research

Actor concurrency is becoming increasingly important in real-world and mission-critical software. Such applications must be free of actor bugs that occur in the real world and must have tests that are effective in finding these bugs. Mutation testing is a well-established technique that transforms an application to induce its likely bugs and evaluates the effectiveness of its tests in finding those bugs. Mutation testing is available for a broad spectrum of applications and their bugs, ranging from web to mobile to machine learning, and is used at scale in companies like Google and Facebook. However, there still is …
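
To make the mechanism concrete, here is a minimal sketch of the mutation-testing loop described above: apply small mutation operators to a program, re-run its tests, and count how many mutants the tests detect. The toy program and operator-flip mutation below are illustrative stand-ins, not the actor-level mutation operators the paper targets.

```python
# Toy mutation-testing loop: mutate the program, re-run the tests,
# and report the mutation score (fraction of mutants the tests detect).
# Real actor-oriented tools mutate constructs such as message sends;
# here we simply flip an arithmetic operator.

PROGRAM = "def add(a, b):\n    return a + b\n"

def test_suite(namespace) -> bool:
    """Return True if all tests pass against the (possibly mutated) program."""
    add = namespace["add"]
    return add(2, 3) == 5 and add(-1, 1) == 0

def mutants(source: str):
    """Yield mutated copies of the source (a single '+' -> '-' operator flip)."""
    for i, ch in enumerate(source):
        if ch == "+":
            yield source[:i] + "-" + source[i + 1:]

killed = total = 0
for mutant in mutants(PROGRAM):
    total += 1
    ns = {}
    exec(mutant, ns)          # load the mutated program
    if not test_suite(ns):    # a failing suite means the mutant was detected
        killed += 1

print(f"mutation score: {killed}/{total}")
```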


Hypothyroid Disease Analysis By Using Machine Learning, Sanjana Seelam (2023, California State University, San Bernardino)

Electronic Theses, Projects, and Dissertations

Thyroid illness frequently manifests as hypothyroidism, and those affected are predominantly female. Because the majority of people are unaware of the illness, it often becomes more serious before it is detected; catching it early allows medical professionals to treat it more effectively and prevent it from getting worse. Predicting illness with machine learning is a challenging task, yet machine learning aids disease prediction greatly, and dedicated feature selection strategies have made the prediction process easier. To properly monitor and cure this illness, accurate detection is essential. In order …
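
A hedged sketch of the kind of pipeline such a study typically builds (feature selection followed by a classifier, evaluated on held-out data); the synthetic data, chosen selector, and model below are illustrative assumptions, not the thesis's actual dataset or methods.

```python
# Minimal disease-prediction sketch: feature selection + a classifier,
# evaluated on held-out data. The data below is synthetic stand-in data,
# not the thyroid dataset used in the thesis.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = make_pipeline(
    SelectKBest(f_classif, k=5),             # keep the 5 most informative features
    RandomForestClassifier(random_state=0),  # then fit a standard classifier
)
model.fit(X_tr, y_tr)
print("held-out accuracy:", accuracy_score(y_te, model.predict(X_te)))
```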


Random Variable Spaces: Mathematical Properties And An Extension To Programming Computable Functions, Mohammed Kurd-Misto (2023, Chapman University)

Computational and Data Sciences (PhD) Dissertations

This dissertation aims to extend the boundaries of Programming Computable Functions (PCF) by introducing a novel collection of categories referred to as Random Variable Spaces. Originating as a generalization of Quasi-Borel Spaces, Random Variable Spaces are rigorously defined as categories where objects are sets paired with a collection of random variables from an underlying measurable space. These spaces offer a theoretical foundation for extending PCF to natively handle stochastic elements.

The dissertation is structured into seven chapters that provide a multi-disciplinary background, from PCF and Measure Theory to Category Theory with special attention to Monads and the Giry Monad. The …


A Comprehensive Evaluation Of Large Language Models On Legal Judgment Prediction, Ruihao Shui, Yixin Cao, Xiang Wang, Tat-Seng Chua (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Large language models (LLMs) have demonstrated great potential for domain-specific applications, such as the law domain. However, recent disputes over GPT-4’s law evaluation raise questions concerning their performance in real-world legal tasks. To systematically investigate their competency in the law, we design practical baseline solutions based on LLMs and test them on the task of legal judgment prediction. In our solutions, LLMs can work alone to answer open questions or coordinate with an information retrieval (IR) system to learn from similar cases or solve simplified multi-choice questions. We show that similar cases and multi-choice options, namely label candidates, included in prompts …
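
A minimal sketch of the prompting setup the abstract describes: retrieved similar cases and label candidates are folded into the prompt before querying the model. The functions retrieve_similar_cases and ask_llm are hypothetical stubs standing in for an IR system and an LLM, not the authors' implementation.

```python
# Sketch: build a legal-judgment-prediction prompt from a case description,
# optionally adding retrieved similar cases and multi-choice label candidates.
# ask_llm() and retrieve_similar_cases() are hypothetical stubs.

def retrieve_similar_cases(case: str, k: int = 2) -> list:
    return ["<similar case 1 ...>", "<similar case 2 ...>"][:k]   # stand-in for an IR system

def ask_llm(prompt: str) -> str:
    return "<predicted judgment>"                                  # stand-in for an LLM call

def build_prompt(case: str, label_candidates=None) -> str:
    parts = ["You are a legal assistant. Predict the judgment for the case below."]
    for i, sim in enumerate(retrieve_similar_cases(case), 1):
        parts.append(f"Similar case {i}: {sim}")
    if label_candidates:                       # turn the open question into multi-choice
        parts.append("Choose one of: " + ", ".join(label_candidates))
    parts.append(f"Case: {case}")
    return "\n".join(parts)

print(build_prompt("<case facts ...>", ["guilty", "not guilty"]))
print(ask_llm(build_prompt("<case facts ...>")))
```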


Examining The Inter-Consistency Of Large Language Models: An In-Depth Analysis Via Debate, Kai Xiong, Xiao Ding, Yixin Cao, Ting Liu, Bing Qin (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs) have shown impressive capabilities in various applications, but they still face various inconsistency issues. Existing works primarily focus on the inconsistency issues within a single LLM, while we complementarily explore the inter-consistency among multiple LLMs for collaboration. To examine whether LLMs can collaborate effectively to achieve a consensus for a shared goal, we focus on commonsense reasoning and introduce a formal debate framework (FORD) to conduct a three-stage debate among LLMs aligned with real-world scenarios: fair debate, mismatched debate, and roundtable debate. Through extensive experiments on various datasets, LLMs can effectively collaborate to reach a consensus …
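
A toy sketch of the debate loop such a framework coordinates: each debater answers, sees the others' latest answers, and may revise over a few rounds until consensus. The stub "models" below are invented for illustration and do not reproduce FORD's three specific debate settings.

```python
# Toy multi-LLM debate loop: debaters exchange answers over several rounds
# until they converge. Real frameworks replace the stubs with LLM calls.

def make_stub_model(name: str, initial: str, stubborn: bool):
    """Return a toy debater: answers `initial`, or adopts the majority of the others."""
    def answer(question: str, others: list) -> str:
        if stubborn or not others:
            return initial
        return max(set(others), key=others.count)   # concede to the majority view
    answer.__name__ = name
    return answer

def debate(question: str, debaters, rounds: int = 3) -> list:
    answers = [d(question, []) for d in debaters]
    for _ in range(rounds):
        answers = [d(question, answers[:i] + answers[i + 1:]) for i, d in enumerate(debaters)]
        if len(set(answers)) == 1:                   # consensus reached
            break
    return answers

debaters = [make_stub_model("A", "yes", stubborn=True),
            make_stub_model("B", "no", stubborn=False),
            make_stub_model("C", "yes", stubborn=False)]
print(debate("Can a penguin fly?", debaters))
```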


Large Language Model Is Not A Good Few-Shot Information Extractor, But A Good Reranker For Hard Samples!, Yubo Ma, Yixin Cao, Yongchin Hong, Aixin Sun (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs) have made remarkable strides in various tasks. However, whether they are competitive few-shot solvers for information extraction (IE) tasks and surpass fine-tuned small Pre-trained Language Models (SLMs) remains an open problem. This paper aims to provide a thorough answer to this problem, and moreover, to explore an approach towards effective and economical IE systems that combine the strengths of LLMs and SLMs. Through extensive experiments on nine datasets across four IE tasks, we show that LLMs are not effective few-shot information extractors in general, given their unsatisfactory performance in most settings and the high latency and …
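
A minimal sketch of the LLM-SLM combination suggested above: a small fine-tuned model handles samples it is confident about, and only low-confidence (hard) samples are passed to an LLM reranker. slm_predict and llm_rerank are hypothetical stubs, not the paper's pipeline.

```python
# Sketch of the "small model first, LLM for hard samples" idea:
# a fine-tuned small model labels each sample with a confidence score,
# and only low-confidence (hard) samples are re-ranked by an LLM.

def slm_predict(text: str):
    """Stand-in for a fine-tuned small model: returns (label, confidence)."""
    return ("ORG", 0.55) if "Inc." in text else ("PER", 0.95)

def llm_rerank(text: str, candidates: list) -> str:
    """Stand-in for an LLM asked to pick among the small model's top candidates."""
    return candidates[0]

def extract(texts: list, threshold: float = 0.7) -> list:
    labels = []
    for t in texts:
        label, conf = slm_predict(t)
        if conf < threshold:                       # "hard" sample: defer to the LLM
            label = llm_rerank(t, [label, "LOC", "MISC"])
        labels.append(label)
    return labels

print(extract(["Alice visited Paris.", "Acme Inc. filed a report."]))
```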


Disentangling Transformer Language Models As Superposed Topic Models, Jia Peng Lim, Hady Wirawan Lauw (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Topic Modelling is an established research area where the quality of a given topic is measured using coherence metrics. Often, we infer topics from Neural Topic Models (NTM) by interpreting their decoder weights, consisting of top-activated words projected from individual neurons. Transformer-based Language Models (TLM) similarly consist of decoder weights. However, due to their hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple …
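
A small sketch of the weight-based reading described above: treat each decoder neuron (a row of a weight matrix over the vocabulary) as a topic and list its top-activated words. The tiny random matrix and vocabulary are stand-ins for a real transformer's decoder weights.

```python
# Read decoder weights as topics: for each "neuron" (row projecting hidden
# units to the vocabulary), list the top-k most strongly activated words.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["goal", "match", "team", "court", "ruling", "judge", "league", "appeal"]
W = rng.normal(size=(3, len(vocab)))     # 3 neurons x |vocab| decoder weights (stand-in)

def top_words(weights: np.ndarray, vocab: list, k: int = 3) -> list:
    """Return the k highest-weighted vocabulary words per neuron (row)."""
    topk = np.argsort(-weights, axis=1)[:, :k]
    return [[vocab[j] for j in row] for row in topk]

for n, words in enumerate(top_words(W, vocab)):
    print(f"neuron {n}: {words}")
```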


Llm-Adapters: An Adapter Family For Parameter-Efficient Fine-Tuning Of Large Language Models, Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, Roy Ka-Wei Lee (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca). Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is undoubtedly one of the most attractive topics, as it only requires fine-tuning a few external parameters instead of the entire LLM while achieving comparable or even better performance. To enable further research on PEFT methods of LLMs, this paper presents LLMAdapters, an easy-to-use framework that integrates various adapters into LLMs and …
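
A minimal bottleneck-adapter sketch in PyTorch, illustrating why PEFT is cheap: the base layer stays frozen and only the adapter's small down-/up-projection trains. This is an illustrative module under assumed dimensions, not code from the LLM-Adapters framework.

```python
# Minimal bottleneck adapter: a small down-project / up-project block added
# around a frozen layer, so only the adapter's few parameters are trained.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))   # residual around the bottleneck

frozen = nn.Linear(64, 64)                 # stands in for a pretrained sublayer
for p in frozen.parameters():
    p.requires_grad = False                # base weights stay frozen

adapter = Adapter(64)
x = torch.randn(2, 64)
y = adapter(frozen(x))                     # adapted forward pass

trainable = sum(p.numel() for p in adapter.parameters() if p.requires_grad)
total = trainable + sum(p.numel() for p in frozen.parameters())
print(f"trainable params: {trainable} of {total}")
```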


Benchmarking Foundation Models With Language-Model-As-An-Examiner, Yushi Bai, Jiahao Ying, Yixin Cao, Xin Lv, Yuze He, Xiaozhi Wang, Jifan Yu, Kaisheng Zeng, Yijia Xiao, Haozhe Lyu, Jiayin Zhang, Juanzi Li, Lei Hou (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Numerous benchmarks have been established to assess the performance of foundation models on open-ended question answering, which serves as a comprehensive test of a model’s ability to understand and generate language in a manner similar to humans. Most of these works focus on proposing new datasets; however, we see two main issues within previous benchmarking pipelines, namely testing leakage and evaluation automation. In this paper, we propose a novel benchmarking framework, Language-Model-as-an-Examiner, where the LM serves as a knowledgeable examiner that formulates questions based on its knowledge and evaluates responses in a reference-free manner. Our framework allows for effortless extensibility …
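
A schematic sketch of the examiner loop: the examiner model writes questions from its own knowledge, the model under test answers, and the examiner scores each answer without a reference key. examiner_ask, candidate_answer, and examiner_grade are hypothetical stubs for LLM calls.

```python
# Language-model-as-examiner loop (schematic): generate questions, collect the
# candidate model's answers, and grade them reference-free.

def examiner_ask(topic: str, n: int = 2) -> list:
    return [f"Question {i + 1} about {topic}?" for i in range(n)]   # stand-in examiner

def candidate_answer(question: str) -> str:
    return f"<candidate's answer to: {question}>"                   # stand-in model under test

def examiner_grade(question: str, answer: str) -> float:
    return 0.8                                  # stand-in for a reference-free score in [0, 1]

def benchmark(topics: list) -> float:
    scores = []
    for topic in topics:
        for q in examiner_ask(topic):
            scores.append(examiner_grade(q, candidate_answer(q)))
    return sum(scores) / len(scores)

print("average examiner score:", benchmark(["astronomy", "tax law"]))
```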


Robust Prompt Optimization For Large Language Models Against Distribution Shifts, Moxin Li, Wenjie Wang, Fuli Feng, Yixin Cao, Jizhi Zhang, Tat-Seng Chua (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Large Language Models (LLMs) have demonstrated significant ability in various Natural Language Processing tasks. However, their effectiveness is highly dependent on the phrasing of the task prompt, leading to research on automatic prompt optimization using labeled task data. We reveal that these prompt optimization techniques are vulnerable to distribution shifts such as subpopulation shifts, which are common for LLMs in real-world scenarios such as customer review analysis. In this light, we propose a new problem of robust prompt optimization for LLMs against distribution shifts, which requires that the prompt optimized over the labeled source group simultaneously generalize to an unlabeled …
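
A minimal sketch of prompt optimization over a labeled source group: score each candidate prompt by accuracy on the labeled examples and keep the best; the robust variant posed above additionally asks that this choice transfer to an unlabeled, shifted target group. llm_classify is a hypothetical stub.

```python
# Pick the best prompt by accuracy on labeled source examples.
# llm_classify() stands in for an LLM call with the given prompt.

def llm_classify(prompt: str, text: str) -> str:
    return "positive" if "good" in text else "negative"   # stand-in for an LLM call

source = [("good camera, good battery", "positive"),
          ("screen broke in a week", "negative")]

candidates = ["Classify the review's sentiment:",
              "Is the customer happy? Answer positive or negative:"]

def accuracy(prompt: str, data) -> float:
    hits = sum(llm_classify(prompt, x) == y for x, y in data)
    return hits / len(data)

best = max(candidates, key=lambda p: accuracy(p, source))
print("selected prompt:", best)
```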


Molca: Molecular Graph-Language Modeling With Cross-Modal Projector And Uni-Modal Adapter, Zhiyuan Liu, Sihang Li, Yanchen Luo, Hao Fei, Yixin Cao, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Language Models (LMs) have demonstrated impressive molecule understanding ability on various 1D text-related tasks. However, they inherently lack 2D graph perception — a critical ability of human professionals in comprehending molecules’ topological structures. To bridge this gap, we propose MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter. MolCA enables an LM (i.e., Galactica) to understand both text- and graph-based molecular contents via the cross-modal projector. Specifically, the cross-modal projector is implemented as a QFormer to connect a graph encoder’s representation space and an LM’s text space. Further, MolCA employs a uni-modal adapter (i.e., LoRA) for the LM’s efficient …
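
A rough sketch of the cross-modal projection idea: learned query vectors attend over a graph encoder's outputs and are projected into the language model's embedding space as "soft tokens". The dimensions and single attention layer below are assumptions for illustration and stand in for MolCA's QFormer.

```python
# Cross-modal projector sketch: learned queries attend over graph-encoder
# outputs, then a linear layer maps them into the LM's embedding space.
import torch
import torch.nn as nn

graph_dim, lm_dim, num_query_tokens = 128, 512, 8

graph_repr = torch.randn(1, 30, graph_dim)         # 30 node embeddings from a graph encoder (stand-in)
queries = nn.Parameter(torch.randn(1, num_query_tokens, graph_dim))

attn = nn.MultiheadAttention(graph_dim, num_heads=4, batch_first=True)
project = nn.Linear(graph_dim, lm_dim)

pooled, _ = attn(queries, graph_repr, graph_repr)  # queries attend over the graph tokens
soft_tokens = project(pooled)                      # now shaped for the LM's input space
print(soft_tokens.shape)                           # torch.Size([1, 8, 512])
```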


A Black-Box Attack On Code Models Via Representation Nearest Neighbor Search, Jie Zhang, Wei Ma, Qiang Hu, Shangqing Liu, Xiaofei Xie, Yves Le Traon, Yang Liu (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Existing methods for generating adversarial code examples face several challenges: limited availability of substitute variables, high verification costs for these substitutes, and the creation of adversarial samples with noticeable perturbations. To address these concerns, our proposed approach, RNNS, uses a search seed based on historical attacks to find potential adversarial substitutes. Rather than using the discrete substitutes directly, RNNS maps them to a continuous vector space using a pre-trained variable-name encoder. Based on the vector representation, RNNS predicts and selects better substitutes for attacks. We evaluated the performance of RNNS across six coding tasks encompassing three programming languages: Java, …
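
A small sketch of representation nearest-neighbor search over candidate substitutes: embed variable names and pick those closest to a search seed in the vector space. The random embeddings below stand in for the pre-trained variable-name encoder.

```python
# Nearest-neighbor search over substitute variable names in embedding space.
# The random embeddings stand in for a pre-trained variable-name encoder.
import numpy as np

rng = np.random.default_rng(42)
candidates = ["count", "counter", "total", "idx", "buffer_len", "n_items"]
embed = {name: rng.normal(size=16) for name in candidates}   # stand-in encoder

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(seed_vec: np.ndarray, k: int = 3) -> list:
    """Return the k candidate names closest to the seed vector."""
    return sorted(candidates, key=lambda n: -cosine(seed_vec, embed[n]))[:k]

seed = rng.normal(size=16)               # embedding of a substitute from a past successful attack
print("substitutes to try:", nearest(seed))
```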


Towards Llm-Based Fact Verification On News Claims With A Hierarchical Step-By-Step Prompting Method, Xuan Zhang, Wei Gao (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

While large pre-trained language models (LLMs) have shown their impressive capabilities in various NLP tasks, they are still underexplored in the misinformation domain. In this paper, we examine LLMs with in-context learning (ICL) for news claim verification, and find that with only 4-shot demonstration examples, the performance of several prompting methods can be comparable with previous supervised models. To further boost performance, we introduce a Hierarchical Step-by-Step (HiSS) prompting method which directs LLMs to separate a claim into several subclaims and then verify each of them via multiple question-answering steps progressively. Experiment results on two public misinformation datasets show that …
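
A schematic sketch of the hierarchical step-by-step idea: split a claim into subclaims, ask follow-up questions about each, and aggregate the per-subclaim verdicts. decompose, ask, and verify_subclaim are hypothetical stubs for the LLM prompting steps.

```python
# Hierarchical step-by-step verification (schematic): decompose, question, aggregate.

def decompose(claim: str) -> list:
    return ["<subclaim 1>", "<subclaim 2>"]              # stand-in for an LLM decomposition step

def ask(question: str) -> str:
    return "<evidence found for the question>"           # stand-in for a QA / search step

def verify_subclaim(subclaim: str) -> bool:
    evidence = ask(f"What do reliable sources say about: {subclaim}?")
    return "no evidence" not in evidence                 # stand-in for an LLM judgment

def verify_claim(claim: str) -> str:
    verdicts = [verify_subclaim(s) for s in decompose(claim)]
    return "supported" if all(verdicts) else "refuted or unverified"

print(verify_claim("<news claim to be checked>"))
```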


Hallucination Detection: Robustly Discerning Reliable Answers In Large Language Models, Yuyuan Chen, Qiang Fu, Yichen Yuan, Zhihao Wen, Ge Fan, Dayiheng Liu, Dongmei Zhang, Zhixu Li, Yanghua Xiao (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Large language models (LLMs) have gained widespread adoption in various natural language processing tasks, including question answering and dialogue systems. However, a major drawback of LLMs is the issue of hallucination, where they generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences. In this paper, we propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers. RelD is trained on the constructed RelQA, a bilingual question-answering dialogue dataset along with answers generated by LLMs and a comprehensive set of metrics. Our experimental results demonstrate that the proposed RelD successfully detects …
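
A hedged sketch of what a hallucination discriminator like this amounts to: a binary classifier over features of (question, answer) pairs, trained to flag unreliable answers. The synthetic features and labels below are stand-ins for the RelQA data and the actual RelD model.

```python
# Binary "reliable vs. hallucinated" discriminator over QA-pair features.
# Synthetic features/labels stand in for the dataset and model in the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))            # e.g. overlap, consistency, confidence features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print("discriminator accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```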


Towards Safe Automated Refactoring Of Imperative Deep Learning Programs To Graph Execution, Raffi Takvor Khatchadourian Ph.D., Tatiana Castro Vélez, Mehdi Bagherzadeh, Nan Jia, Anita Raja (2023, CUNY Hunter College)

Publications and Research

Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code—supporting symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, imperative DL frameworks encouraging eager execution have emerged at the expense of run-time performance. Though hybrid approaches aim for the “best of both worlds,” using them effectively requires subtle considerations to make code amenable to safe, accurate, and efficient graph execution. We present our ongoing work on automated refactoring that assists developers in specifying whether …
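
In TensorFlow, for example, the hybrid approach referred to above is wrapping an imperative function with tf.function so it is traced into a graph; deciding when such a wrap is safe and profitable is what the refactoring analysis assists with. A minimal sketch, assuming TensorFlow 2.x:

```python
# The same imperative function, run eagerly and then wrapped with tf.function
# so it is traced and executed as a graph.
import tensorflow as tf

def loss_eager(x, w):
    return tf.reduce_sum(tf.matmul(x, w) ** 2)   # runs op-by-op (eager execution)

loss_graph = tf.function(loss_eager)             # traced into a graph on first call

x = tf.random.normal((4, 3))
w = tf.random.normal((3, 2))
print(float(loss_eager(x, w)), float(loss_graph(x, w)))   # same result, different execution
```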


Are We Ready To Embrace Generative Ai For Software Q&A?, Bowen Xu, Thanh-Dat Nguyen, Thanh Le Cong, Thong Hoang, Jiakun Liu, Kisub Kim, Chen Gong, Changan Niu, Chenyu Wang, Xuan-Bach Dinh Le, David Lo (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques. Stack Overflow banned ChatGPT only six days after its release, with the official reason being that the answers generated by ChatGPT are of low quality. To verify this, we conduct a comparative evaluation of human-written and ChatGPT-generated answers. Our methodology employs both automatic comparison and a manual study. Our results suggest that human-written and ChatGPT-generated answers are semantically similar; however, human-written answers consistently outperform ChatGPT-generated ones across multiple aspects, …


K-St: A Formal Executable Semantics Of The Structured Text Language For Plcs, Kun Wang, Jingyi Wang, Christopher M. Poskitt, Xiangxiang Chen, Jun Sun, Peng Cheng (2023, Singapore Management University)

Research Collection School Of Computing and Information Systems

Programmable Logic Controllers (PLCs) are responsible for automating process control in many industrial systems (e.g. in manufacturing and public infrastructure), and thus it is critical to ensure that they operate correctly and safely. The majority of PLCs are programmed in languages such as Structured Text (ST). However, a lack of formal semantics makes it difficult to ascertain the correctness of their translators and compilers, which vary from vendor-to-vendor. In this work, we develop K-ST, a formal executable semantics for ST in the K framework. Defined with respect to the IEC 61131-3 standard and PLC vendor manuals, K-ST is a high-level …
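
As an illustration of what an "executable semantics" gives you, here is a toy interpreter for a tiny fragment of ST-like statements (assignments and IF blocks) that evaluates a program to a final variable state. K-ST defines this with rewrite rules in the K framework over the full IEC 61131-3 language; the Python sketch below only conveys the idea and is not derived from K-ST.

```python
# Toy executable semantics for a fragment of Structured Text-like statements:
# evaluating a program means reducing it to a final variable state.

def run(stmts, state=None):
    """Execute a list of statements against a variable state and return it."""
    state = dict(state or {})
    for stmt in stmts:
        kind = stmt[0]
        if kind == "assign":                      # ("assign", var, expr(state) -> value)
            _, var, expr = stmt
            state[var] = expr(state)
        elif kind == "if":                        # ("if", cond(state), then_stmts, else_stmts)
            _, cond, then_branch, else_branch = stmt
            state = run(then_branch if cond(state) else else_branch, state)
    return state

# Roughly: level := 7; IF level > 5 THEN pump := TRUE; ELSE pump := FALSE; END_IF;
program = [
    ("assign", "level", lambda s: 7),
    ("if", lambda s: s["level"] > 5,
        [("assign", "pump", lambda s: True)],
        [("assign", "pump", lambda s: False)]),
]
print(run(program))   # {'level': 7, 'pump': True}
```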

