Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 17 of 17

Full-Text Articles in Physical Sciences and Mathematics

Choosing A Sophisticated, Robust, And Secure Programming Language, J. Simon Richard Dec 2023

The Downtown Review

This paper explores which programming languages maximize the quality and efficiency of software development projects requiring high levels of sophistication, security, and stability. Of the four languages discussed in this paper—C, C++, Java, and Rust—we conclude that Rust is the best for this application.


Supporting Software Engineers With Large Language Model-Based Automation, Ting Zhang Dec 2023

Dissertations and Theses Collection (Open Access)

In recent years, software engineering (SE) has witnessed significant growth, leading to the creation and sharing of an abundance of software artifacts such as source code, bug reports, and pull requests. Analyzing these artifacts is crucial for comprehending the sentiments of software developers and automating various SE tasks, ultimately leading to more human-centered automated SE and enhancing software development efficiency. However, the diverse and unstructured nature of software text poses a significant challenge to this analysis. In response, researchers have investigated a variety of approaches, including the utilization of natural language processing techniques. The advent of large language models (LLMs), …


Do CONTRIBUTING Files Provide Information About OSS Newcomers' Onboarding Barriers?, Felipe Fronchetti, David Shepherd, Igor Wiese, Christoph Treude, Marco Gerosa, Igor Steinmacher Dec 2023

Research Collection School Of Computing and Information Systems

Effectively onboarding newcomers is essential for the success of open source projects. These projects often provide onboarding guidelines in their ‘CONTRIBUTING’ files (e.g., CONTRIBUTING.md on GitHub). These files explain, for example, how to find open tasks, implement solutions, and submit code for review. However, these files often do not follow a standard structure, can be too large, and miss barriers commonly found by newcomers. In this paper, we propose an automated approach to parse these CONTRIBUTING files and assess how they address onboarding barriers. We manually classified a sample of files according to a model of onboarding barriers from the …


DexBERT: Effective, Task-Agnostic And Fine-Grained Representation Learning Of Android Bytecode, Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyande, Jacques Klein Oct 2023

Research Collection School Of Computing and Information Systems

The automation of an increasingly large number of software engineering tasks is becoming possible thanks to Machine Learning (ML). One foundational building block in the application of ML to software artifacts is the representation of these artifacts (e.g., source code or executable code) in a form that is suitable for learning. Traditionally, researchers and practitioners have relied on manually selected features, based on expert knowledge, for the task at hand. Such knowledge is sometimes imprecise and generally incomplete. To overcome this limitation, many studies have leveraged representation learning, delegating to ML itself the job of automatically devising suitable …


Applying Information Theory To Software Evolution, Adriano Torres, Sebastian Baltes, Christoph Treude, Markus Wagner May 2023

Research Collection School Of Computing and Information Systems

Although information theory has found success in other disciplines, the literature on its applications to software evolution is limited. We are still missing artifacts that leverage the data and tooling available to measure how the information content of a project can be a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply them to the commit histories of 25 open source projects. We produce evidence that the two measures are generally highly correlated. We also observed that they display weak and unstable correlations with other complexity metrics. Our …
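
As a rough illustration of what a textual entropy measure might look like, the sketch below computes the Shannon entropy of the token distribution in a piece of commit text. This is an assumption made for illustration only: the abstract does not state how the authors define their structural or textual entropy, and the function name `shannon_entropy` and the sample commit strings are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Return the Shannon entropy, in bits, of the token frequency distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        # An empty text carries no information under this definition.
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical commit texts, split on whitespace (an assumed tokenization).
commits = [
    "fix typo in readme",
    "add parser add lexer add tests",
]
entropies = [shannon_entropy(text.split()) for text in commits]
```

Tracking such a value commit by commit yields a time series that could then be compared against other complexity metrics, which is the kind of correlation analysis the abstract describes.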


She Elicits Requirements And He Tests: Software Engineering Gender Bias In Large Language Models, Christoph Treude, Hideaki Hata May 2023

Research Collection School Of Computing and Information Systems

Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we …


Toward A Deeper Integration Of Low-Fidelity Sketches Into Mobile Application Development, Soumik Mohian May 2023

Computer Science and Engineering Dissertations

Mobile application development often starts with creating low-fidelity sketches of user interfaces. Integrating these sketches into the software development process can reduce repetition, narrow the gap between user perception and final implementation, and improve app resilience. In this study, we introduce the DoodleUINet dataset, which comprises over 10K sketches of UI elements. Our Doodle2App tool converts low-fidelity sketches into a single-page, compilable Android app. At the same time, our PSDoodle provides an interactive, partial sketch-based search engine with a top-10 screen retrieval accuracy comparable to the state-of-the-art SWIRE line of work but with a 50% reduction in the average required …


TechSumBot: A Stack Overflow Answer Summarization Tool For Technical Query, Chengran Yang, Bowen Xu, Jiakun Liu, David Lo May 2023

Research Collection School Of Computing and Information Systems

Stack Overflow is a popular platform for developers to seek solutions to programming-related problems. However, prior studies identified that developers may suffer from the redundant, useless, and incomplete information retrieved by the Stack Overflow search engine. To help developers better utilize the Stack Overflow knowledge, researchers proposed tools to summarize answers to a Stack Overflow question. However, existing tools use hand-crafted features to assess the usefulness of each answer sentence and fail to remove semantically redundant information from the result. In addition, existing tools focus on only one programming language and cannot retrieve newly posted, up-to-date knowledge from Stack Overflow. …


Code Generation Based On Inference And Controlled Natural Language Input, Howard R. Dittmer Apr 2023

College of Computing and Digital Media Dissertations

Over time the level of abstraction embodied in programming languages has continued to grow. Paradoxically, most programming languages still require programmers to conform to the language's rigid constructs. These constructs have been implemented in the name of efficiency for the computer. However, the continual increase in computing power allows us to consider techniques not so limited. To this end, we have created CABERNET, a Controlled Natural Language (CNL) based approach to program creation. CABERNET allows programmers to use a simple outline-based syntax. This syntax enables increased programmer efficiency.

CNLs have previously been used to document requirements. We have taken this …


Conversations With ChatGPT About C Programming: An Ongoing Study, James C. Davis, Yung-Hsiang Lu, George K. Thiruvathukal Mar 2023

Computer Science: Faculty Publications and Other Works

AI (Artificial Intelligence) Generative Models have attracted great attention in recent years. Generative models can be used to create new articles, visual arts, music compositions, and even computer programs from English specifications. Among all generative models, ChatGPT has become one of the most well-known since its public announcement in November 2022. GPT means Generative Pre-trained Transformer. ChatGPT is an online program that can interact with human users in text formats and is able to answer questions on many topics, including computer programming. Many computer programmers, including students and professionals, are considering the use of ChatGPT as an aid. The quality …


CSC 71010/CSCI 77100: Programming Languages/Software Engineering, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

No abstract provided.


Introduction, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

No abstract provided.


Reengineering And Refactoring, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

No abstract provided.


WALA Quick Start, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

Setting up and trying out the T.J. Watson Libraries for Analysis (WALA).


Building An AST Eclipse Plug-In, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

Complete the Building an AST Eclipse Plug-in assignment. Once it works, find a medium-sized open-source Java project to run your plugin on. You may want to explore GitHub. Import the project into Eclipse and run your plug-in on it. Report on the following, which may require you to change some of the source code so that it is convenient:

  1. Project name.
  2. Project URL.
  3. Project description.
  4. The number of classes in the project.
  5. The number of user-defined methods in the project.
  6. For each class, the number of method calls.
  7. Statistics about the method calls:
    1. The total number of method calls …
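
The counts requested in the list above can be sketched outside Eclipse as well. The example below is only an analogue, not the assignment's Java plug-in: it uses Python's standard `ast` module instead of the Eclipse AST API, and the function name `summarize` and the `SAMPLE` source are hypothetical. It also counts every call expression, including plain function calls and constructor calls, which is broader than "method calls" in the Java sense.

```python
import ast


def summarize(source):
    """Count classes, methods defined directly in classes, and calls per class."""
    tree = ast.parse(source)
    classes = [n for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
    num_methods = 0
    calls_per_class = {}
    for cls in classes:
        # Only functions defined directly in the class body count as methods.
        num_methods += sum(
            isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) for n in cls.body
        )
        # Every call expression inside the class; calls in nested classes
        # are also attributed to the enclosing class.
        calls_per_class[cls.name] = sum(
            isinstance(n, ast.Call) for n in ast.walk(cls)
        )
    return {
        "classes": len(classes),
        "methods": num_methods,
        "calls_per_class": calls_per_class,
    }


# Hypothetical input used only to exercise the sketch.
SAMPLE = '''
class Greeter:
    def greet(self, name):
        print(self.format(name))

    def format(self, name):
        return "Hello, " + name.title()
'''

report = summarize(SAMPLE)
```

The same visitor-style traversal applies in the Java setting, where each class declaration, method declaration, and method invocation node would be counted as it is visited.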


Working With Control-Flow Graphs, Raffi T. Khatchadourian Jan 2023

Open Educational Resources

No abstract provided.


I Know What You Are Searching For: Code Snippet Recommendation From Stack Overflow Posts, Zhipeng Gao, Xin Xia, David Lo, John C. Grundy, Xindong Zhang, Zhenchang Xing Jan 2023

Research Collection School Of Computing and Information Systems

Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more efficient than working from source documentation, tutorials, or full worked examples. However, due to the complexity of these online Question and Answer forums and the very large volume of information they contain, developers can be overwhelmed by the sheer volume of available information. This makes it hard to find and/or even be aware …