Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 17 of 17
Full-Text Articles in Physical Sciences and Mathematics
Choosing A Sophisticated, Robust, And Secure Programming Language, J. Simon Richard
The Downtown Review
This paper explores which programming languages maximize the quality and efficiency of software development projects requiring high levels of sophistication, security, and stability. Of the four languages discussed in this paper—C, C++, Java, and Rust—we conclude that Rust is the best for this application.
Supporting Software Engineers With Large Language Model-Based Automation, Ting Zhang
Dissertations and Theses Collection (Open Access)
In recent years, software engineering (SE) has witnessed significant growth, leading to the creation and sharing of an abundance of software artifacts such as source code, bug reports, and pull requests. Analyzing these artifacts is crucial for comprehending the sentiments of software developers and automating various SE tasks, ultimately leading to more human-centered automated SE and enhancing software development efficiency. However, the diverse and unstructured nature of software text poses a significant challenge to this analysis. In response, researchers have investigated a variety of approaches, including the utilization of natural language processing techniques. The advent of large language models (LLMs), …
Do Contributing Files Provide Information About OSS Newcomers' Onboarding Barriers?, Felipe Fronchetti, David Shepherd, Igor Wiese, Christoph Treude, Marco Gerosa, Igor Steinmacher
Research Collection School Of Computing and Information Systems
Effectively onboarding newcomers is essential for the success of open source projects. These projects often provide onboarding guidelines in their ‘CONTRIBUTING’ files (e.g., CONTRIBUTING.md on GitHub). These files explain, for example, how to find open tasks, implement solutions, and submit code for review. However, these files often do not follow a standard structure, can be too large, and miss barriers commonly found by newcomers. In this paper, we propose an automated approach to parse these CONTRIBUTING files and assess how they address onboarding barriers. We manually classified a sample of files according to a model of onboarding barriers from the …
DexBERT: Effective, Task-Agnostic And Fine-Grained Representation Learning Of Android Bytecode, Tiezhu Sun, Kevin Allix, Kisub Kim, Xin Zhou, Dongsun Kim, David Lo, Tegawendé F. Bissyande, Jacques Klein
Research Collection School Of Computing and Information Systems
The automation of an increasingly large number of software engineering tasks is becoming possible thanks to Machine Learning (ML). One foundational building block in the application of ML to software artifacts is the representation of these artifacts (e.g., source code or executable code) into a form that is suitable for learning. Traditionally, researchers and practitioners have relied on manually selected features, based on expert knowledge, for the task at hand. Such knowledge is sometimes imprecise and generally incomplete. To overcome this limitation, many studies have leveraged representation learning, delegating to ML itself the job of automatically devising suitable …
Applying Information Theory To Software Evolution, Adriano Torres, Sebastian Baltes, Christoph Treude, Markus Wagner
Research Collection School Of Computing and Information Systems
Although information theory has found success in other disciplines, the literature on its applications to software evolution is limited. We are still missing artifacts that leverage the data and tooling available to measure how the information content of a project can be a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply them to the historical progression of the commit history of 25 open source projects. We produce evidence that they are generally highly correlated. We also observed that they display weak and unstable correlations with other complexity metrics. Our …
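As a rough illustration of what a textual entropy measure might look like, the sketch below computes Shannon entropy over the whitespace-separated tokens of a piece of text. This is an assumption made for illustration only: the abstract does not state which definitions the authors use, and the function name `token_entropy` is hypothetical.

```python
import math
from collections import Counter


def token_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the token frequency distribution.

    Assumption: tokens are whitespace-separated words. The paper's own
    textual entropy definition may differ.
    """
    tokens = text.split()
    if not tokens:
        return 0.0  # empty text carries no information
    counts = Counter(tokens)
    total = len(tokens)
    # H = sum over tokens of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    # Two equally frequent tokens give exactly one bit of entropy.
    print(token_entropy("a b a b"))
```

Under this assumed definition, a file that repeats one token has zero entropy, while a file whose tokens are all distinct has the maximum entropy for its length. One could apply such a function to each revision of a file to trace how the measure changes across commits.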
She Elicits Requirements And He Tests: Software Engineering Gender Bias In Large Language Models, Christoph Treude, Hideaki Hata
Research Collection School Of Computing and Information Systems
Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we …
Toward A Deeper Integration Of Low-Fidelity Sketches Into Mobile Application Development, Soumik Mohian
Computer Science and Engineering Dissertations
Mobile application development often starts with creating low-fidelity sketches of user interfaces. Integrating these sketches into the software development process can reduce repetition, narrow the gap between user perception and final implementation, and improve app resilience. In this study, we introduce the DoodleUINet dataset, which comprises over 10K sketches of UI elements. Our Doodle2App tool converts low-fidelity sketches into a single-page, compilable Android app. At the same time, our PSDoodle provides an interactive, partial sketch-based search engine with a top-10 screen retrieval accuracy comparable to the state-of-the-art SWIRE line of work but with a 50% reduction in the average required …
TechSumBot: A Stack Overflow Answer Summarization Tool For Technical Query, Chengran Yang, Bowen Xu, Jiakun Liu, David Lo
Research Collection School Of Computing and Information Systems
Stack Overflow is a popular platform for developers to seek solutions to programming-related problems. However, prior studies identified that developers may suffer from the redundant, useless, and incomplete information retrieved by the Stack Overflow search engine. To help developers better utilize the Stack Overflow knowledge, researchers proposed tools to summarize answers to a Stack Overflow question. However, existing tools use hand-crafted features to assess the usefulness of each answer sentence and fail to remove semantically redundant information in the result. Besides, existing tools only focus on a certain programming language and cannot retrieve newly posted, up-to-date knowledge from Stack Overflow. …
Code Generation Based On Inference And Controlled Natural Language Input, Howard R. Dittmer
College of Computing and Digital Media Dissertations
Over time the level of abstraction embodied in programming languages has continued to grow. Paradoxically, most programming languages still require programmers to conform to the language's rigid constructs. These constructs have been implemented in the name of efficiency for the computer. However, the continual increase in computing power allows us to consider techniques not so limited. To this end, we have created CABERNET, a Controlled Natural Language (CNL) based approach to program creation. CABERNET allows programmers to use a simple outline-based syntax. This syntax enables increased programmer efficiency.
CNLs have previously been used to document requirements. We have taken this …
Conversations With ChatGPT About C Programming: An Ongoing Study, James C. Davis, Yung-Hsiang Lu, George K. Thiruvathukal
Computer Science: Faculty Publications and Other Works
AI (Artificial Intelligence) Generative Models have attracted great attention in recent years. Generative models can be used to create new articles, visual arts, music composition, even computer programs from English specifications. Among all generative models, ChatGPT is becoming one of the most well-known since its public announcement in November 2022. GPT means Generative Pre-trained Transformer. ChatGPT is an online program that can interact with human users in text formats and is able to answer questions on many topics, including computer programming. Many computer programmers, including students and professionals, are considering the use of ChatGPT as an aid. The quality …
CSC 71010/CSCI 77100: Programming Languages/Software Engineering, Raffi T. Khatchadourian
Open Educational Resources
No abstract provided.
Introduction, Raffi T. Khatchadourian
Open Educational Resources
No abstract provided.
Reengineering And Refactoring, Raffi T. Khatchadourian
Open Educational Resources
No abstract provided.
WALA Quick Start, Raffi T. Khatchadourian
Open Educational Resources
Setting up and trying the TJ Watson Library for Analysis (WALA).
Building An AST Eclipse Plug-In, Raffi T. Khatchadourian
Open Educational Resources
Complete the Building an AST Eclipse Plug-in assignment. Once it works, find a medium-sized open-source Java project to run your plugin on. You may want to explore GitHub. Import the project into Eclipse and run your plug-in on it. Report on the following, which may require you to change some of the source code so that it is convenient:
- Project name.
- Project URL.
- Project description.
- The number of classes in the project.
- The number of user-defined methods in the project.
- For each class, the number of method calls.
- Statistics about the method calls:
  - The total number of method calls …
Working With Control-Flow Graphs, Raffi T. Khatchadourian
Open Educational Resources
No abstract provided.
I Know What You Are Searching For: Code Snippet Recommendation From Stack Overflow Posts, Zhipeng Gao, Xin Xia, David Lo, John C. Grundy, Xindong Zhang, Zhenchang Xing
Research Collection School Of Computing and Information Systems
Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more efficient than working from source documentation, tutorials, or full worked examples. However, due to the complexity of these online Question and Answer forums and the very large volume of information they contain, developers can be overwhelmed by the sheer volume of available information. This makes it hard to find and/or even be aware …