Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 20 of 20

Full-Text Articles in Computer Sciences

Towards Safer Code Reuse: Investigating And Mitigating Security Vulnerabilities And License Violations In Copy-Based Reuse Scenarios, David Reid Dec 2023

Doctoral Dissertations

Background: A key benefit of open source software is the ability to copy code to reuse in other projects. Code reuse provides benefits such as faster development time, lower cost, and improved quality. There are several ways to reuse open source software in new projects including copy-based reuse, library reuse, and the use of package managers. This work specifically looks at copy-based code reuse.

Motivation: Code reuse has many benefits, but also has inherent risks, including security and legal risks. The reused code may contain security vulnerabilities, license violations, or other issues. Security vulnerabilities may persist in projects that copy …


Optimizing Collective Communication For Scalable Scientific Computing And Deep Learning, Jiali Li Aug 2023

Doctoral Dissertations

In the realm of distributed computing, collective operations involve coordinated communication and synchronization among multiple processing units, enabling efficient data exchange and collaboration. Scientific applications, such as simulations, computational fluid dynamics, and scalable deep learning, require complex computations that can be parallelized across multiple nodes in a distributed system. These applications often involve data-dependent communication patterns, where collective operations are critical for achieving high performance in data exchange. Optimizing collective operations for scientific applications and deep learning involves improving the algorithms, communication patterns, and data distribution strategies to minimize communication overhead and maximize computational efficiency.

Within the context of this …


High-Quality Automatic Program Repair, Manish Motwani Oct 2022

Doctoral Dissertations

Software developers spend significant time and effort fixing bugs. Automatic program repair promises to significantly reduce bug-fixing costs. Program repair requires: fault localization — identifying program elements that cause the bug, patch generation — identifying modifications to those program elements to attempt to repair the bug, and patch validation — verifying that the modification actually repairs the bug. Most automatic program repair techniques use the developer-written tests for the repair process and produce seemingly good patches for 11–19% of the bugs in real-world software. However, most of these patches are not correct, as they overfit to the developer-written tests and …


Task-Based Runtime Optimizations Towards High Performance Computing Applications, Qinglei Cao Aug 2022

Doctoral Dissertations

The last decades have witnessed a rapid improvement of computational capabilities in high-performance computing (HPC) platforms thanks to hardware technology scaling. HPC architectures benefit from mainstream advances on the hardware with many-core systems, deep hierarchical memory subsystem, non-uniform memory access, and an ever-increasing gap between computational power and memory bandwidth. This has necessitated continuous adaptations across the software stack to maintain high hardware utilization. In this HPC landscape of potentially million-way parallelism, task-based programming models associated with dynamic runtime systems are becoming more popular, which fosters developers’ productivity at extreme scale by abstracting the underlying hardware complexity.

In this context, …


Tokamak 3D Heat Load Investigations Using An Integrated Simulation Framework, Thomas Looby May 2022

Doctoral Dissertations

Reactor class nuclear fusion tokamaks will be inherently complex. Thousands of interconnected systems that span orders of magnitude in physical scale must operate cohesively for the machine to function. Because these reactor class tokamaks are all in an early design stage, it is difficult to quantify exactly how each subsystem will act within the context of the greater systems. Therefore, to predict the engineering parameters necessary to design the machine, simulation frameworks that can model individual systems as well as the interfaced systems are necessary. This dissertation outlines a novel framework developed to couple otherwise disparate computational domains together into …


Enhancing Usability And Explainability Of Data Systems, Anna Fariha Oct 2021

Doctoral Dissertations

The recent growth of data science expanded its reach to an ever-growing user base of nonexperts, increasing the need for usability, understandability, and explainability in these systems. Enhancing usability makes data systems accessible to people with different skills and backgrounds alike, leading to democratization of data systems. Furthermore, proper understanding of data and data-driven systems is necessary for the users to trust the function of the systems that learn from data. Finally, data systems should be transparent: when a data system behaves unexpectedly or malfunctions, the users deserve proper explanation of what caused the observed incident. Unfortunately, …


Modeling User-Affected Software Properties For Open Source Software Supply Chains, Tapajit Dey Dec 2020

Doctoral Dissertations

Background: The Open Source Software development community relies heavily on users of the software and on contributors outside the core developers to produce top-quality software and provide long-term support. However, the relationship between a piece of software and its contributors, in terms of exactly how they are related through dependencies and how the users of the software affect many of its properties, is not very well understood.

Aim: My research covers a number of aspects related to answering the overarching question of modeling the software properties affected by users and the supply chain structure of software ecosystems, viz. 1) Understanding how software usage …


Using Applications To Guide Data Management For Emerging Memory Technologies, Timothy C. Effler Aug 2020

Doctoral Dissertations

A number of promising new memory technologies, such as non-volatile, storage-class memories and high-bandwidth, on-chip RAMs, are emerging. Since each of these new technologies presents tradeoffs distinct from conventional DRAMs, many high performance and scientific computing systems have begun to include multiple tiers of memory storage, each with its own type of devices. To efficiently utilize the available hardware, such systems will need to alter their data management strategies to consider the performance and capabilities provided by each tier. This work aims to understand and increase the effectiveness of application data management for emerging complex memory systems. A key realization …


Automatic Derivation Of Requirements For Components Used In Human-Intensive Systems, Heather Conboy Jul 2017

Doctoral Dissertations

Human-intensive systems (HISs), where humans must coordinate with each other along with software and/or hardware components to achieve system missions, are increasingly prevalent in safety-critical domains (e.g., healthcare). Such systems are often complex, involving aspects such as concurrency and exceptional situations. For these systems, it is often difficult but important to determine requirements for the individual components that are necessary to ensure the system requirements are satisfied. In this thesis, we investigated an approach that employs interface synthesis methods developed for software systems to automatically derive such requirements for components used in HISs. In previous work, we investigated a requirement …


Programming Models' Support For Heterogeneous Architecture, Wei Wu May 2017

Doctoral Dissertations

Accelerator-enhanced computing platforms have drawn a lot of attention due to their massive peak computational capacity. Heterogeneous systems equipped with accelerators such as GPUs have become the most prominent components of High Performance Computing (HPC) systems. Even at the node level, the significant heterogeneity of CPU and GPU, i.e., hardware and memory space differences, leads to challenges for fully exploiting such complex architectures. Extending beyond the node scope only escalates such challenges.

Conventional programming models such as data-flow and message passing have been widely adopted in HPC communities. When moving towards heterogeneous systems, the lack of GPU integration causes …


Extensions Of Task-Based Runtime For High Performance Dense Linear Algebra Applications, Chongxiao Cao May 2017

Doctoral Dissertations

On the road to exascale computing, the gap between hardware peak performance and application performance is increasing as the system scale, chip density, and inherent complexity of modern supercomputers expand. Even if we put aside the difficulty of expressing algorithmic parallelism and of efficiently executing applications at large scale, other open questions remain. The ever-growing scale of modern supercomputers induces a fast decline of the Mean Time To Failure. A generic, low-overhead resilience extension therefore becomes a desirable feature for any programming paradigm. This dissertation addresses these two critical issues, designing an efficient unified linear algebra development environment using a task-based …


Specification And Analysis Of Resource Utilization Policies For Human-Intensive Systems, Seung Yeob Shin Nov 2016

Doctoral Dissertations

Contemporary systems often require the effective support of many types of resources, each governed by complex utilization policies. Sound management of these resources plays a key role in assuring that these systems achieve their key goals. To help system developers make sound resource management decisions, I provide a resource utilization policy specification and analysis framework for (1) specifying very diverse kinds of resources and their potentially complex resource utilization policies, (2) dynamically evaluating the policies’ effects on the outcomes achieved by systems utilizing the resources, and (3) formally verifying various kinds of properties of these systems. Resource utilization policies range …


Combining Static And Dynamic Analysis For Bug Detection And Program Understanding, Kaituo Li Nov 2016

Doctoral Dissertations

This work proposes new combinations of static and dynamic analysis for bug detection and program understanding. There are 3 related but largely independent directions: a) In the area of dynamic invariant inference, we improve the consistency of dynamically discovered invariants by taking into account second-order constraints that encode knowledge about invariants; the second-order constraints are either supplied by the programmer or vetted by the programmer (among candidate constraints suggested automatically); b) In the area of testing dataflow (esp. map-reduce) programs, our tool, SEDGE, achieves higher testing coverage by leveraging existing input data and generalizing them using a symbolic reasoning engine …


Automated Style Feedback For Advanced Beginner Java Programmers, Hannah Blau Nov 2015

Doctoral Dissertations

FrenchPress is an Eclipse plug-in that partially automates the task of giving students feedback on their Java programs. It is designed not for novices but for students taking their second or third Java course: students who know enough Java to write a working program but lack the judgment to recognize bad code when they see it. FrenchPress does not diagnose compile-time or run-time errors, or logical errors that produce incorrect output. It targets silent flaws, flaws the student is unable to identify for himself because nothing in the programming environment alerts him. FrenchPress diagnoses flaws characteristic of programmers who have …


Variation In Human-Intensive Systems: A Conceptual Framework For Characterizing, Modeling, And Analyzing Families Of Systems, Borislava I. Simidchieva Aug 2015

Doctoral Dissertations

A system model (namely a formal definition of the coordination of people, hardware devices, and software components performing activities, using resources and artifacts, and producing various outputs) can aid understanding of the real-world system it models. Complex real-world systems, however, exhibit considerable amounts of variation that can be difficult or impossible to represent within a single model. This dissertation evaluates the hypothesis that the careful characterization and representation of system variation can aid in the generation and analysis of concrete system instances related to one another in specified ways and manifesting different kinds of variation. When a set of closely related systems …


Model-Based Guidance For Human-Intensive Processes, Stefan Christov Mar 2015

Doctoral Dissertations

Human-intensive processes (HIPs), such as medical processes involving coordination among doctors, nurses, and other medical staff, often play a critical role in society. Despite considerable work and progress in error reduction, human errors are still a major concern for many HIPs. To address this problem of human errors in HIPs, this thesis investigates two approaches for online process guidance, i.e., for guiding process performers while a process is being executed. Both approaches rely on monitoring a process execution and base the guidance they provide on a detailed formal process model that captures the recommended ways to perform the corresponding HIP. …


Defining, Evaluating, And Improving The Process Of Verifying Patient Identifiers, Junghee Jo Nov 2014

Doctoral Dissertations

Patient identification errors are a major cause of medication errors. During medication administration, failure to identify patients correctly can lead to patients receiving incorrect medications, perhaps resulting in adverse drug events and even death. Most medication error studies to date have focused on reporting patient misidentification statistics from case studies, on classifying types of patient identification errors, or on evaluating the impact of technology on the patient identification process, but few have proposed specific strategies or guidelines to decrease patient identification errors. This thesis aims to improve the verification of patient identifiers (VPI) process by making three key contributions to …


Subtyping With Generics: A Unified Approach, John G. Altidor Nov 2014

Doctoral Dissertations

Reusable software increases programmers' productivity and reduces repetitive code and software bugs. Variance is a key programming language mechanism for writing reusable software. Variance is concerned with the interplay of parametric polymorphism (i.e., templates, generics) and subtype (inclusion) polymorphism. Parametric polymorphism enables programmers to write abstract types and is known to enhance the readability, maintainability, and reliability of programs. Subtyping promotes software reuse by allowing code to be applied to a larger set of terms. Integrating parametric and subtype polymorphism while maintaining type safety is a difficult problem. Existing variance mechanisms enable greater subtyping between parametric types, but they suffer …


Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim Yarkhan Dec 2012

Doctoral Dissertations

Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. However, standard programming techniques are not well adapted to this change in computer architecture design.

In this work, we study the use of dynamic runtime environments executing data driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability and performance. We demonstrate productivity by …


The Maximum Clique Problem: Algorithms, Applications, And Implementations, John David Eblen Aug 2010

Doctoral Dissertations

Computationally hard problems are routinely encountered during the course of solving practical problems. This is commonly dealt with by settling for less than optimal solutions, through the use of heuristics or approximation algorithms. This dissertation examines the alternate possibility of solving such problems exactly, through a detailed study of one particular problem, the maximum clique problem. It discusses algorithms, implementations, and the application of maximum clique results to real-world problems. First, the theoretical roots of the algorithmic method employed are discussed. Then a practical approach is described, which separates out important algorithmic decisions so that the algorithm can be easily …
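
For readers unfamiliar with the problem, the idea of solving maximum clique exactly rather than heuristically can be illustrated with a small branch-and-bound search. This is a minimal sketch and not the dissertation's algorithm: the function name `maximum_clique`, the adjacency-set graph representation, and the simple size-based bound are all assumptions chosen for illustration.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # vertex -> set of neighbours (undirected, no self-loops)


def maximum_clique(graph: Graph) -> Set[Hashable]:
    """Return one maximum clique of `graph` using a basic exact branch-and-bound search."""
    best: Set[Hashable] = set()

    def expand(clique: Set[Hashable], candidates: Set[Hashable]) -> None:
        nonlocal best
        if len(clique) > len(best):
            best = set(clique)
        # Try high-degree vertices first; the order is fixed once per call.
        for v in sorted(candidates, key=lambda u: len(graph[u]), reverse=True):
            # Bound: even adding every remaining candidate cannot beat the best so far.
            if len(clique) + len(candidates) <= len(best):
                return
            # Only neighbours of v can extend a clique that contains v.
            expand(clique | {v}, candidates & graph[v])
            # All cliques containing v have been explored; drop v from this branch.
            candidates = candidates - {v}

    expand(set(), set(graph))
    return best


# Hypothetical example graph: a triangle a-b-c with a tail c-d-e.
example: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
print(sorted(maximum_clique(example)))  # ['a', 'b', 'c']
```

The search is exponential in the worst case, which is why practical exact solvers rely on much stronger bounds and preprocessing than the size check used here.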