Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Articles 1 - 30 of 235

Full-Text Articles in Physical Sciences and Mathematics

Modeling The Neutral Densities Of SPARC Using A Python Version Of KN1D, Gwendolyn R. Galleher May 2024


Undergraduate Honors Theses

Currently, neutral recycling is a crucial contributor to fueling the plasma within tokamaks. However, Commonwealth Fusion Systems' SPARC tokamak is expected to be more opaque to neutrals. Thus, we anticipate that the role of neutral recycling in fueling will decrease. Since SPARC is predicted to have a groundbreaking fusion power gain ratio of Q ≈ 10, we must have a concrete understanding of the opacity and whether or not alternative fueling practices must be included. To develop said understanding, we produced neutral density profiles via KN1DPy, a 1D kinetic neutral transport code for atomic and molecular hydrogen in an ionizing …


Security And Interpretability In Large Language Models, Lydia Danas May 2024


Undergraduate Honors Theses

Large Language Models (LLMs) have the capability to model long-term dependencies in sequences of tokens, and are consequently often utilized to generate text through language modeling. These capabilities are increasingly being used for code generation tasks; however, LLM-powered code generation tools such as GitHub's Copilot have been generating insecure code and thus pose a cybersecurity risk. To generate secure code we must first understand why LLMs are generating insecure code. This non-trivial task can be realized through interpretability methods, which investigate the hidden state of a neural network to explain model outputs. A new interpretability method is rationales, which obtains …


Improving The Scalability Of Neural Network Surface Code Decoders, Kevin Wu May 2024


Undergraduate Honors Theses

Quantum computers have recently gained significant recognition due to their ability to solve problems intractable to classical computers. However, due to difficulties in building actual quantum computers, they have large error rates. Thus, advancements in quantum error correction are urgently needed to improve both their reliability and scalability. Here, we first present a type of topological quantum error correction code called the surface code, and we discuss recent developments and challenges of creating neural network decoders for surface codes. In particular, the amount of training data needed to reach the performance of algorithmic decoders grows exponentially with the size of …


Code Syntax Understanding In Large Language Models, Cole Granger May 2024


Undergraduate Honors Theses

In recent years, tasks for automated software engineering have been achieved using Large Language Models trained on source code, such as Seq2Seq, LSTM, GPT, T5, BART and BERT. The inherent textual nature of source code allows it to be represented as a sequence of sub-words (or tokens), drawing parallels to prior work in NLP. Although these models have shown promising results according to established metrics (e.g., BLEU, CODEBLEU), there remains a deeper question about the extent of syntax knowledge they truly grasp when trained and fine-tuned for specific tasks.

To address this question, this thesis introduces a taxonomy of syntax …


Evaluating Large Language Model Performance On Haskell, Andrew Chen May 2024


Undergraduate Honors Theses

I introduce HaskellEval, a Haskell evaluation benchmark for Large Language Models. HaskellEval’s curation leverages a novel synthetic generation framework, streamlining the process of dataset curation by minimizing manual intervention. The core of this research is an extensive analysis of the trustworthiness of synthetic generations, ensuring accuracy, realism, and diversity. Additionally, I provide a comprehensive evaluation of existing open-source models on HaskellEval.


Artificial Intelligence For The Electron Ion Collider (AI4EIC), C. Allaire, ..., Cristiano Fanelli, James Giroux, Joey Niestroy, Justin R. Stevens, Patrick Stone, L. Suarez, K. Suresh, Eric Walter, et al. Feb 2024


Arts & Sciences Articles

The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. This workshop is not only beneficial for the EIC, but also provides valuable insights for the newly established ePIC collaboration at EIC. …


Identification Of Transient Radio Frequency Occlusion Events In Urban Environments, Margaret M. Rooney, Mark Hinders Jul 2023


Arts & Sciences Articles

We model the propagation of SHF OFDM signals around vehicles and buildings since these are the most common elements present in urban environments that could lead to complex radio frequency signal scattering. Scenarios involving temporary hidden node situations, which we term transient occlusion events, are simulated and compared to scenarios where a line of sight transmission event occurs. Sets of fingerprints generated from signals recorded in full-wave 3D finite difference time domain simulations of these two different types of situations are compared, and features in the fingerprints corresponding to the occlusion of a transmitted signal by a vehicle or a …


Power Profiling Smart Home Devices, Kailai Cui May 2023


Undergraduate Honors Theses

In recent years, the growing market for smart home devices has raised concerns about user privacy and security. Previous works have utilized power auditing measures to infer activity of IoT devices to mitigate security and privacy threats.

In this thesis, we explore the potential of extracting information from the power consumption traces of smart home devices. We present a framework that collects smart home devices’ power traces with current sensors and preprocesses them for effective inference. We collect an extensive dataset of duration > 2h from 6 devices, including smart speakers, a smart camera, and a smart display. We perform different classification tasks …


Kfactorvae: Self-Supervised Regularization For Better A.I. Disentanglement, Joseph S. Lee May 2023


Undergraduate Honors Theses

Obtaining disentangled representations is a goal sought after to make A.I. models more interpretable. Studies have proven the impossibility of obtaining these kinds of representations with just unsupervised learning, or in other words, without strong inductive biases. One strong inductive bias is a regularization term that encourages the invariance of factors of variations across an image and a carefully selected augmentation. In this thesis, we build upon the existing Variational Autoencoder (VAE)-based disentanglement literature by utilizing the aforementioned inductive bias. We evaluate our method on the dSprites dataset, a well-known benchmark, and demonstrate its ability to achieve comparable or higher …


Identifying Social Media Users That Are Susceptible To Phishing Attacks, Zoe Metzger May 2023


Undergraduate Honors Theses

Phishing scams are a billion-dollar problem. According to Threatpost, in 2020, business email compromise phishing attacks cost the US economy $1.8 billion. Social media phishing scams are also on the rise, with 74% of companies experiencing social media attacks in 2021 according to Proofpoint. Educating users about phishing scams is an effective strategy for reducing phishing attacks. Despite efforts to combat phishing, the number of attacks continues to rise, likely indicative of a reticence of users to change online behaviors. Existing research into predicting vulnerable social media users that are susceptible to phishing mostly focuses on content analysis of …


Appearance Driven Reflectance Modeling, James Christopher Bieron Jan 2023


Dissertations, Theses, and Masters Projects

Creating realistic computer generated imagery is essential for modern movies and video games. Recreating the appearance of materials is integral to generating such photo-realistic images. While the problem of how to model materials is well studied, here we will focus on the question of how to recreate the appearance of specific materials found in the real world. In this dissertation we will begin with a short introduction to rendering, followed by a discussion of various material models, techniques for measuring reflectance, and strategies for fitting these models to reflectance data. We will then introduce a novel two-stage process for fitting, …


Learning-Based Ubiquitous Sensing For Solving Real-World Problems, Woosub Jung Jan 2023


Dissertations, Theses, and Masters Projects

Recently, as the Internet of Things (IoT) technology has become smaller and cheaper, ubiquitous sensing ability within these devices has become increasingly accessible. Learning methods have also become more complex in the field of computer science accordingly. However, there remains a gap between these learning approaches and many problems in other disciplinary fields. In this dissertation, I investigate four different learning-based studies via ubiquitous sensing for solving real-world problems, such as in IoT security, athletics, and healthcare. First, I designed an online intrusion detection system for IoT devices via power auditing. To realize the real-time system, I created a …


Recoverable Memory Bank For Class-Incremental Learning, Jiangtao Kong Jan 2023


Dissertations, Theses, and Masters Projects

Incremental learning aims to enable machine learning systems to sequentially learn new tasks without forgetting the old ones. While some existing methods, such as data replay-based and parameter isolation-based approaches, achieve remarkable results in incremental learning, they often suffer from memory limits, privacy issues, or generation instability. To address these problems, we propose Recoverable Memory Bank (RMB), a novel non-exemplar-based approach for class incremental learning (CIL). Specifically, we design a dynamic memory bank that stores only one aggregated memory representing each class of the old tasks. Next, we propose a novel method that combines a high-dimensional space rotation matrix and …


A Comprehensive Study Of Bills Of Materials For Software Systems, Trevor Stalnaker Jan 2023


Dissertations, Theses, and Masters Projects

Software Bills of Materials (SBOMs) have emerged as tools to facilitate the management of software dependencies, vulnerabilities, licenses, and the supply chain. Significant effort has been devoted to increasing SBOM awareness and developing SBOM formats and tools. Despite this effort, recent studies have shown that SBOMs are still an early-stage technology not yet adequately adopted in practice, mainly due to limited SBOM tooling and lack of industry consensus on SBOM content, tool usage, and practical benefits. Expanding on previous research, this paper reports a comprehensive study that first investigates the current challenges stakeholders encounter when creating and using SBOMs. The …


MatFusion: A Generative Diffusion Model For SVBRDF Capture, Samuel Lee Sartor Jan 2023


Dissertations, Theses, and Masters Projects

We formulate SVBRDF estimation from photographs as a diffusion task. To model the distribution of spatially varying materials, we first train a novel unconditional SVBRDF diffusion backbone model on a large set of 312,165 synthetic spatially varying material exemplars. This SVBRDF diffusion backbone model, named MatFusion, can then serve as a basis for refining a conditional diffusion model to estimate the material properties from a photograph under controlled or uncontrolled lighting. Our backbone MatFusion model is trained using only a loss on the reflectance properties, and therefore refinement can be paired with more expensive rendering methods without the need for …


Efficient Parallelization Of Irregular Applications On GPU Architectures, Qihan Wang Jan 2023


Dissertations, Theses, and Masters Projects

With the growing computation capacity of general-purpose Graphics Processing Units (GPUs), leveraging GPUs to accelerate parallel applications has become a critical topic in academia and industry. However, a wide range of irregular applications with a computation-/memory-intensive nature cannot easily achieve high GPU utilization. The challenges mainly involve the following aspects: first, data dependence leads to a coarse-grained kernel; second, heavy GPU memory usage may cause frequent memory evictions and extra I/O overhead; third, specific computation patterns produce memory redundancies; last, workload balance and data reusability jointly benefit the overall performance, but there may exist a dynamic trade-off between them. …


Intelligent Software Tooling For Improving Software Development, Nathan Allen Cooper Jan 2023


Dissertations, Theses, and Masters Projects

Software has eaten the world: many of the necessities and quality-of-life services people use require software. Therefore, tools that improve the software development experience, such as tools for generating code and test cases, detecting bugs, and answering questions, can have a significant impact on the world. The success of Deep Learning (DL) over the past decade has shown huge advancements in automation across many domains, including Software Development processes. One of the main reasons behind this success is the availability of large datasets such as open-source code available through GitHub or image datasets of mobile Graphical User Interfaces …


Program Analysis For Software Engineers And Students, Jialiang Tan Jan 2023


Dissertations, Theses, and Masters Projects

Software inefficiencies are inevitable in computer systems. At the code level, software packages have become increasingly complex: they comprise a large amount of source code, sophisticated control and data flow, and growing levels of abstraction. This complexity often introduces inefficiencies across software stacks, leading to performance degradation. At the resource level, the evolution of hardware outpaces the performance optimization of software, leading to resource wastage and energy dissipation in emerging architectures. To better understand program behaviors, software developers take advantage of performance profiling tools. Existing profiling techniques, whether fine-grained or coarse-grained, focus on identifying hotspots, which …


Exploring Software Licensing Issues Faced By Legal Practitioners, Nathan James Wintersgill Jan 2023


Dissertations, Theses, and Masters Projects

Most modern software products incorporate open source components, which requires compliance with each component’s licenses. As noncompliance can lead to significant repercussions, organizations often seek advice from legal practitioners to maintain license compliance, address licensing issues, and manage the risks of noncompliance. While legal practitioners play a critical role in the process, little is known in the software engineering community about their experiences within the open source license compliance ecosystem. To fill this knowledge gap, a joint team of software engineering and legal researchers designed and conducted a survey with 30 legal practitioners and related occupations and then held 16 …


A Reevaluation Of Why Crypto-Detectors Fail: A Systematic Revaluation Of Cryptographic Misuse Detection Techniques, Scott Marsden Jan 2023


Dissertations, Theses, and Masters Projects

The correct use of cryptography is central to ensuring data security in modern software systems. Hence, several academic and commercial static analysis tools have been developed for detecting and mitigating crypto-API misuse. While developers are optimistically adopting these crypto-API misuse detectors (or crypto-detectors) in their software development cycles, this momentum must be accompanied by a rigorous understanding of their effectiveness at finding crypto-API misuse in practice. The original paper presents the MASC framework, which enables a systematic and data-driven evaluation of crypto-detectors using mutation testing. MASC was grounded in a comprehensive view of the problem space by developing a data-driven …


Domain-Specific Optimization For Machine Learning System, Yu Chen Jan 2023


Dissertations, Theses, and Masters Projects

Machine learning (ML) systems have become an indispensable part of the ML ecosystem in recent years. The rapid growth of ML brings new system challenges, such as the need to handle larger-scale data and computation and the requirements for higher execution performance and lower resource usage, stimulating the demand for improving ML systems. General-purpose system optimization is widely used but brings limited benefits because ML applications vary in execution behaviors based on their algorithms, input data, and configurations. It is difficult to perform comprehensive ML system optimizations without application-specific information. Therefore, domain-specific optimization, a method that optimizes particular types …


Deep Learning Fusion Of Satellite And Social Information To Estimate Human Migratory Flows, Daniel Runfola, Heather Baier, Laura Mills, Maeve Naughton-Rockwell, Anthony Stefanidis Sep 2022


Arts & Sciences Articles

Human migratory decisions are driven by a wide range of factors, including economic and environmental conditions, conflict, and evolving social dynamics. These factors are reflected in disparate data sources, including household surveys, satellite imagery, and even news and social media. Here, we present a deep learning-based data fusion technique integrating satellite and census data to estimate migratory flows from Mexico to the United States. We leverage a three-stage approach, in which we (1) construct a matrix-based representation of socioeconomic information for each municipality in Mexico, (2) implement a convolutional neural network with both satellite imagery and the constructed …


Quantum Federated Learning: Training Hybrid Neural Networks Collaboratively, Anneliese Brei May 2022


Undergraduate Honors Theses

This thesis explores basic concepts of machine learning, neural networks, federated learning, and quantum computing in an effort to better understand Quantum Machine Learning, an emerging field of research. We propose Quantum Federated Learning (QFL), a schema for collaborative distributed learning that maintains privacy and low communication costs. We demonstrate the QFL framework and local and global update algorithms with implementations that utilize TensorFlow Quantum libraries. Our experiments test the effectiveness of frameworks of different sizes. We also test the effect of changing the number of training cycles and changing distribution of training data. This thesis serves as a synoptic …
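The local and global update pattern that federated learning relies on can be illustrated with a small, purely classical sketch. This is not the thesis's QFL algorithm and uses no quantum layers or TensorFlow Quantum; it assumes plain federated averaging on a hypothetical one-parameter model y = w * x, and the names `local_update` and `federated_round` are invented for illustration. Clients train on their own data and send back only a weight, never the raw samples.

```python
def local_update(weight, data, lr=0.1, epochs=5):
    """Hypothetical client step: gradient descent on mean squared error for y = w * x."""
    w = weight
    for _ in range(epochs):
        # Gradient of the mean of (w*x - y)^2 over this client's samples.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(global_w, client_datasets):
    """Hypothetical server step: average client weights, weighted by sample count."""
    updates = [(local_update(global_w, d), len(d)) for d in client_datasets]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Two clients whose private data both follow y = 2x (illustrative values).
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
print(w)
```

Only the scalar weight crosses the client/server boundary in each round, which is the property that keeps communication costs low and raw data private in this toy setting.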


Exploring Multi-Level Parallelism For Graph-Based Applications Via Algorithm And System Co-Design, Zhen Peng Jan 2022


Dissertations, Theses, and Masters Projects

Graph processing is at the heart of many modern applications where graphs are used as the basic data structure to represent the entities of interest and the relationships between them. Improving the performance of graph-based applications, especially using parallelism techniques, has drawn significant interest both in academia and industry. On the one hand, modern CPU architectures are able to provide massive computational power by using sophisticated memory hierarchy and multi-level parallelism, including thread-level parallelism, data-level parallelism, etc. On the other hand, graph processing workloads are notoriously challenging for achieving high performance due to their irregular computation pattern and unpredictable control …


Enabling Practical Evaluation Of Privacy Of Commodity-IoT, Sunil Manandhar Jan 2022


Dissertations, Theses, and Masters Projects

There has been a massive shift towards the use of IoT products in recent years. While companies have come a long way in making these devices and services easily accessible to consumers, very little is known about the privacy issues pertaining to these devices. In this dissertation, we focus on evaluating privacy pertaining to commodity-IoT devices by studying the device usage behavior of consumers and the privacy disclosure practices of IoT vendors. Our analyses consider deep intricacies tied to the commodity-IoT domain, revealing insightful findings that help with building automated tools for large-scale analysis. We first present the design and …


Communication And Computation Efficient Deep Learning, Zeyi Tao Jan 2022


Dissertations, Theses, and Masters Projects

Recent advances in Artificial Intelligence (AI) are characterized by ever-increasing datasets and rapid growth of model complexity. Many modern machine learning models, especially deep neural networks (DNNs), cannot be efficiently carried out by a single machine. Hence, distributed optimization and inference have been widely adopted to tackle large-scale machine learning problems. Meanwhile, quantum computers that process computational tasks exponentially faster than classical machines offer an alternative solution for resource-intensive deep learning. However, there are two obstacles that hinder us from building large-scale DNNs on the distributed systems and quantum computers. First, when distributed systems scale to many nodes, the training …


Techniques For Accelerating Large-Scale Automata Processing, Hongyuan Liu Jan 2022


Dissertations, Theses, and Masters Projects

The big-data era has brought new challenges to computer architectures due to the large-scale computation and data. Moreover, this problem becomes critical in several domains where the computation is also irregular, among which we focus on automata processing in this dissertation. Automata are widely used in applications from different domains such as network intrusion detection, machine learning, and parsing. Large-scale automata processing is challenging for traditional von Neumann architectures. To this end, many accelerator prototypes have been proposed. Micron's Automata Processor (AP) is an example. However, as a spatial architecture, it is unable to handle large automata programs without repeated …


Practical GPGPU Application Resilience Estimation And Fortification, Lishan Yang Jan 2022


Dissertations, Theses, and Masters Projects

Graphics Processing Units (GPUs) are becoming a de facto solution for accelerating a wide range of applications but remain susceptible to transient hardware faults (soft errors) that can easily compromise application output. One of the major challenges in the domain of GPU reliability is to accurately measure general purpose GPU (GPGPU) application resilience to transient faults. This challenge stems from the fact that a typical GPGPU application spawns a huge number of threads and then utilizes a large amount of potentially unreliable compute and memory resources available on the GPUs. As the number of possible fault locations can be in …


Flexible And Robust Iterative Methods For The Partial Singular Value Decomposition, Steven Goldenberg Jan 2022


Dissertations, Theses, and Masters Projects

The Singular Value Decomposition (SVD) is one of the most fundamental matrix factorizations in linear algebra. As a generalization of the eigenvalue decomposition, the SVD is essential for a wide variety of fields including statistics, signal and image processing, chemistry, quantum physics and even weather prediction. The methods for numerically computing the SVD mostly fall under three main categories: direct, iterative, and streaming. Direct methods focus on solving the SVD in its entirety, making them suitable for smaller dense matrices where the computation cost is tractable. On the other end of the spectrum, streaming methods were created to provide an …
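To make the contrast between direct and iterative methods concrete, here is a minimal sketch of the simplest iterative approach: power iteration on AᵀA, which recovers only the largest singular triplet rather than the whole decomposition. It is a textbook illustration, not one of the dissertation's methods; the function name `top_singular_triplet` and the test matrix are assumptions chosen for the example.

```python
import math
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))


def top_singular_triplet(A, iters=200, seed=0):
    """Estimate (u, sigma, v) for the largest singular value of A by power iteration."""
    rng = random.Random(seed)
    At = [list(col) for col in zip(*A)]  # transpose of A
    v = [rng.gauss(0.0, 1.0) for _ in range(len(A[0]))]
    n = norm(v)
    v = [x / n for x in v]
    for _ in range(iters):
        w = matvec(At, matvec(A, v))  # apply A^T A without forming it
        n = norm(w)
        v = [x / n for x in w]
    Av = matvec(A, v)
    sigma = norm(Av)
    u = [x / sigma for x in Av]
    return u, sigma, v


# Illustrative 3x2 matrix whose singular values are 3 and 1.
A = [[3.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
u, sigma, v = top_singular_triplet(A)
print(sigma)
```

Each iteration needs only products with A and Aᵀ, which is why iterative methods of this family suit large sparse matrices where a full direct factorization would be too expensive.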


Static And Dynamic Analysis In Cryptographic-API Misuse Detection Of Mobile Application, Kunyang Li Dec 2021


Undergraduate Honors Theses

With Android devices becoming more advanced and gaining more popularity, the number of cryptographic-API misuses in mobile applications is escalating. Numerous snippets of code in Android are from Stack Overflow and over 90% of them contain several crypto-issues. Various crypto-misuse detectors have emerged, aiming to report vulnerabilities of apps and better secure users’ privacy. These detectors can be broadly classified into two categories based on the analysis strategies employed to catch misuses – static analysis (i.e., by scanning the code base) and dynamic analysis (i.e., by executing the code). However, there is not enough research comparing their underlying differences, …
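The static side of that classification can be sketched as a toy scanner that inspects source text without running it. This is a deliberately simplified illustration, not any of the detectors the thesis evaluates: the two rules, the `scan` function, and the sample snippet are assumptions made for the example, and real tools analyze data flow rather than matching single lines. `Cipher.getInstance` and `MessageDigest.getInstance` are the standard Java calls being matched.

```python
import re

# Hypothetical rule set: each entry is (pattern, message).
RULES = [
    (
        re.compile(r'Cipher\.getInstance\(\s*"(?:AES|DES)(?:/ECB[^"]*)?"\s*\)'),
        "cipher requested with ECB or unspecified mode",
    ),
    (
        re.compile(r'MessageDigest\.getInstance\(\s*"(?:MD5|SHA-?1)"\s*\)'),
        "weak hash algorithm requested",
    ),
]


def scan(source):
    """Return (line_number, message) pairs for every rule match in the source text."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for pattern, message in RULES:
            if pattern.search(line):
                findings.append((line_no, message))
    return findings


# Illustrative Java fragment; only lines 1 and 3 should be flagged.
sample = '''Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");
Cipher d = Cipher.getInstance("AES/GCM/NoPadding");
MessageDigest m = MessageDigest.getInstance("MD5");'''

findings = scan(sample)
for line_no, message in findings:
    print(line_no, message)
```

A line-based matcher like this misses any algorithm name that is built at run time or passed through a variable, which is one reason a dynamic analysis that executes the code can observe misuses a static scan cannot.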