Open Access. Powered by Scholars. Published by Universities.®

Computer and Systems Architecture Commons

Articles 1 - 14 of 14

Full-Text Articles in Computer and Systems Architecture

Protecting Return Address Integrity For RISC-V Via Pointer Authentication, Yuhe Zhao Mar 2024

Masters Theses

Embedded systems based on lightweight microprocessors are becoming more prevalent in various applications. However, their security remains a significant challenge because of their limited resources and exposure to external threats. In particular, some of these devices store sensitive data and control critical devices, making them high-value targets for attackers. Software security is particularly important because attackers can easily reach these devices over the internet and take control of them by injecting malware.

Return address (RA) hijacking, as used in return-to-libc attacks, is a common software attack technique that compromises control flow integrity (CFI) by manipulating memory. Several methods have …


Extracting DNN Architectures Via Runtime Profiling On Mobile GPUs, Dong Hyub Kim Mar 2024

Masters Theses

Due to significant investment, research, and development efforts over the past decade, deep neural networks (DNNs) have achieved notable advancements in classification and regression domains. As a result, DNNs are considered valuable intellectual property for artificial intelligence providers. Prior work has demonstrated highly effective model extraction attacks that steal a DNN, dismantling the provider’s business model and paving the way for unethical or malicious activities, such as misuse of personal data, safety risks in critical systems, or spreading misinformation. This thesis explores the feasibility of model extraction attacks on mobile devices using aggregated runtime profiles as a side channel to leak …


Action: Adaptive Cache Block Migration In Distributed Cache Architectures, Chandra Sekhar Mummidi Oct 2021

Masters Theses

The increasing number of cores in chip multiprocessors (CMPs) results in increasing traffic to the last-level cache (LLC). Without a commensurate increase in LLC bandwidth, such traffic cannot be sustained, resulting in a loss of performance. Further, as the number of cores increases, it is necessary to scale up the LLC size; otherwise, the LLC miss rate will rise, resulting in a loss of performance. Unfortunately, for a unified LLC with uniform cache access time, access latency increases with cache size, resulting in performance loss. Previously, researchers have proposed partitioning the cache into multiple smaller caches interconnected by a communication network which increases aggregate …


Internet Infrastructures For Large Scale Emulation With Efficient HW/SW Co-Design, Aiden K. Gula Oct 2021

Masters Theses

Connected systems are becoming more ingrained in our daily lives with the advent of cloud computing, the Internet of Things (IoT), and artificial intelligence. As technology progresses, we expect the number of networked systems to rise along with their complexity. As these systems become harder to understand, it becomes paramount to study their interactions and nuances. In particular, Mobile Ad hoc Networks (MANETs) and swarm communication systems exhibit added complexity due to a multitude of environmental and physical conditions. Testing these types of systems is challenging and incurs high engineering and deployment costs. In this work, we propose a scalable MANET emulation …


Network Virtualization And Emulation Using Docker, Openvswitch And Mininet-Based Link Emulation, Narendra Prabhu Dec 2020

Masters Theses

With the advent of virtualization and artificial intelligence, research on networked systems has progressed substantially. As the technology progresses, we expect a boom not only in systems research but also in the network-of-systems domain. It is paramount that we understand and develop methodologies to connect and communicate among the plethora of devices and systems that exist today. One such area is mobile ad hoc and space communication, which further complicates the task of networking due to a myriad of environmental and physical conditions. Developing and testing such systems is an important step considering the large investment required to build …


Sundown: Model-Driven Per-Panel Solar Anomaly Detection For Residential Arrays, Menghong Feng Jul 2020

Masters Theses

There has been significant growth in both utility-scale and residential-scale solar installations in recent years, driven by rapid technology improvements and falling prices. Unlike utility-scale solar farms that are professionally managed and maintained, smaller residential-scale installations often lack sensing and instrumentation for performance monitoring and fault detection. As a result, faults may go undetected for long periods of time, resulting in generation and revenue losses for the homeowner. In this thesis, we present SunDown, a sensorless approach designed to detect per-panel faults in residential solar arrays. SunDown does not require any new sensors for its fault detection and …


Analog Computing Using 1T1R Crossbar Arrays, Yunning Li Mar 2018

Masters Theses

The memristor is a novel passive electronic device and a promising candidate for next-generation non-volatile memory and analog computing. This study explores analog computing based on memristors. Because commercial electrical testing instruments for these emerging devices and crossbar arrays are lacking, we have designed and built testing circuits to implement analog and parallel computing operations. With the setup developed in this study, we have successfully demonstrated image processing functions utilizing large memristor crossbar arrays. We further designed and experimentally demonstrated the first memristor-based field programmable analog array (FPAA), which was successfully configured for audio …


Analyzing Spark Performance On Spot Instances, Jiannan Tian Oct 2017

Masters Theses

Amazon Spot Instances provide an inexpensive service for high-performance computing. With spot instances, it is possible to obtain discounts of up to 90% by bidding on spare Amazon Elastic Compute Cloud (Amazon EC2) instances. In exchange for the low cost, spot instances reduce the reliability of the computing environment, because such an instance can be revoked abruptly by the provider in response to supply and demand, with higher-priority customers served first.

To achieve high performance on instances with compromised reliability, Spark is used to run jobs. In this thesis, a wide set of Spark experiments is conducted to …


Efficient Scaling Of A Web Proxy Cluster, Hao Zhang Oct 2017

Masters Theses

With the continuing growth in network traffic and increasing diversity in web content, web caching, together with various network functions (NFs), has been introduced to enhance security, optimize network performance, and save expenses. In a large enterprise network with tens of thousands of users or more, a single proxy server cannot handle the large number of requests, so the load is handled by a group of proxies. When multiple web cache proxies work as a cluster, they communicate with each other and share cached objects using the Internet Cache Protocol (ICP). This leads to poor scalability.

This thesis describes the development …


Accelerated Iterative Algorithms With Asynchronous Accumulative Updates On A Heterogeneous Cluster, Sandesh Gubbi Virupaksha Mar 2016

Masters Theses

In recent years, with the exponential growth in web-based applications, the amount of data generated has increased tremendously. Quick and accurate analysis of this 'big data' is indispensable for making better business decisions and reducing operational cost. The challenges faced by modern data centers in processing big data are manifold: keeping pace with increased data volume and data velocity, dealing with system scalability, and reducing energy costs. Today's data centers employ a variety of distributed computing frameworks running on a cluster of commodity hardware which include general purpose processors to process big …


Processor Temperature And Reliability Estimation Using Activity Counters, Mayank Chhablani Mar 2016

Masters Theses

With the advent of technology scaling, lifetime reliability is an emerging concern in high-performance and deadline-critical systems. High on-chip thermal gradients accelerate localised thermal elevations (hotspots), which increase the aging rate of the semiconductor devices. As a result, reliable operation of the processors has become a challenging task. Therefore, cost-effective schemes for estimating temperature and reliability are crucial. In this work, we present a reliability estimation scheme that is based on a lightweight temperature estimation technique that monitors hardware events. Unlike previously proposed hardware counter-based approaches, our approach involves a linear-temporal-feedback estimator, taking into account the effects of …


Modifying Instruction Sets In The Gem5 Simulator To Support Fault Tolerant Designs, Chuan Zhang Nov 2015

Masters Theses

Traditional fault-tolerant techniques such as hardware or time redundancy incur high overhead and are inefficient for checking arithmetic operations. Our objective is to study an alternative approach of adding new instructions to check arithmetic operations. These checking instructions either rely on error-detecting codes or calculate approximate results and, consequently, consume much less execution time. To evaluate the effectiveness of such an approach, we wish to modify several benchmarks to use checking instructions and run simulation experiments to measure their execution time and memory usage. However, the checking instructions are not included in the instruction set and as …


Energy Efficiency Exploration Of Coarse-Grain Reconfigurable Architecture With Emerging Nonvolatile Memory, Xiaobin Liu Mar 2015

Masters Theses

With the rapid growth in consumer electronics, people expect thin, smart and powerful devices, e.g. Google Glass and other wearable devices. However, as portable electronic products become smaller, energy consumption becomes an issue that limits the development of portable systems due to battery lifetime. In general, simply reducing device size cannot fully address the energy issue.

To tackle this problem, we propose an on-chip interconnect infrastructure and program storage structure for a coarse-grained reconfigurable architecture (CGRA) with emerging non-volatile embedded memory (MRAM). The interconnect is composed of a matrix of time-multiplexed switchboxes which can be dynamically reconfigured with the …


Network-On-Chip Synchronization, Mark Buckler Nov 2014

Masters Theses

Technology scaling has enabled the number of cores within a System on Chip (SoC) to increase significantly. Globally Asynchronous Locally Synchronous (GALS) systems using Dynamic Voltage and Frequency Scaling (DVFS) operate each of these cores on distinct and dynamic clock domains. The main communication method between these cores is increasingly likely to be a Network-on-Chip (NoC). Typically, the interfaces between these clock domains experience multi-cycle synchronization latencies due to their use of “brute-force” synchronizers. This dissertation aims to improve the performance of NoCs, and thereby of SoCs as a whole, by reducing this synchronization latency.

First, a survey of NoC …