Systems Architecture | Open Access Articles | Digital Commons Network™

Exploring High Performance And Energy Efficient Graph Processing On GPU, Robert P. Watling Jan 2023

Dissertations, Master's Theses and Master's Reports

Parallel graph processing is central to analytical computer science applications, and GPUs have proven to be an ideal platform for parallel graph processing. Existing GPU graph processing frameworks present performance improvements but often neglect two issues: the unpredictability of a given input graph and the energy consumption of the graph processing. Our prototype software, EEGraph (Energy Efficiency of Graph processing), is a flexible system consisting of several graph processing algorithms with configurable parameters for vertex update synchronization, vertex activation, and memory management along with a lightweight software-based GPU energy measurement scheme. We observe relationships between different configurations of our software, …

Virtual Machine Introspection Tool Design Analysis, Justin Martin Jan 2022

Dissertations, Master's Theses and Master's Reports

Virtual machines are an integral part of today’s computing world. Their use is widespread and applicable in many different computing fields. With virtual machines, the ability to introspect and monitor is often overlooked or left unimplemented. Introspection is used to gather information about the state of virtual machines as they operate. Without introspection, verbose log data and state information are unavailable after unexpected errors or crashes occur. With introspection, this data can be analyzed further to determine the true cause of the unexpected crash or error. Therefore, introspection plays a critical role in portraying accurate historical information regarding the operating …

Poor Man’s Trace Cache: A Variable Delay Slot Architecture, Tino C. Moore Jan 2022

Dissertations, Master's Theses and Master's Reports

We introduce a novel fetch architecture called Poor Man’s Trace Cache (PMTC). PMTC constructs taken-path instruction traces via instruction replication in static code and inserts them after unconditional direct and select conditional direct control transfer instructions. These traces extend to the end of the cache line. Since available space for trace insertion may vary by the position of the control transfer instruction within the line, we refer to these fetch slots as variable delay slots. This approach ensures traces are fetched along with the control transfer instruction that initiated the trace. Branch, jump and return instruction semantics as well as …

Efficient Modeling Of Random Sampling-Based LRU Cache, Junyao Yang Jan 2021

Dissertations, Master's Theses and Master's Reports

The Miss Ratio Curve (MRC) is an important metric and effective tool for caching system performance prediction and optimization. Since the Least Recently Used (LRU) replacement policy is the de facto policy for many existing caching systems, most previous studies on efficient MRC construction focus on the LRU replacement policy. Recently, the random sampling-based replacement mechanism, as opposed to replacement relying on the rigid LRU data structure, has gained popularity due to its light weight and flexibility. To approximate LRU, at replacement times, the system randomly selects K objects and replaces the least recently used object among the sample. …
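The sampling-based replacement described in this abstract can be illustrated with a minimal Python sketch. It is not the dissertation's model or code: the class name `SampledLRUCache`, the helper `miss_ratio`, and the logical-clock bookkeeping are assumptions made for illustration only.

```python
import random


class SampledLRUCache:
    """Approximate LRU: on eviction, sample K resident keys uniformly at
    random and evict the least recently used key among the sample.
    (Hypothetical illustration; names are not from the dissertation.)"""

    def __init__(self, capacity, k, seed=None):
        if capacity < 1 or k < 1:
            raise ValueError("capacity and k must be positive")
        self.capacity = capacity
        self.k = k
        self.clock = 0            # logical time, advanced on every access
        self.last_access = {}     # key -> logical time of its last access
        self.keys = []            # resident keys, kept in a list for sampling
        self.pos = {}             # key -> index in self.keys
        self.rng = random.Random(seed)

    def access(self, key):
        """Record one access; return True on a hit, False on a miss."""
        self.clock += 1
        if key in self.last_access:
            self.last_access[key] = self.clock
            return True
        if len(self.keys) >= self.capacity:
            self._evict()
        self.pos[key] = len(self.keys)
        self.keys.append(key)
        self.last_access[key] = self.clock
        return False

    def _evict(self):
        # Draw K distinct resident keys (or all of them if fewer are resident).
        sample = self.rng.sample(self.keys, min(self.k, len(self.keys)))
        victim = min(sample, key=self.last_access.__getitem__)
        # Swap-remove the victim from the key list in O(1).
        i = self.pos.pop(victim)
        last = self.keys.pop()
        if last != victim:
            self.keys[i] = last
            self.pos[last] = i
        del self.last_access[victim]


def miss_ratio(trace, capacity, k, seed=0):
    """Fraction of accesses in `trace` that miss for one cache size."""
    cache = SampledLRUCache(capacity, k, seed)
    misses = sum(not cache.access(x) for x in trace)
    return misses / len(trace)
```

Calling `miss_ratio` over a range of capacities yields an empirical miss ratio curve by brute-force simulation, which is the costly baseline that efficient MRC modeling aims to avoid. When K equals the cache capacity, the sample covers every resident object and the policy reduces to exact LRU.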

Demand-Driven Execution Using Future Gated Single Assignment Form, Omkar Javeri Jan 2020

Dissertations, Master's Theses and Master's Reports

This dissertation discusses a novel, previously unexplored execution model called Demand-Driven Execution (DDE), which executes programs starting from the outputs of the program, progressing towards the inputs of the program. This approach is significantly different from prior demand-driven reduction machines as it can execute a program written in an imperative language using the demand-driven paradigm while extracting both instruction-level and data-level parallelism. The execution model relies on an executable Single Assignment Form which serves as both the internal representation of the compiler and the Instruction Set Architecture (ISA) of the machine. This work develops the instruction set …
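As a rough illustration of evaluating from outputs toward inputs, the following Python sketch demands a single output of a small single-assignment program and computes only the values that output depends on. It is a generic memoized demand-driven evaluator, not the Future Gated Single Assignment Form or the ISA developed in the dissertation; the `PROGRAM` table and the `demand` function are hypothetical.

```python
import operator

# Hypothetical single-assignment program: every name is defined exactly once.
# ("input", None) marks a program input; other entries are (op, operand names).
PROGRAM = {
    "a": ("input", None),
    "b": ("input", None),
    "c": ("input", None),
    "t1": (operator.add, ("a", "b")),
    "t2": (operator.mul, ("t1", "t1")),
    "unused": (operator.mul, ("c", "c")),  # never demanded by "out"
    "out": (operator.sub, ("t2", "a")),
}


def demand(name, program, inputs, memo, trace):
    """Evaluate `name` by first demanding its operands, caching each result.

    `trace` records the order in which values are actually produced."""
    if name in memo:
        return memo[name]
    op, operands = program[name]
    if op == "input":
        value = inputs[name]
    else:
        value = op(*(demand(x, program, inputs, memo, trace) for x in operands))
    memo[name] = value
    trace.append(name)
    return value
```

Demanding `"out"` pulls in `t2`, `t1`, `a`, and `b`, while `unused` and `c` are never evaluated because nothing on the demanded path refers to them.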

Contextual Bandit Modeling For Dynamic Runtime Control In Computer Systems, Jason Hiebel Jan 2019

Dissertations, Master's Theses and Master's Reports

Modern operating systems and microarchitectures provide a myriad of mechanisms for monitoring and affecting system operation and resource utilization at runtime. Dynamic runtime control of these mechanisms can tailor system operation to the characteristics and behavior of the current workload, resulting in improved performance. However, developing effective models for system control can be challenging. Existing methods often require extensive manual effort, computation time, and domain knowledge to identify relevant low-level performance metrics, relate low-level performance metrics and high-level control decisions to workload performance, and evaluate the resulting control models.

This dissertation develops a general framework, based on the contextual …

Modeling Data Center Co-Tenancy Performance Interference, Wei Kuang Jan 2018

Dissertations, Master's Theses and Master's Reports

A multi-core machine allows several applications to execute simultaneously. Those jobs are scheduled on different cores and compete for shared resources such as the last-level cache and memory bandwidth. Such competition can cause performance degradation. Data centers often utilize virtualization to provide a certain level of performance isolation. However, some of the shared resources cannot be divided, even in a virtualized system, to ensure complete isolation. If the performance degradation of co-tenancy is not known to the cloud administrator, a data center often has to dedicate a whole machine to a latency-sensitive application to guarantee its quality of service. Co-run …
