Systems Architecture | Open Access Articles | Digital Commons Network™

Out-Of-Core GPU Path Tracing On Large Instanced Scenes Via Geometry Streaming, Jeremy Berchtold Jun 2022

Master's Theses

We present a technique for out-of-core GPU path tracing of arbitrarily large scenes that is compatible with hardware-accelerated ray tracing. Our technique improves upon previous work by subdividing the scene spatially into streamable chunks that are loaded using a priority system that maximizes ray throughput and minimizes GPU memory usage. This allows for arbitrarily large scaling of scene complexity. Our system required under 19 minutes to render a solid-color version of Disney's Moana Island scene (39.3 million instances, 261.1 million unique quads, and 82.4 billion instanced quads) at a resolution of 1024x429 and 1024 spp on an RTX 5000 (24 GB memory …
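The abstract does not describe how the priority system ranks chunks, so the following is only a minimal sketch of one plausible policy, not the thesis's method: keep resident the chunks with the most waiting rays per byte of GPU memory, up to a fixed budget. The `Chunk` type, its fields, and `select_chunks` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str
    size_bytes: int    # GPU memory needed for the chunk's geometry (assumed > 0)
    pending_rays: int  # rays currently waiting to traverse this chunk


def select_chunks(chunks, budget_bytes):
    """Greedily pick chunks by waiting rays per byte until the budget is used up.

    Hypothetical policy for illustration only; returns (resident names, bytes used).
    """
    ranked = sorted(chunks, key=lambda c: c.pending_rays / c.size_bytes, reverse=True)
    resident, used = [], 0
    for chunk in ranked:
        if chunk.pending_rays == 0:
            continue  # no ray needs this chunk, so loading it would waste memory
        if used + chunk.size_bytes <= budget_bytes:
            resident.append(chunk.name)
            used += chunk.size_bytes
    return resident, used
```

A greedy ratio ranking is a simple stand-in here: it favors small chunks that unblock many rays, which matches the stated goals of high ray throughput and low memory use, but a real renderer would also have to account for transfer cost and eviction.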

Millipyde: A Cross-Platform Python Framework For Transparent GPU Acceleration, James B. Asbury Dec 2021

Master's Theses

The prevalence of general-purpose GPU computing continues to grow and tackle a wider variety of problems that benefit from GPU acceleration. This acceleration often suffers from a high barrier to entry, however, due to the complexity of software tools that closely map to the underlying GPU hardware, the fast-changing landscape of GPU environments, and the fragmentation of tools and languages that only support specific platforms. Because of this, new solutions will continue to be needed to make GPGPU acceleration more accessible to the developers who can benefit from it. AMD’s new cross-platform development ecosystem ROCm shows promise for developing applications and …

The Performance Cost Of Security, Lucy R. Bowen Jun 2019

Master's Theses

Historically, performance has been the most important feature when optimizing computer hardware. Modern processors are so highly optimized that every cycle of computation time matters. However, this practice of optimizing for performance at all costs has been called into question by new microarchitectural attacks such as Meltdown and Spectre. Microarchitectural attacks exploit the effects of microarchitectural components or optimizations in order to leak data to an attacker. These attacks have caused processor manufacturers to introduce performance-impacting mitigations in both software and silicon.

To investigate the performance impact of the various mitigations, a test suite of forty-seven different tests was created. …

Compiler Optimization Effects On Register Collisions, Jonathan S. Tan Jun 2018

Master's Theses

We often want a compiler to generate executable code that runs as fast as possible. One consideration toward this goal is to keep values in fast registers to limit the number of slower memory accesses that occur. When there are not enough physical registers available for use, values are "spilled" to the runtime stack. The need for spills is discovered during register allocation, in which values in use are mapped to physical registers. One factor in the efficacy of register allocation is the number of values in use at one time (register collisions). Register collision is affected by compiler optimizations that …
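To make "the number of values in use at one time" concrete, here is a minimal sketch that counts the peak number of simultaneously live values from a list of live intervals and compares it with the number of physical registers. It assumes each value is live over a single half-open range of instruction indices, which is a simplification; the function names are hypothetical and this is not the measurement method used in the thesis.

```python
def max_live_values(intervals):
    """Return the peak number of values live at the same time.

    intervals: list of (start, end) instruction indices, with end exclusive.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # value becomes live
        events.append((end, -1))    # value dies
    # Tuples sort by position first; at equal positions -1 sorts before +1,
    # so a value ending at index i does not overlap one starting at i.
    events.sort()
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def excess_at_peak(intervals, num_registers):
    """Values that cannot be held in registers at the point of peak pressure."""
    return max(0, max_live_values(intervals) - num_registers)
```

For example, with intervals `[(0, 4), (1, 3), (2, 6), (3, 5)]` at most three values are live at once, so a machine with two registers would have to keep at least one of them on the stack at that point.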

Coffee: Context Observer For Fast Enthralling Entertainment, Anthony M. Lenz Jun 2014

Master's Theses

Desktops, laptops, smartphones, tablets, and the Kinect, oh my! With so many devices available to the average consumer, the limitations and pitfalls of each interface are becoming more apparent. Swimming in devices, users often have to stop and think about how to interact with each device to accomplish the task at hand. The goal of this thesis is to minimize user cognitive effort in handling multiple devices by creating a context-aware hybrid interface. The context-aware system will be explored through the hybridization of gesture and touch interfaces using a multi-touch coffee table and the next-generation Microsoft Kinect. …

In Perfect Xen, A Performance Study Of The Emerging Xen Scheduler, Ryan Hnarakis Dec 2013

Master's Theses

Fifty percent of Fortune 500 companies trust Xen, an open-source bare-metal hypervisor, to virtualize their websites and mission-critical services in the cloud. Providing superior fault tolerance, scalability, and migration, virtualization allows these companies to run several isolated operating systems simultaneously on the same physical server. These isolated operating systems, called virtual machines, require a virtual traffic guard to cooperate with one another. This guard, known as the Credit2 scheduler, was recently developed along with the newest Xen hypervisor to supersede the older schedulers. Since wasted CPU cycles can be costly, the Credit2 prototype must undergo significant performance validation before …

FlexRender: A Distributed Rendering Architecture For Ray Tracing Huge Scenes On Commodity Hardware, Robert Edward Somers Jun 2012

Master's Theses

As the quest for more realistic computer graphics marches steadily on, the demand for rich and detailed imagery is greater than ever. However, the current "sweet spot" in terms of price, power consumption, and performance is in commodity hardware. If we desire to render scenes with tens or hundreds of millions of polygons as cheaply as possible, we need a way of doing so that maximizes the use of the commodity hardware we already have at our disposal.

Techniques such as normal mapping and level of detail have attempted to address the problem by reducing the amount of geometry in …

CUDA Web API Remote Execution Of CUDA Kernels Using Web Services, Massimo J. Becker Jun 2012

Master's Theses

Massively parallel programming is a rapidly growing field with the recent introduction of general-purpose GPU computing. Modern graphics processors from NVIDIA and AMD have massively parallel architectures that can be used for such applications as 3D rendering, financial analysis, physics simulations, and biomedical analysis. These massively parallel systems are exposed to programmers through interfaces such as NVIDIA's CUDA, OpenCL, and Microsoft's C++ AMP. These frameworks expose functionality primarily through C or C++. In order to use these massively parallel frameworks, programs must be run on machines equipped with massively parallel hardware. These requirements limit …

St. Jude Medical: An Object-Oriented Software Architecture For Embedded And Real-Time Medical Devices, Atila Amiri Aug 2010

Master's Theses

Medical devices used for surgical or therapeutic purposes require a high degree of safety and effectiveness. Software is a critical component of many such medical devices. The software architecture of a system defines the organizational structure and runtime characteristics of the application that controls the system's operation, and provides a set of frameworks used to develop that application. As such, the design of software architecture is a critical element in achieving the intended functionality, performance, and safety requirements of a medical device. This architecture uses object-oriented design techniques, which model the underlying system as a set of …

Reducing Cluster Power Consumption By Dynamically Suspending Idle Nodes, Brian Michael Oppenheim Jun 2010

Master's Theses

Close to 1% of the world's electricity is consumed by computer servers. Given that the increased use of electricity raises costs and damages the environment, optimizing the world's computing infrastructure for power consumption is worthwhile. This thesis is one attempt at such an optimization. In particular, I began by building a cluster of six Intel Atom-based low-power nodes to perform work analogous to data center clusters. Then, I installed a version of Hadoop modified with a novel power management system on the cluster. The power management system uses different algorithms to determine when to turn off idle nodes in …
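The abstract does not say which algorithms the power management system uses, so the following is only a sketch of one simple candidate, a fixed idle timeout with a floor on the number of nodes kept awake. The function name, its parameters, and the policy itself are assumptions made for illustration, not the thesis's design.

```python
def nodes_to_suspend(last_active, now, idle_timeout, min_awake):
    """Pick nodes to suspend under a hypothetical idle-timeout policy.

    last_active: dict mapping node name to the time it last did work.
    Returns the longest-idle nodes first, never leaving fewer than
    min_awake nodes running.
    """
    idle = sorted(
        (name for name, t in last_active.items() if now - t >= idle_timeout),
        key=lambda name: last_active[name],  # oldest activity first
    )
    max_suspend = max(0, len(last_active) - min_awake)
    return idle[:max_suspend]
```

Keeping a minimum number of nodes awake is one way to avoid suspending the whole cluster during a quiet period, at the cost of some idle power draw.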
