Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering

PDF

Master's Theses

CUDA

Publication Year

Articles 1 - 5 of 5

Full-Text Articles in Entire DC Network

Astro – A Low-Cost, Low-Power Cluster For Cpu-Gpu Hybrid Computing Using The Jetson Tk1, Sean Kai Sheen Jun 2016

Astro – A Low-Cost, Low-Power Cluster For Cpu-Gpu Hybrid Computing Using The Jetson Tk1, Sean Kai Sheen

Master's Theses

With the rising costs of large scale distributed systems many researchers have began looking at utilizing low power architectures for clusters. In this paper, we describe our Astro cluster, which consists of 46 NVIDIA Jetson TK1 nodes each equipped with an ARM Cortex A15 CPU, 192 core Kepler GPU, 2 GB of RAM, and 16 GB of flash storage. The cluster has a number of advantages when compared to conventional clusters including lower power usage, ambient cooling, shared memory between the CPU and GPU, and affordability. The cluster is built using commodity hardware and can be setup for relatively low …


Optimizing Lempel-Ziv Factorization For The Gpu Architecture, Bryan Ching Jun 2014

Optimizing Lempel-Ziv Factorization For The Gpu Architecture, Bryan Ching

Master's Theses

Lossless data compression is used to reduce storage requirements, allowing for the relief of I/O channels and better utilization of bandwidth. The Lempel-Ziv lossless compression algorithms form the basis for many of the most commonly used compression schemes. General purpose computing on graphic processing units (GPGPUs) allows us to take advantage of the massively parallel nature of GPUs for computations other that their original purpose of rendering graphics. Our work targets the use of GPUs for general lossless data compression. Specifically, we developed and ported an algorithm that constructs the Lempel-Ziv factorization directly on the GPU. Our implementation bypasses the …


Paris: A Parallel Rsa-Prime Inspection Tool, Joseph R. White Jun 2013

Paris: A Parallel Rsa-Prime Inspection Tool, Joseph R. White

Master's Theses

Modern-day computer security relies heavily on cryptography as a means to protect the data that we have become increasingly reliant on. As the Internet becomes more ubiquitous, methods of security must be better than ever. Validation tools can be leveraged to help increase our confidence and accountability for methods we employ to secure our systems.

Security validation, however, can be difficult and time-consuming. As our computational ability increases, calculations that were once considered “hard” due to length of computation, can now be done in minutes. We are constantly increasing the size of our keys and attempting to make computations harder …


Cuda Enhanced Filtering In A Pipelined Video Processing Framework, Austin Aaron Dworaczyk Wiltshire Jun 2013

Cuda Enhanced Filtering In A Pipelined Video Processing Framework, Austin Aaron Dworaczyk Wiltshire

Master's Theses

The processing of digital video has long been a significant computational task for modern x86 processors. With every video frame composed of one to three planes, each consisting of a two-dimensional array of pixel data, and a video clip comprising of thousands of such frames, the sheer volume of data is significant. With the introduction of new high definition video formats such as 4K or stereoscopic 3D, the volume of uncompressed frame data is growing ever larger.

Modern CPUs offer performance enhancements for processing digital video through SIMD instructions such as SSE2 or AVX. However, even with these instruction sets, …


Cuda Web Api Remote Execution Of Cuda Kernels Using Web Services, Massimo J. Becker Jun 2012

Cuda Web Api Remote Execution Of Cuda Kernels Using Web Services, Massimo J. Becker

Master's Theses

Massively parallel programming is an increasingly growing field with the recent introduction of general purpose GPU computing. Modern graphics processors from NVIDIA and AMD have massively parallel architectures that can be used for such applications as 3D rendering, financial analysis, physics simulations, and biomedical analysis. These massively parallel systems are exposed to programmers through in- terfaces such as NVIDIAs CUDA, OpenCL, and Microsofts C++ AMP. These frame- works expose functionality using primarily either C or C++. In order to use these massively parallel frameworks, programs being implemented must be run on machines equipped with massively parallel hardware. These requirements limit …