Engineering | Open Access Articles | Digital Commons Network™

Accelerating Graphics Rendering On RISC-V GPUs, Joshua Simpson Jun 2022

Master's Theses

Graphics Processing Units (GPUs) are commonly used to accelerate massively parallel workloads across a wide range of applications, from machine learning to cryptocurrency mining. The original application for GPUs, however, was accelerating graphics rendering, which remains popular today through video gaming and video rendering. While GPUs began as fixed-function hardware with minimal programmability, modern GPUs have adopted a design with many programmable cores and supporting fixed-function hardware for rasterization, texture sampling, and render output tasks. This balance enables GPUs to be used for general-purpose computing while remaining adept at graphics rendering. Previous work at the …

Dynamic Dependency Collapsing, Görkem Aşılıoğlu Jan 2020

Dissertations, Master's Theses and Master's Reports

In this dissertation, we explore the concept of dynamic dependency collapsing. When the clock speed is fixed, performance increases in computer architecture are always introduced by exploiting additional parallelism. We show that further improvements are possible even when the available parallelism in programs is exhausted. This performance improvement comes from executing in parallel instructions that would ordinarily have been serialized. We call this concept dependency collapsing. We explore existing techniques that exploit parallelism and show which of them fall under the umbrella of dependency collapsing. We then introduce two dependency collapsing techniques of our own. The first technique …

Scalable High-Speed Communications For Neuromorphic Systems, Aaron Reed Young Aug 2017

Masters Theses

Field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and other chip/multi-chip level implementations can be used to implement Dynamic Adaptive Neural Network Arrays (DANNA). In some applications, DANNA interfaces with a traditional computing system to provide neural network configuration information, provide network input, process network outputs, and monitor the state of the network. The present host-to-DANNA network communication setup uses a Cypress USB 3.0 peripheral controller (FX3) to enable host-to-array communication over USB 3.0. This communications setup has to run commands in batches and does not have enough bandwidth to meet the maximum throughput requirements of the DANNA device, resulting …

Physically Equivalent Intelligent Systems For Reasoning Under Uncertainty At Nanoscale, Santosh Khasanvis Nov 2015

Doctoral Dissertations

Machines today lack the inherent ability to reason and make decisions, or to operate in the presence of uncertainty. Machine-learning methods such as Bayesian Networks (BNs) are widely acknowledged for their ability to uncover relationships and generate causal models for complex interactions. However, their massive computational requirements, when implemented on conventional computers, hinder their usefulness in many critical problem areas, e.g., the genetic basis of diseases, macro finance, text classification, and environment monitoring. We propose a new non-von Neumann technology framework purposefully architected across all layers for solving these problems efficiently through physical equivalence, enabled by emerging nanotechnology. The architecture builds …

High-Performance, Scalable Optical Network-On-Chip Architectures, Xianfang Tan Aug 2013

UNLV Theses, Dissertations, Professional Papers, and Capstones

The rapid advance of technology enables a large number of processing cores to be integrated into a single chip, a design called a Chip Multiprocessor (CMP) or a Multiprocessor System-on-Chip (MPSoC). The on-chip interconnection network, which is the communication infrastructure for these processing cores, plays a central role in a many-core system. With the continuously increasing complexity of many-core systems, traditional metallic wired electronic networks-on-chip (NoCs) have become a bottleneck because of their unbearable latency in data transmission and extremely high on-chip energy consumption. Optical networks-on-chip (ONoCs) have been proposed as a promising alternative paradigm for electronic NoC with …

Utilization Of Automated GCC Optimization For Dual-Width Instruction Sets On The ARM Architecture, Shane Watson Dec 2010

Computer Engineering

One of the most important considerations in embedded systems is code size. This consideration is obviously imposed by external factors such as cost and physical space, but what it boils down to is that we want our devices to be as powerful as they can be within a specific (typically limited) form factor. This limits the amount of space we have for memory, and as such we should always be considering the code size of our application and making sure it’s as efficient as possible. We also then need to consider other factors such as performance and power consumption. This is …
