Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Computer Architecture

Articles 1 - 13 of 13

Full-Text Articles in Computer Engineering

A Federation Of Sentries: Secure And Efficient Trusted Hardware Element Communication, Blake A. Ward Jun 2024

Master's Theses

Previous work introduced TrustGuard, a design for a containment architecture that allows only the result of the correct execution of approved software to be outputted. A containment architecture prevents results from malicious hardware or software from being communicated externally. At the core of TrustGuard is a trusted, pluggable device that sits on the path between an untrusted processor and the outside world. This device, called the Sentry, is responsible for validating the correctness of all communication before it leaves the system. This thesis seeks to leverage the correctness guarantees that the Sentry provides to enable efficient secure communication between two …


Accelerating Graphics Rendering On Risc-V Gpus, Joshua Simpson Jun 2022

Master's Theses

Graphics Processing Units (GPUs) are commonly used to accelerate massively parallel workloads across a wide range of applications, from machine learning to cryptocurrency mining. The original application for GPUs, however, was to accelerate graphics rendering, which remains popular today through video gaming and video rendering. While GPUs began as fixed-function hardware with minimal programmability, modern GPUs have adopted a design with many programmable cores supported by fixed-function hardware for rasterization, texture sampling, and render output tasks. This balance enables GPUs to be used for general-purpose computing while remaining adept at graphics rendering. Previous work at the …
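To give a concrete flavor of the rasterization task handled by the fixed-function hardware mentioned above, here is a minimal, self-contained sketch (my own illustration, not code from the thesis) of the edge-function coverage test that rasterizers commonly use to decide whether a pixel lies inside a triangle.

/* Illustrative sketch (not from the thesis): the edge-function test used by
 * fixed-function rasterizers.  A pixel is covered when it lies on the same
 * side of all three triangle edges. */
#include <stdio.h>

/* Signed area of the parallelogram spanned by (b-a) and (p-a);
 * positive when p is to the left of the directed edge a->b. */
static float edge(float ax, float ay, float bx, float by, float px, float py) {
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
}

static int covered(float x0, float y0, float x1, float y1,
                   float x2, float y2, float px, float py) {
    float w0 = edge(x1, y1, x2, y2, px, py);
    float w1 = edge(x2, y2, x0, y0, px, py);
    float w2 = edge(x0, y0, x1, y1, px, py);
    return (w0 >= 0 && w1 >= 0 && w2 >= 0);   /* counter-clockwise triangle */
}

int main(void) {
    /* Triangle (0,0)-(4,0)-(0,4); the pixel at (1,1) lies inside it. */
    printf("%d\n", covered(0, 0, 4, 0, 0, 4, 1.0f, 1.0f));
    return 0;
}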


Hardware For Quantized Mixed-Precision Deep Neural Networks, Andres Rios Aug 2021

Open Access Theses & Dissertations

Recently, there has been a push to perform deep learning (DL) computations on the edge rather than the cloud due to latency, network connectivity, energy consumption, and privacy issues. However, state-of-the-art deep neural networks (DNNs) require vast amounts of computational power, data, and energy: resources that are limited on edge devices. This limitation has brought the need to design domain-specific architectures (DSAs) that implement DL-specific hardware optimizations. Traditionally, DNNs have run on 32-bit floating-point numbers; however, a body of research has shown that DNNs are surprisingly robust and do not require all 32 bits. Instead, using quantization, networks can run on …
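As a rough illustration of the quantization idea described in this abstract (a minimal sketch under my own assumptions, not the thesis's mixed-precision scheme), the following maps 32-bit floating-point weights to 8-bit integers with a single per-tensor scale and then dequantizes them to show the rounding error.

/* Minimal sketch of symmetric 8-bit quantization (illustrative only): scale
 * so that the largest magnitude maps to 127, round to int8, then map back. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    float w[4] = {0.31f, -1.20f, 0.05f, 0.87f};   /* example weights */

    /* Per-tensor scale: largest magnitude maps to 127. */
    float max_abs = 0.0f;
    for (int i = 0; i < 4; i++)
        if (fabsf(w[i]) > max_abs) max_abs = fabsf(w[i]);
    float scale = max_abs / 127.0f;

    for (int i = 0; i < 4; i++) {
        int8_t q = (int8_t)lroundf(w[i] / scale);  /* quantize */
        float back = q * scale;                    /* dequantize */
        printf("%+.4f -> %4d -> %+.4f\n", w[i], q, back);
    }
    return 0;
}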


Energy Efficient Computing Using Scalable General Purpose Analog Processors, Ethan Paul Palisoc De Guzman Jun 2021

Master's Theses

Due to fundamental physical limitations, conventional digital circuits have not been able to scale at the pace expected from Moore’s law. In addition, computationally intensive applications such as neural networks and computer vision demand large amounts of energy from digital circuits. As a result, energy-efficient alternatives are needed to provide continued performance scaling. Analog circuits have well-known benefits: they can store more information on a single wire and efficiently perform mathematical operations such as addition, subtraction, and differential-equation solving. However, analog computing also comes with drawbacks, such as its sensitivity to process variation …


Dynamic Dependency Collapsing, Görkem Aşılıoğlu Jan 2020

Dissertations, Master's Theses and Master's Reports

In this dissertation, we explore the concept of dynamic dependency collapsing. When the clock speed is fixed, performance increases in computer architecture come from exploiting additional parallelism. We show that further improvements are possible even when the available parallelism in programs is exhausted. This improvement comes from executing in parallel instructions that would ordinarily have been serialized; we call this concept dependency collapsing. We explore existing techniques that exploit parallelism and show which of them fall under the umbrella of dependency collapsing. We then introduce two dependency collapsing techniques of our own. The first technique …
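As a minimal illustration of what collapsing a dependency means (my own sketch, not the dissertation's hardware mechanism), consider two dependent additions that a conventional pipeline must serialize; fusing them into a single three-input operation removes that serialization.

/* Illustrative sketch only (not the dissertation's mechanism): the second add
 * below depends on the first, so a conventional core executes them back to
 * back.  Hardware that fuses the pair into one three-input add removes the
 * serialization. */
#include <stdio.h>

int main(void) {
    int b = 3, c = 4, e = 5;

    /* Serialized form: the second instruction must wait for 'a'. */
    int a = b + c;
    int d = a + e;

    /* Collapsed form: one three-input operation, no intermediate wait. */
    int d_collapsed = b + c + e;

    printf("%d %d\n", d, d_collapsed);   /* both print 12 */
    return 0;
}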


Dedicated Hardware For Machine/Deep Learning: Domain Specific Architectures, Angel Izael Solis Jan 2019

Open Access Theses & Dissertations

Artificial intelligence has come a very long way from being a mere spectacle on the silver screen in the 1920s [Hml18]. As artificial intelligence continues to evolve and we begin to develop more sophisticated Artificial Neural Networks, the need for specialized, more efficient machines (less computational strain while maintaining the same performance results) becomes increasingly evident. Though these “new” techniques, such as Multilayer Perceptrons, Convolutional Neural Networks, and Recurrent Neural Networks, may seem to be on the cutting edge of technology, many of these ideas are over 60 years old! However, many of these earlier models, at …


Scalable High-Speed Communications For Neuromorphic Systems, Aaron Reed Young Aug 2017

Masters Theses

Field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and other chip- or multi-chip-level implementations can be used to implement Dynamic Adaptive Neural Network Arrays (DANNA). In some applications, DANNA interfaces with a traditional computing system to provide neural network configuration information, provide network input, process network outputs, and monitor the state of the network. The present host-to-DANNA communication setup uses a Cypress USB 3.0 peripheral controller (FX3) to enable host-to-array communication over USB 3.0. This setup has to run commands in batches and does not have enough bandwidth to meet the maximum throughput requirements of the DANNA device, resulting …


Survey Of Branch Prediction, Pipelining, Memory Systems As Related To Computer Architecture, Kristina Landen Apr 2017

Student Works

This paper is a survey of topics introduced in the Computer Engineering course CEC470: Computer Architecture (CEC470). The topics are covered in much greater depth than in CEC470, and the paper also explores concepts not touched on in the course. Topics presented include branch prediction, pipelining, registers, memory, and the operating system, as well as some general design considerations for computer architecture as a whole.

The design considerations explored include a discussion of the different instruction types specific to the ARM Instruction Set Architecture, known as ARM and Thumb, as well as an exploration of …
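To make one of the surveyed topics concrete, here is a small, self-contained sketch of the classic two-bit saturating-counter branch predictor (my own illustration, not material reproduced from the survey).

/* Illustrative sketch of a 2-bit saturating-counter branch predictor (not
 * taken from the survey).  Counter states 0 and 1 predict not-taken; 2 and 3
 * predict taken.  Each counter moves one step toward the actual outcome. */
#include <stdio.h>

#define TABLE_SIZE 1024

static unsigned char counters[TABLE_SIZE];   /* all start at 0: not-taken */

static int predict(unsigned pc) {
    return counters[pc % TABLE_SIZE] >= 2;   /* 1 = predict taken */
}

static void update(unsigned pc, int taken) {
    unsigned char *c = &counters[pc % TABLE_SIZE];
    if (taken  && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
}

int main(void) {
    unsigned pc = 0x400100;                  /* hypothetical branch address */
    int outcomes[] = {1, 1, 1, 0, 1, 1};     /* a mostly-taken loop branch */
    int hits = 0;

    for (int i = 0; i < 6; i++) {
        hits += (predict(pc) == outcomes[i]);
        update(pc, outcomes[i]);
    }
    printf("correct predictions: %d/6\n", hits);
    return 0;
}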


“My Logic Is Undeniable”: Replicating The Brain For Ideal Artificial Intelligence, Samuel C. Adams Apr 2016

Senior Honors Theses

Alan Turing asked if machines can think, but intelligence is more than logic and reason. I ask if a machine can feel pain or joy, have visions and dreams, or paint a masterpiece. The human brain sets the bar high, and despite our progress, artificial intelligence has a long way to go. Studying neurology from a software engineer’s perspective reveals numerous uncanny similarities between the functionality of the brain and that of a computer. If the brain is a biological computer, then it is the embodiment of artificial intelligence beyond anything we have yet achieved, and its architecture is advanced …


Introduction To Mips Assembly Language Programming, Charles W. Kann Jan 2015

Open Educational Resources

This book was written to introduce students to assembly language programming in MIPS. As with all assembly language programming texts, it covers basic operators and instructions, subprogram calling, loading and storing memory, program control, and the conversion of the assembly language program into machine code.

However, this book was not written simply as a book on assembly language programming. The larger purpose of this text is to show how concepts in Higher Level Languages (HLLs), such as Java or C/C++, are represented in assembly. By showing how program constructs from these HLLs map into assembly, the concepts will be easier …
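As a hedged illustration of the HLL-to-assembly mapping the book describes (my own sketch, not an excerpt from the text), here is a simple C loop together with one plausible hand translation to MIPS shown in the trailing comment.

/* Illustrative sketch (not an excerpt from the book): a simple C loop and,
 * in the block comment below, one plausible MIPS translation using the
 * pseudo-instructions a typical MIPS assembler accepts. */
#include <stdio.h>

int main(void) {
    int sum = 0;
    for (int i = 1; i <= 10; i++)
        sum += i;
    printf("%d\n", sum);   /* prints 55 */
    return 0;
}

/* One plausible MIPS translation of the loop:
 *         li   $t0, 0           # sum = 0
 *         li   $t1, 1           # i = 1
 * loop:   bgt  $t1, 10, done    # exit when i > 10
 *         add  $t0, $t0, $t1    # sum += i
 *         addi $t1, $t1, 1      # i++
 *         j    loop
 * done:                         # $t0 now holds 55
 */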


High-Performance, Scalable Optical Network-On-Chip Architectures, Xianfang Tan Aug 2013

UNLV Theses, Dissertations, Professional Papers, and Capstones

The rapid advance of technology enables a large number of processing cores to be integrated into a single chip, called a Chip Multiprocessor (CMP) or a Multiprocessor System-on-Chip (MPSoC). The on-chip interconnection network, which is the communication infrastructure for these processing cores, plays a central role in a many-core system. With the continuously increasing complexity of many-core systems, traditional metallic-wired electronic networks-on-chip (NoCs) have become a bottleneck because of high data-transmission latency and extremely high on-chip energy consumption. Optical networks-on-chip (ONoCs) have been proposed as a promising alternative paradigm to the electronic NoC, with …


Utilization Of Automated Gcc Optimization For Dual-Width Instruction Sets On The Arm Architecture, Shane Watson Dec 2010

Computer Engineering

One of the most important considerations in embedded systems is code size. This consideration is imposed by external factors such as cost and physical space, but it ultimately comes down to wanting our devices to be as powerful as they can be within a specific (and typically limited) form factor. This limits the amount of space available for memory, so we should always consider the code size of our application and make sure it is as efficient as possible. We then also need to consider other factors such as performance and power consumption. This is …
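To make the code-size concern concrete (an illustrative sketch under my own assumptions, not the thesis's experimental setup), the same small C function can be compiled for the 32-bit ARM and 16-bit Thumb instruction sets with standard GCC flags and the resulting object sizes compared.

/* Illustrative sketch (not the thesis's experiment): a small function whose
 * code size can be compared between the 32-bit ARM and 16-bit Thumb
 * instruction sets.  With an ARM cross-compiler, for example:
 *
 *   arm-none-eabi-gcc -c -Os -marm   checksum.c -o checksum_arm.o
 *   arm-none-eabi-gcc -c -Os -mthumb checksum.c -o checksum_thumb.o
 *   arm-none-eabi-size checksum_arm.o checksum_thumb.o
 *
 * The Thumb object is typically the smaller of the two, which is the kind of
 * trade-off the thesis's automated GCC optimization explores. */
#include <stddef.h>
#include <stdint.h>

uint32_t checksum(const uint8_t *buf, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}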


Improved Framework For Fast And Efficient Memory-Based Frame Data Reconfiguration For Multi-Row Spanning Designs On Field Programmable Gate Arrays, Rohan Sreeram May 2010

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Reconfigurable computing is an evolving paradigm in computer architecture where the ability to load different designs onto a field programmable gate array (FPGA) at execution time has proven useful in adapting FPGA prototypes to a wide range of applications. Reconfiguration techniques can be primarily categorized as Partial Dynamic Reconfiguration (PDR) and Partial Bitstream Relocation (PBR). PDR involves reconfiguring a single Partial Reconfiguration Region (PRR) with a partial bitstream, while PBR is targeted at reconfiguring multiple PRRs on the FPGA with a partial bitstream. Previous techniques have primarily focused on using either slower off-chip memory or on-chip memory-based solutions to store …