Open Access. Powered by Scholars. Published by Universities.®

Computer and Systems Architecture Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 18 of 18

Full-Text Articles in Computer and Systems Architecture

Benchmarking Of Embedded Object Detection In Optical And Radar Scenes, Vijaysrinivas Rajagopal Dec 2022

Benchmarking Of Embedded Object Detection In Optical And Radar Scenes, Vijaysrinivas Rajagopal

Masters Theses

A portable, real-time vital sign estimation protoype is developed using neural network- based localization, multi-object tracking, and embedded processing optimizations. The system estimates heart and respiration rates of multiple subjects using directional of arrival techniques on RADAR data. This system is useful in many civilian and military applications including search and rescue.

The primary contribution from this work is the implementation and benchmarking of neural networks for real time detection and localization on various systems including the testing of eight neural networks on a discrete GPU and Jetson Xavier devices. Mean average precision (mAP) and inference speed benchmarks were performed. …


A Secure Architecture For Defense Against Return Address Corruption, Grayson J. Bruner May 2021

A Secure Architecture For Defense Against Return Address Corruption, Grayson J. Bruner

Masters Theses

The advent of the Internet of Things has brought about a staggering level of inter-connectivity between common devices used every day. Unfortunately, security is not a high priority for developers designing these IoT devices. Often times the trade-off of security comes at too high of a cost in other areas, such as performance or power consumption. This is especially prevalent in resource-constrained devices, which make up a large number of IoT devices. However, a lack of security could lead to a cascade of security breaches rippling through connected devices. One of the most common attacks used by hackers is return …


Automated Program Profiling And Analysis For Managing Heterogeneous Memory Systems, Adam Palmer Howard Dec 2017

Automated Program Profiling And Analysis For Managing Heterogeneous Memory Systems, Adam Palmer Howard

Masters Theses

Many promising memory technologies, such as non-volatile, storage-class memories and high-bandwidth, on-chip RAMs, are beginning to emerge. Since each of these new technologies present tradeoffs distinct from conventional DRAMs, next-generation systems are likely to include multiple tiers of memory storage, each with their own type of devices. To efficiently utilize the available hardware, such systems will need to alter their data management strategies to consider the performance and capabilities provided by each tier.

This work explores a variety of cross-layer strategies for managing application data in heterogeneous memory systems. We propose different program profiling-based techniques to automatically partition program allocation …


Tiled Danna: Dynamic Adaptive Neural Network Array Scaled Across Multiple Chips, Patricia Jean Eckhart Aug 2017

Tiled Danna: Dynamic Adaptive Neural Network Array Scaled Across Multiple Chips, Patricia Jean Eckhart

Masters Theses

Tiled Dynamic Adaptive Neural Network Array(Tiled DANNA) is a recurrent spiking neural network structure composed of programmable biologically inspired neurons and synapses that scales across multiple FPGA chips. Fire events that occur on and within DANNA initiate spiking behaviors in the programmable elements allowing DANNA to hold memory through the synaptic charge propagation and neuronal charge accumulation. DANNA is a fully digital neuromorphic computing structure based on the NIDA architecture. To support initial prototyping and testing of the Tiled DANNA, multiple Xilinx Virtex 7 690Ts were leveraged. The primary goal of Tiled DANNA is to support scaling of DANNA neural …


Scalable High-Speed Communications For Neuromorphic Systems, Aaron Reed Young Aug 2017

Scalable High-Speed Communications For Neuromorphic Systems, Aaron Reed Young

Masters Theses

Field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), and other chip/multi-chip level implementations can be used to implement Dynamic Adaptive Neural Network Arrays (DANNA). In some applications, DANNA interfaces with a traditional computing system to provide neural network configuration information, provide network input, process network outputs, and monitor the state of the network. The present host-to-DANNA network communication setup uses a Cypress USB 3.0 peripheral controller (FX3) to enable host-to-array communication over USB 3.0. This communications setup has to run commands in batches and does not have enough bandwidth to meet the maximum throughput requirements of the DANNA device, resulting …


Optimization Of Spatial Convolution In Convnets On Intel Knl, Sangamesh Nagashattappa Ragate May 2017

Optimization Of Spatial Convolution In Convnets On Intel Knl, Sangamesh Nagashattappa Ragate

Masters Theses

Most of the experts admit that the true behavior of the neural network is hard to predict. It is quite impossible to deterministically prove the working of the neural network as the architecture gets bigger, yet, it is observed that it is possible to apply a well engineered network to solve one of the most abstract problems like image recognition with substantial accuracy. It requires enormous amount of training of a considerably big and complex neural network to understand its behavior and iteratively improve its accuracy in solving a certain problem. Deep Neural Networks, which are fairly popular nowadays deal …


The Design And Validation Of A Wireless Bat-Mounted Sonar Recording System, Jeremy Joseph Langford Aug 2016

The Design And Validation Of A Wireless Bat-Mounted Sonar Recording System, Jeremy Joseph Langford

Masters Theses

Scientists studying the behavior of bats monitor their echolocation calls, as their calls are important for navigation and feeding, but scientist are typically restricted to ground-based recording. Recording bat calls used for echolocation from the back of the bat as opposed to the ground offers the opportunity to study bat echolocation from a vantage otherwise only offered to the bats themselves. However, designing a bat mounted in-flight audio recording system, (bat-tag), capable of recording the ultra-sound used in bat echolocation presents a unique set of challenges. Chiefly, the bat-tag must be sufficiently light weight as to not overburden the bat, …


An Application Of The Universal Verification Methodology, Rui Ma Aug 2016

An Application Of The Universal Verification Methodology, Rui Ma

Masters Theses

The Universal Verification Methodology (UVM) package is an open-source SystemVerilog library, which is used to set up a class-based hierarchical testbench. UVM testbenches improve the reusability of Verilog testbenches. Direct Memory Access (DMA) plays an important role in modern computer architecture. When using DMA to transfer data between a host machine and field-programmable gate array (FPGA) accelerator, a modularized DMA core on the FPGA frees the host side Central Processing Unit(CPU) during the transfer, helps to save FPGA resources, and enhances performance. Verifying the functionality of a DMA core is essential before mapping it to the FPGA. In this thesis, …


Dividing And Conquering Meshes Within The Nist Fire Dynamics Simulator (Fds) On Multicore Computing Systems, Donald Charles Collins Dec 2015

Dividing And Conquering Meshes Within The Nist Fire Dynamics Simulator (Fds) On Multicore Computing Systems, Donald Charles Collins

Masters Theses

The National Institute for Standards and Technology (NIST) Fire Dynamics Simulator (FDS) provides a computational fluid dynamics model of a fire, which can be visualized by using NIST Smokeview (SMV). Users must create a configuration file (*.fds) that describes the environment and other characteristics of the fire scene so that the FDS software can produce the output file (*.smv) needed for visualization.The processing can be computationally intensive, often taking between several minutes and several hours to complete. In many cases, a user will create a file that is not optimized for a multicore computing system. By dividing meshes within the …


Implementation Of A Neuromorphic Development Platform With Danna, Jason Yen-Shen Chan Dec 2015

Implementation Of A Neuromorphic Development Platform With Danna, Jason Yen-Shen Chan

Masters Theses

Neuromorphic computing is the use of artificial neural networks to solve complex problems. The specialized computing field has been growing in interest during the past few years. Specialized hardware that function as neural networks can be utilized to solve specific problems unsuited for traditional computing architectures such as pattern classification and image recognition. However, these hardware platforms have neural network structures that are static, being limited to only perform a specific application, and cannot be used for other tasks. In this paper, the feasibility of a development platform utilizing a dynamic artificial neural network for researchers is discussed.


Middleware And Services For Dynamic Adaptive Neural Network Arrays, Joshua Caleb Willis Aug 2015

Middleware And Services For Dynamic Adaptive Neural Network Arrays, Joshua Caleb Willis

Masters Theses

Dynamic Adaptive Neural Network Arrays (DANNAs) are neuromorphic systems that exhibit spiking behaviors and can be designed using evolutionary optimization. Array elements are rapidly reconfigurable and can function as either neurons or synapses with programmable interconnections and parameters. Visualization applications can examine DANNA element connections, parameters, and functionality, and evolutionary optimization applications can utilize DANNA to speedup neural network simulations. To facilitate interactions with DANNAs from these applications, we have developed a language-agnostic application programming interface (API) that abstracts away low-level communication details with a DANNA and provides a high-level interface for reprogramming and controlling a DANNA. The library has …


A Lean Information Management Model For Efficient Operations Of An Educational Entity At The University Of Tennessee, Harshitha Muppaneni Aug 2014

A Lean Information Management Model For Efficient Operations Of An Educational Entity At The University Of Tennessee, Harshitha Muppaneni

Masters Theses

A software based Management Information System (MIS) is designed and implemented in the Department of Industrial and Systems Engineering at University of Tennessee to handle different types of data requests that are currently processed through multiple steps. This thesis addresses the current resource intensive data management model in educational institutions and proposes a decentralized and customized solution. The proposed software based data management system provides information to authorized sources in the requested format with minimal or no time consumption. The quantification of the new systems’ impact is done by comparing it with current data management process using Graph Theoretic Approach …


A Privacy-Aware Distributed Storage And Replication Middleware For Heterogeneous Computing Platform, Jilong Liao Dec 2013

A Privacy-Aware Distributed Storage And Replication Middleware For Heterogeneous Computing Platform, Jilong Liao

Masters Theses

Cloud computing is an emerging research area that has drawn considerable interest in recent years. However, the current infrastructure raises significant concerns about how to protect users' privacy, in part due to that users are storing their data in the cloud vendors' servers. In this paper, we address this challenge by proposing and implementing a novel middleware, called Uno, which separates the storage of physical data and their associated metadata. In our design, users' physical data are stored locally on those devices under a user's full control, while their metadata can be uploaded to the commercial cloud. To ensure the …


A Secure Reconfigurable System-On-Programmable-Chip Computer System, William Herbert Collins Aug 2013

A Secure Reconfigurable System-On-Programmable-Chip Computer System, William Herbert Collins

Masters Theses

A System-on-Programmable-Chip (SoPC) architecture is designed to meet two goals: to provide a role-based secure computing environment and to allow for user reconfiguration. To accomplish this, a secure root of trust is derived from a fixed architectural subsystem, known as the Security Controller. It additionally provides a dynamically configurable single point of access between applications developed by users and the objects those applications use. The platform provides a model for secrecy such that physical recovery of any one component in isolation does not compromise the system. Dual-factor authentication is used to verify users. A model is also provided for tamper …


Programming Dense Linear Algebra Kernels On Vectorized Architectures, Jonathan Lawrence Peyton May 2013

Programming Dense Linear Algebra Kernels On Vectorized Architectures, Jonathan Lawrence Peyton

Masters Theses

The high performance computing (HPC) community is obsessed over the general matrix-matrix multiply (GEMM) routine. This obsession is not without reason. Most, if not all, Level 3 Basic Linear Algebra Subroutines (BLAS) can be written in terms of GEMM, and many of the higher level linear algebra solvers' (i.e., LU, Cholesky) performance depend on GEMM's performance. Getting high performance on GEMM is highly architecture dependent, and so for each new architecture that comes out, GEMM has to be programmed and tested to achieve maximal performance. Also, with emergent computer architectures featuring more vector-based and multi to many-core processors, GEMM performance …


Power Management For Gpu-Cpu Heterogeneous Systems, Xue Li Dec 2011

Power Management For Gpu-Cpu Heterogeneous Systems, Xue Li

Masters Theses

In recent years, GPU-CPU heterogeneous architectures have been increasingly adopted in high performance computing, because of their capabilities of providing high computational throughput. However, current research focuses mainly on the performance aspects of GPU-CPU architectures, while improving the energy efficiency of such systems receives much less attention. There are few existing efforts that try to lower the energy consumption of GPU-CPU architectures, but they address either GPU or CPU in an isolated manner and thus cannot achieve maximized energy savings. In this paper, we propose GreenGPU, a holistic energy management framework for GPU-CPU heterogeneous architectures. Our solution features a two-tier …


Cyber Profiling For Insider Threat Detection, Akaninyene Walter Udoeyop Aug 2010

Cyber Profiling For Insider Threat Detection, Akaninyene Walter Udoeyop

Masters Theses

Cyber attacks against companies and organizations can result in high impact losses that include damaged credibility, exposed vulnerability, and financial losses. Until the 21st century, insiders were often overlooked as suspects for these attacks. The 2010 CERT Cyber Security Watch Survey attributes 26 percent of cyber crimes to insiders. Numerous real insider attack scenarios suggest that during, or directly before the attack, the insider begins to behave abnormally. We introduce a method to detect abnormal behavior by profiling users. We utilize the k-means and kernel density estimation algorithms to learn a user’s normal behavior and establish normal user profiles based …


Performance Evaluation Of Memory And Computationally Bound Chemistry Applications On Streaming Gpgpus And Multi-Core X86 Cpus, Frederick E. Weber Iii May 2010

Performance Evaluation Of Memory And Computationally Bound Chemistry Applications On Streaming Gpgpus And Multi-Core X86 Cpus, Frederick E. Weber Iii

Masters Theses

In recent years, multi-core processors have come to dominate the field in desktop and high performance computing. Graphics processors traditionally used in CAD, video games, and other 3-d applications, have become more programmable and are now suitable for general purpose computing. This thesis explores multi-core processors and GPU performance and limitations in two computational chemistry applications: a memory bound component of ab-initio modeling and a computationally bound Monte Carlo simulation. For the applications presented in this thesis, exploiting multiple processors is done using a variety of tools and languages including OpenMP and MKL. Brook+ and the Compute Abstraction Layer streaming …