Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 30 of 32

Full-Text Articles in Engineering

Benchmarking Of Embedded Object Detection In Optical And Radar Scenes, Vijaysrinivas Rajagopal Dec 2022

Masters Theses

A portable, real-time vital sign estimation prototype is developed using neural network-based localization, multi-object tracking, and embedded processing optimizations. The system estimates heart and respiration rates of multiple subjects using direction-of-arrival techniques on radar data. This system is useful in many civilian and military applications, including search and rescue.

The primary contribution of this work is the implementation and benchmarking of neural networks for real-time detection and localization on various systems, including the testing of eight neural networks on a discrete GPU and Jetson Xavier devices. Mean average precision (mAP) and inference speed benchmarks were performed. …


Profile-Guided Data Management For Heterogeneous Memory Systems, Matthew B. Olson Dec 2021

Doctoral Dissertations

Market forces and technological constraints have led to a gap between CPU and memory performance that has widened for decades. While processor scaling has plateaued in recent years, this gap persists and is not expected to diminish for the foreseeable future. This discrepancy presents a host of challenges for scaling application performance, which have only been exacerbated in recent years, as increasing demands for fast and effective data analytics are driving memory energy, bandwidth, and capacity requirements to new heights.

To address these trends, hardware architects have introduced a plethora of memory technologies. For example, most modern memory systems include …


Toward Reliable And Efficient Message Passing Software For HPC Systems: Fault Tolerance And Vector Extension, Dong Zhong Aug 2021

Doctoral Dissertations

As the scale of high-performance computing (HPC) systems continues to grow, researchers have devoted themselves to achieving the best performance for long-running computing jobs on these systems. My research focuses on the reliability and efficiency of HPC software.

First, as systems become larger, the mean time to failure (MTTF) of these HPC systems tends to decrease, and handling system failures becomes a prime challenge. My research aims to present a general design and implementation of an efficient runtime-level failure detection and propagation strategy that targets large-scale, dynamic systems and can detect both node and process failures. Using multiple overlapping …


Human Fatigue Predictions In Complex Aviation Crew Operational Impact Conditions, Suresh Rangan May 2021

Doctoral Dissertations

In the last decade, several regulatory frameworks across the world, in all modes of transportation, have brought fatigue and its risk management in operations to the forefront. Of all transportation modes, air travel has been the safest. Still, as part of continuous improvement efforts, regulators are insisting that operators adopt strong fatigue science and its foundational principles to reinforce safety risk assessment and management. Fatigue risk management is a data-driven system that finds a realistic balance between safety and productivity in an organization. This work discusses the effects of mathematical modeling of fatigue and its …


A Secure Architecture For Defense Against Return Address Corruption, Grayson J. Bruner May 2021

Masters Theses

The advent of the Internet of Things has brought about a staggering level of interconnectivity between common devices used every day. Unfortunately, security is not a high priority for developers designing these IoT devices. Often, security comes at too high a cost in other areas, such as performance or power consumption. This is especially true of resource-constrained devices, which make up a large share of IoT devices. However, a lack of security could lead to a cascade of security breaches rippling through connected devices. One of the most common attacks used by hackers is return …


Automated Program Profiling And Analysis For Managing Heterogeneous Memory Systems, Adam Palmer Howard Dec 2017

Masters Theses

Many promising memory technologies, such as non-volatile, storage-class memories and high-bandwidth, on-chip RAMs, are beginning to emerge. Since each of these new technologies presents tradeoffs distinct from conventional DRAMs, next-generation systems are likely to include multiple tiers of memory storage, each with its own type of device. To efficiently utilize the available hardware, such systems will need to alter their data management strategies to consider the performance and capabilities provided by each tier.

This work explores a variety of cross-layer strategies for managing application data in heterogeneous memory systems. We propose different program profiling-based techniques to automatically partition program allocation …


Wide-Area Measurement-Driven Approaches For Power System Modeling And Analytics, Hesen Liu Aug 2017

Doctoral Dissertations

This dissertation presents wide-area measurement-driven approaches for power system modeling and analytics. Accurate power system dynamic models are the very basis of power system analysis, control, and operation. Meanwhile, phasor measurement data provide first-hand knowledge of power system dynamic behaviors. The idea of building out innovative applications with synchrophasor data is promising.

Taking advantage of real-time wide-area measurements, one novel application of phasor measurements is to develop a synchrophasor-based auto-regressive with exogenous inputs (ARX) model that can be updated online to estimate or predict system dynamic responses.

Furthermore, since auto-regressive models form a large family, the ARX model …


Tiled DANNA: Dynamic Adaptive Neural Network Array Scaled Across Multiple Chips, Patricia Jean Eckhart Aug 2017

Masters Theses

Tiled Dynamic Adaptive Neural Network Array (Tiled DANNA) is a recurrent spiking neural network structure composed of programmable, biologically inspired neurons and synapses that scales across multiple FPGA chips. Fire events that occur on and within DANNA initiate spiking behaviors in the programmable elements, allowing DANNA to hold memory through synaptic charge propagation and neuronal charge accumulation. DANNA is a fully digital neuromorphic computing structure based on the NIDA architecture. To support initial prototyping and testing of the Tiled DANNA, multiple Xilinx Virtex 7 690Ts were leveraged. The primary goal of Tiled DANNA is to support scaling of DANNA neural …


Scalable High-Speed Communications For Neuromorphic Systems, Aaron Reed Young Aug 2017

Masters Theses

Field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and other chip/multi-chip level implementations can be used to implement Dynamic Adaptive Neural Network Arrays (DANNA). In some applications, DANNA interfaces with a traditional computing system to provide neural network configuration information, provide network input, process network outputs, and monitor the state of the network. The present host-to-DANNA network communication setup uses a Cypress USB 3.0 peripheral controller (FX3) to enable host-to-array communication over USB 3.0. This communications setup has to run commands in batches and does not have enough bandwidth to meet the maximum throughput requirements of the DANNA device, resulting …


Optimization Of Spatial Convolution In ConvNets On Intel KNL, Sangamesh Nagashattappa Ragate May 2017

Masters Theses

Most experts admit that the true behavior of a neural network is hard to predict. It is nearly impossible to prove deterministically how a neural network works as its architecture grows, yet a well-engineered network can be applied to solve highly abstract problems, such as image recognition, with substantial accuracy. An enormous amount of training of a considerably large and complex neural network is required to understand its behavior and iteratively improve its accuracy in solving a given problem. Deep Neural Networks, which are fairly popular nowadays, deal …


Context-Sensitive Auto-Sanitization For PHP, Jared M. Smith, Richard J. Connor, David P. Cunningham, Kyle G. Bashour, Walter T. Work Dec 2016

Chancellor’s Honors Program Projects

No abstract provided.


Achieving High Reliability And Efficiency In Maintaining Large-Scale Storage Systems Through Optimal Resource Provisioning And Data Placement, Lipeng Wan Aug 2016

Doctoral Dissertations

With the explosive increase in the amount of data being generated by various applications, large-scale distributed and parallel storage systems have become common data storage solutions and have been widely deployed in both industry and academia. While these high-performance storage systems significantly accelerate data storage and retrieval, they also raise critical issues in system maintenance and management. In this dissertation, I propose three methodologies to address three of these critical issues.

First, I develop an optimal resource management and spare provisioning model to minimize the impact brought by component failures and ensure a highly operational experience …


An Application Of The Universal Verification Methodology, Rui Ma Aug 2016

Masters Theses

The Universal Verification Methodology (UVM) package is an open-source SystemVerilog library used to set up a class-based hierarchical testbench. UVM testbenches improve the reusability of Verilog testbenches. Direct Memory Access (DMA) plays an important role in modern computer architecture. When using DMA to transfer data between a host machine and a field-programmable gate array (FPGA) accelerator, a modularized DMA core on the FPGA frees the host-side Central Processing Unit (CPU) during the transfer, helps to save FPGA resources, and enhances performance. Verifying the functionality of a DMA core is essential before mapping it to the FPGA. In this thesis, …


The Design And Validation Of A Wireless Bat-Mounted Sonar Recording System, Jeremy Joseph Langford Aug 2016

Masters Theses

Scientists studying the behavior of bats monitor their echolocation calls, as these calls are important for navigation and feeding, but scientists are typically restricted to ground-based recording. Recording echolocation calls from the back of the bat, as opposed to the ground, offers the opportunity to study bat echolocation from a vantage point otherwise available only to the bats themselves. However, designing a bat-mounted, in-flight audio recording system (bat-tag) capable of recording the ultrasound used in bat echolocation presents a unique set of challenges. Chiefly, the bat-tag must be sufficiently lightweight so as not to overburden the bat, …


Arithmetic Logic Unit Architectures With Dynamically Defined Precision, Getao Liang Dec 2015

Doctoral Dissertations

Modern central processing units (CPUs) employ arithmetic logic units (ALUs) that support statically defined precisions, often adhering to industry standards. Although CPU manufacturers highly optimize their ALUs, industry standard precisions embody accuracy and performance compromises for general purpose deployment. Hence, optimizing ALU precision holds great potential for improving speed and energy efficiency. Previous research on multiple precision ALUs focused on predefined, static precisions. Little previous work addressed ALU architectures with customized, dynamically defined precision. This dissertation presents approaches for developing dynamic precision ALU architectures for both fixed-point and floating-point to enable better performance, energy efficiency, and numeric accuracy. These new …


Dividing And Conquering Meshes Within The NIST Fire Dynamics Simulator (FDS) On Multicore Computing Systems, Donald Charles Collins Dec 2015

Masters Theses

The National Institute of Standards and Technology (NIST) Fire Dynamics Simulator (FDS) provides a computational fluid dynamics model of a fire, which can be visualized by using NIST Smokeview (SMV). Users must create a configuration file (*.fds) that describes the environment and other characteristics of the fire scene so that the FDS software can produce the output file (*.smv) needed for visualization. The processing can be computationally intensive, often taking between several minutes and several hours to complete. In many cases, a user will create a file that is not optimized for a multicore computing system. By dividing meshes within the …


Implementation Of A Neuromorphic Development Platform With DANNA, Jason Yen-Shen Chan Dec 2015

Masters Theses

Neuromorphic computing is the use of artificial neural networks to solve complex problems. Interest in this specialized computing field has grown over the past few years. Specialized hardware that functions as a neural network can be used to solve specific problems unsuited to traditional computing architectures, such as pattern classification and image recognition. However, these hardware platforms have static neural network structures; each is limited to a specific application and cannot be used for other tasks. In this paper, the feasibility of a development platform utilizing a dynamic artificial neural network for researchers is discussed.


Middleware And Services For Dynamic Adaptive Neural Network Arrays, Joshua Caleb Willis Aug 2015

Masters Theses

Dynamic Adaptive Neural Network Arrays (DANNAs) are neuromorphic systems that exhibit spiking behaviors and can be designed using evolutionary optimization. Array elements are rapidly reconfigurable and can function as either neurons or synapses with programmable interconnections and parameters. Visualization applications can examine DANNA element connections, parameters, and functionality, and evolutionary optimization applications can utilize DANNA to speed up neural network simulations. To facilitate interactions with DANNAs from these applications, we have developed a language-agnostic application programming interface (API) that abstracts away the low-level details of communicating with a DANNA and provides a high-level interface for reprogramming and controlling a DANNA. The library has …


A Lean Information Management Model For Efficient Operations Of An Educational Entity At The University Of Tennessee, Harshitha Muppaneni Aug 2014

Masters Theses

A software-based Management Information System (MIS) is designed and implemented in the Department of Industrial and Systems Engineering at the University of Tennessee to handle different types of data requests that are currently processed through multiple steps. This thesis addresses the current resource-intensive data management model in educational institutions and proposes a decentralized and customized solution. The proposed software-based data management system provides information to authorized sources in the requested format with minimal or no delay. The new system's impact is quantified by comparing it with the current data management process using Graph Theoretic Approach …


Ecocar2 Center Stack Development, Westley Logan Harris, Chris Winstead, Nicholas Alexander Cavopol, William Willie Wells, Tate Glick Hawkersmith May 2014

Chancellor’s Honors Program Projects

No abstract provided.


A Privacy-Aware Distributed Storage And Replication Middleware For Heterogeneous Computing Platform, Jilong Liao Dec 2013

Masters Theses

Cloud computing is an emerging research area that has drawn considerable interest in recent years. However, the current infrastructure raises significant concerns about how to protect users' privacy, in part because users store their data on cloud vendors' servers. In this paper, we address this challenge by proposing and implementing a novel middleware, called Uno, which separates the storage of physical data from that of its associated metadata. In our design, users' physical data are stored locally on devices under a user's full control, while their metadata can be uploaded to the commercial cloud. To ensure the …


A Secure Reconfigurable System-On-Programmable-Chip Computer System, William Herbert Collins Aug 2013

Masters Theses

A System-on-Programmable-Chip (SoPC) architecture is designed to meet two goals: to provide a role-based secure computing environment and to allow for user reconfiguration. To accomplish this, a secure root of trust is derived from a fixed architectural subsystem, known as the Security Controller. It additionally provides a dynamically configurable single point of access between applications developed by users and the objects those applications use. The platform provides a model for secrecy such that physical recovery of any one component in isolation does not compromise the system. Dual-factor authentication is used to verify users. A model is also provided for tamper …


Programming Dense Linear Algebra Kernels On Vectorized Architectures, Jonathan Lawrence Peyton May 2013

Masters Theses

The high performance computing (HPC) community is obsessed with the general matrix-matrix multiply (GEMM) routine. This obsession is not without reason. Most, if not all, Level 3 Basic Linear Algebra Subprograms (BLAS) can be written in terms of GEMM, and the performance of many higher-level linear algebra solvers (e.g., LU, Cholesky) depends on GEMM's performance. Getting high performance on GEMM is highly architecture dependent, so for each new architecture that comes out, GEMM has to be programmed and tested to achieve maximal performance. Also, with emergent computer architectures featuring more vector-based and multi- to many-core processors, GEMM performance …


Exploring Computational Chemistry On Emerging Architectures, David Dewayne Jenkins Dec 2012

Doctoral Dissertations

Emerging architectures, such as next generation microprocessors, graphics processing units, and Intel MIC cards, are being used with increased popularity in high performance computing. Each of these architectures has advantages over previous generations of architectures including performance, programmability, and power efficiency. With the ever-increasing performance of these architectures, scientific computing applications are able to attack larger, more complicated problems. However, since applications perform differently on each of the architectures, it is difficult to determine the best tool for the job. This dissertation makes the following contributions to computer engineering and computational science. First, this work implements the computational chemistry variational …


Parallel For Loops On Heterogeneous Resources, Frederick Edward Weber Dec 2012

Doctoral Dissertations

In recent years, Graphics Processing Units (GPUs) have piqued the interest of researchers in scientific computing. Their immense floating point throughput and massive parallelism make them ideal for not just graphical applications, but many general algorithms as well. Load balancing applications and taking advantage of all computational resources in a machine is a difficult challenge, especially when the resources are heterogeneous. This dissertation presents the clUtil library, which vastly simplifies developing OpenCL applications for heterogeneous systems. The core focus of this dissertation lies in clUtil's ParallelFor construct and our novel PINA scheduler which can efficiently load balance work onto multiple …


Kernel-Assisted And Topology-Aware MPI Collective Communication Among Multicore Or Many-Core Clusters, Teng Ma Dec 2012

Doctoral Dissertations

Multicore or many-core clusters have become the most prominent form of High Performance Computing (HPC) systems. Hardware complexity and hierarchies exist not only in the inter-node layer, i.e., hierarchical networks, but also within multicore compute nodes, e.g., Non-Uniform Memory Access (NUMA), network-style interconnects, and memory and shared cache hierarchies.

Message Passing Interface (MPI), the most widely adopted standard in the HPC community, suffers from decreased performance and portability due to increased hardware complexity at multiple levels. We identified three critical issues specific to collective communication: The first problem arises from the gap between logical collective topologies and …


Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim Yarkhan Dec 2012

Doctoral Dissertations

Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. However, standard programming techniques are not well adapted to this change in computer architecture design.

In this work, we study the use of dynamic runtime environments executing data-driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability, and performance. We demonstrate productivity by …


Power Management For GPU-CPU Heterogeneous Systems, Xue Li Dec 2011

Masters Theses

In recent years, GPU-CPU heterogeneous architectures have been increasingly adopted in high performance computing because they provide high computational throughput. However, current research focuses mainly on the performance aspects of GPU-CPU architectures, while improving the energy efficiency of such systems receives much less attention. A few existing efforts try to lower the energy consumption of GPU-CPU architectures, but they address either the GPU or the CPU in isolation and thus cannot maximize energy savings. In this paper, we propose GreenGPU, a holistic energy management framework for GPU-CPU heterogeneous architectures. Our solution features a two-tier …


Performance Controlled Power Optimization For Virtualized Internet Datacenters, Yefu Wang Aug 2011

Doctoral Dissertations

Modern data centers must provide performance assurance for complex system software such as web applications. In addition, the power consumption of data centers needs to be minimized to reduce operating costs and avoid system overheating. In recent years, more and more data centers have started to adopt server virtualization strategies for resource sharing, reducing hardware and operating costs by consolidating applications previously running on multiple physical servers onto a single physical server. In this dissertation, several power-efficient algorithms are proposed to effectively reduce server power consumption while achieving the required application-level performance for virtualized servers.

First, at the server …


Adaptive Performance And Power Management In Distributed Computing Systems, Ming Chen Aug 2010

Doctoral Dissertations

The complexity of distributed computing systems has raised two unprecedented challenges for system management. First, customers must be assured that their service-level agreements, such as response time and throughput, are met. Second, system power consumption must be controlled in order to avoid system failures caused by power capacity overload or system overheating due to increasingly high server density. However, most existing work, unfortunately, either relies on open-loop estimations based on off-line profiled system models, or evolves in a more ad hoc fashion, which requires exhaustive iterations of tuning and testing, or oversimplifies the problem by ignoring the coupling …