Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 11 of 11

Full-Text Articles in Engineering

Digital Simulations Of Memristors Towards Integration With Reconfigurable Computing, Ivris Raymond May 2023

Computer Science and Computer Engineering Undergraduate Honors Theses

The end of Moore’s Law has been predicted for decades. Demand for greater parallel computational performance has been driven by advances in machine learning. This past decade has demonstrated the ever-increasing creativity and effort necessary to extract scaling improvements in CMOS fabrication processes. However, CMOS scaling is nearing its fundamental physical limits. A viable path for increasing performance is to break the von Neumann bottleneck. In-memory computing using emerging memory technologies (e.g., ReRAM, STT, MRAM) offers a potential path beyond the end of Moore’s Law. However, there is currently very little support from industry tools for designers wishing to incorporate …


Reconfigurable Technologies For Next Generation Internet And Cluster Computing, Deepak C. Unnikrishnan Sep 2013

Open Access Dissertations

Modern web applications are marked by distinct networking and computing characteristics. As applications evolve, they continue to operate over a large monolithic framework of networking and computing equipment built from general-purpose microprocessors and Application Specific Integrated Circuits (ASICs) that offers few architectural choices. This dissertation presents techniques to diversify the next-generation Internet infrastructure by integrating Field-Programmable Gate Arrays (FPGAs), a class of reconfigurable integrated circuits, with general-purpose microprocessor-based techniques. Specifically, our solutions are demonstrated in the context of two applications: network virtualization and distributed cluster computing.

Network virtualization enables the physical network infrastructure to be shared among several …


A Parameterized Stereo Vision Core For FPGAs, Mark Chang, Stephen Longfield Jul 2012

Mark L. Chang

We present a parameterized stereo vision core suitable for a wide range of FPGA targets and stereo vision applications. By enabling easy tuning of algorithm parameters, our system allows for rapid exploration of the design space and simpler implementation of high-performance stereo vision systems. This implementation utilizes the census transform algorithm to calculate depth information from a pair of images delivered from a simulated stereo camera pair. This work advances our previous work through implementation improvements, a stereo camera pair simulation framework, and a scalable stereo vision core.
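For readers unfamiliar with the census transform mentioned in this abstract, the following is a minimal software sketch, not the authors' FPGA core. It assumes a square window of configurable radius, a "neighbor darker than center" comparison, and Hamming distance as the matching cost; the function names `census_transform` and `hamming_distance` are hypothetical and chosen only for illustration.

```python
def census_transform(image, radius=1):
    """Return a census signature for each interior pixel of a 2-D list.

    Each signature packs one bit per neighbor in the window:
    1 if the neighbor is darker than the center pixel, else 0.
    Border pixels (closer than `radius` to an edge) are left as 0.
    """
    height, width = len(image), len(image[0])
    signatures = [[0] * width for _ in range(height)]
    for r in range(radius, height - radius):
        for c in range(radius, width - radius):
            center = image[r][c]
            bits = 0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    if dr == 0 and dc == 0:
                        continue  # the center pixel is not compared to itself
                    darker = 1 if image[r + dr][c + dc] < center else 0
                    bits = (bits << 1) | darker
            signatures[r][c] = bits
    return signatures


def hamming_distance(a, b):
    """Count differing bits between two census signatures."""
    return bin(a ^ b).count("1")


# Tiny illustrative image (values are made up for the example).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
signatures = census_transform(image, radius=1)
```

In a stereo matcher built this way, a pixel in the left image would be compared against candidate pixels in the right image by the Hamming distance of their signatures, and the candidate with the lowest cost would give the disparity. The window radius is the kind of algorithm parameter the abstract describes as tunable.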


Extending The Hybridthread SMP Model For Distributed Memory Systems, Eugene Anthony Cartwright III May 2012

Graduate Theses and Dissertations

Memory hierarchy is of growing importance in system design today. As Moore's Law allows system designers to include more processors within their designs, data locality becomes a priority. Traditional multiprocessor systems on chip (MPSoC) experience difficulty scaling as the number of processors increases. This difficulty is characteristic of memory accesses in a shared memory environment, where memory bandwidth decreases as the number of processors increases. In order to provide the necessary levels of scalability, the computer architecture community has sought to decentralize memory accesses by distributing memory throughout the system. Distributed memory offers greater bandwidth due to decoupled …


An Adaptive Modular Redundancy Technique To Self-Regulate Availability, Area, And Energy Consumption In Mission-Critical Applications, Rawad N. Al-Haddad Jan 2011

Electronic Theses and Dissertations

As reconfigurable devices' capacities and the complexity of applications that use them increase, the need for self-reliance of deployed systems becomes increasingly prominent. A Sustainable Modular Adaptive Redundancy Technique (SMART) composed of a dual-layered organic system is proposed, analyzed, implemented, and experimentally evaluated. SMART relies upon a variety of self-regulating properties to control availability, energy consumption, and area used in dynamically changing environments that require a high degree of adaptation. The hardware layer is implemented on a Xilinx Virtex-4 Field Programmable Gate Array (FPGA) to provide self-repair using a novel approach called a Reconfigurable Adaptive Redundancy System (RARS). The software layer supervises …


Cost And Performance Modeling Of The Mu-Decoder, Raghavendra Kongari Jan 2011

LSU Master's Theses

In this thesis we study the implementation details of the MU-Decoder, a recently proposed hardware module that has been theoretically shown to be superior to other methods for generating subsets of large sets. Our study confirms this advantage. Specifically, we compare the performance of implementations of the LUT-Decoder (the most common configurable decoder) to the MU-Decoder. We show that while the LUT-Decoder is slightly better than the MU-Decoder for arbitrary (and artificial) inputs, for a large class of inputs of practical significance, called totally ordered subsets, the MU-Decoder is vastly superior in area to the LUT-Decoder. In terms …


Analysis Of Field Programmable Gate Array-Based Kalman Filter Architectures, Arvind Sudarsanam Dec 2010

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A Field Programmable Gate Array (FPGA)-based Polymorphic Faddeev Systolic Array (PolyFSA) architecture is proposed to accelerate an Extended Kalman Filter (EKF) algorithm. A system architecture comprising a software processor as the host processor, a hardware controller, a cache-based memory sub-system, and the proposed PolyFSA as co-processor is presented. The PolyFSA-based system architecture is implemented on the Xilinx Virtex-4 family of FPGAs. Results indicate significant speed-ups for the proposed architecture when compared against a space-based software processor. This dissertation proposes a comprehensive architecture analysis comprising (i) error analysis, (ii) performance analysis, and (iii) area analysis. Results are presented …


Exploiting Matrix Symmetry To Improve FPGA-Accelerated Conjugate Gradient, Jason D. Bakos, Krishna K. Nagar Apr 2009

Faculty Publications

In this paper we describe a new approach for accelerating the Conjugate Gradient (CG) method using an FPGA co-processor. As in previous approaches, our co-processor performs a double-precision sparse matrix-vector multiplication. However, our implementation doubles the amount of computation per unit of input data by exploiting the symmetry of the input matrix and computing the upper and lower triangle of the input matrix in parallel. Using a Virtex-2 Pro 100 FPGA, we have achieved an observed computational throughput of 1155 MFLOPS.
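The symmetry idea in this abstract can be illustrated in software: when only the upper triangle of a symmetric matrix is stored, each off-diagonal entry contributes to two rows of the result, so every value read yields two multiply-accumulates instead of one. The sketch below is a plain-Python illustration under that assumption, not the authors' co-processor design; the coordinate-list storage format and the name `symmetric_spmv` are hypothetical, and the two updates run sequentially here rather than in parallel as the hardware would.

```python
def symmetric_spmv(n, upper_entries, x):
    """Compute y = A @ x for a symmetric n-by-n matrix A.

    `upper_entries` is a list of (row, col, value) tuples covering only
    the entries with row <= col; the lower triangle is implied by symmetry.
    """
    y = [0.0] * n
    for row, col, value in upper_entries:
        # Contribution of A[row][col] to output row `row`.
        y[row] += value * x[col]
        if row != col:
            # The same value stands in for A[col][row] (lower triangle).
            y[col] += value * x[row]
    return y


# Illustrative 3x3 symmetric matrix (made-up values):
# [[4, 1, 0],
#  [1, 3, 2],
#  [0, 2, 5]]
upper = [
    (0, 0, 4.0), (0, 1, 1.0),
    (1, 1, 3.0), (1, 2, 2.0),
    (2, 2, 5.0),
]
x = [1.0, 2.0, 3.0]
y = symmetric_spmv(3, upper, x)
```

Only five of the seven nonzero entries are stored and read, which is the sense in which symmetry raises the amount of computation per unit of input data.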


FPGA Acceleration Of A Cortical And A Matched Filter-Based Algorithm, Kenneth Rice Jul 2008

All Theses

Digital image processing is a widely used and diverse field. It is used in a broad array of areas such as tracking and detection, object avoidance, computer vision, and numerous other applications. For many image processing tasks, the computations can become time-consuming. Therefore, a means for accelerating the computations would be beneficial. Using that as motivation, this thesis examines the acceleration of two distinctly different image processing applications. The first image processing application examined is a recent neocortex-inspired cognitive model geared towards pattern recognition as seen in the visual cortex. For this model, both software and reconfigurable logic …


A Configurable Decoder For Pin-Limited Applications, Matthew Collin Jordan Jan 2006

LSU Master's Theses

Pin limitation is the restriction imposed on an IC chip by the unavailability of a sufficient number of I/O pins. This impacts the design and performance of the chip, as the amount of information that can be passed through the boundary of the chip becomes limited. One area that would benefit from a reduction of the effect of pin limitation is reconfigurable architectures. In this work, we consider reconfigurable devices called Field Programmable Gate Arrays (FPGAs). Due to pin limitation, current FPGAs use a form of 1-hot decoder to select elements (one frame at a time) during partial reconfiguration. This …


Designing Switches For Routing In Circuit-Switched Trees, Dinesh Prasad Venkat Rao Jan 2006

LSU Master's Theses

Reconfigurable computing provides a fast and flexible solution for intensive computing processes. Thus, it acts as a bridge between software-controlled and hardware-based processors. The self-reconfigurable gate array (SRGA) is a reconfigurable architecture that allows fast switching between operations on a reconfigurable device. It consists of a 2-dimensional array of processing elements (PEs) connected using a binary tree structure, called a circuit-switched tree (CST). A CST is a balanced binary tree in which leaves represent processing elements (PEs) and internal nodes represent switches. The PEs in the CST communicate with each other by configuring the appropriate switches in the …