Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Articles 1 - 20 of 20

Full-Text Articles in Computer Engineering

Digital Simulations Of Memristors Towards Integration With Reconfigurable Computing, Ivris Raymond May 2023

Computer Science and Computer Engineering Undergraduate Honors Theses

The end of Moore’s Law has been predicted for decades. Demand for greater parallel computational performance has grown with advances in machine learning. This past decade has demonstrated the ever-increasing creativity and effort necessary to extract scaling improvements in CMOS fabrication processes. However, CMOS scaling is nearing its fundamental physical limits. A viable path for increasing performance is to break the von Neumann bottleneck. In-memory computing using emerging memory technologies (e.g., ReRAM, STT-MRAM) offers a potential path beyond the end of Moore’s Law. However, there is currently very little support from industry tools for designers wishing to incorporate …


An Optimized And Scalable Blockchain-Based Distributed Learning Platform For Consumer IoT, Zhaocheng Wang, Xueying Liu, Xinming Shao, Abdullah Alghamdi, Md. Shirajum Munir, Sujit Biswas Jan 2023

School of Cybersecurity Faculty Publications

Consumer Internet of Things (CIoT) manufacturers seek customer feedback to enhance their products and services, creating a smart ecosystem, like a smart home. Due to security and privacy concerns, blockchain-based federated learning (BCFL) ecosystems can let CIoT manufacturers update their machine learning (ML) models using end-user data. Federated learning (FL) uses privacy-preserving ML techniques to forecast customers' needs and consumption habits, and blockchain replaces the centralized aggregator to safeguard the ecosystem. However, blockchain technology (BCT) struggles with scalability and quick ledger expansion. In BCFL, local model generation and secure aggregation are other issues. This research introduces a novel architecture, emphasizing …


Lecture 06: The Impact Of Computer Architectures On The Design Of Algebraic Multigrid Methods, Ulrike Yang Apr 2021

Mathematical Sciences Spring Lecture Series

Algebraic multigrid (AMG) is a popular iterative solver and preconditioner for large sparse linear systems. When designed well, it is algorithmically scalable, enabling it to solve increasingly larger systems efficiently. While it consists of various highly parallel building blocks, the original method also consisted of various highly sequential components. A large amount of research has been performed over several decades to design new components that perform well on high performance computers. As a matter of fact, AMG has been shown to scale well to more than a million processes. However, with single-core speeds plateauing, future increases in computing performance need to …


Otter Vector Extension, Alexis A. Peralta Jun 2020

Computer Engineering

This paper offers an implementation of a subset of the "RISC-V 'V' Vector Extension", v0.7.x. The "RISC-V 'V' Vector Extension" is the proposed vector instruction set for the open-source RISC-V architecture. Vectors are inherently data-parallel, allowing for significant performance increases. Vectors have applications in fields such as cryptography, graphics, and machine learning. A vector processing unit was added to Cal Poly's RISC-V multi-cycle architecture, known as the OTTER. Computationally intensive programs running on the OTTER Vector Extension ran over three times faster than on the baseline multi-cycle implementation. Memory-intensive applications saw similar performance increases.


The Thermal-Constrained Real-Time Systems Design On Multi-Core Platforms -- An Analytical Approach, Shi Sha Mar 2018

FIU Electronic Theses and Dissertations

Over the past decades, shrinking transistor sizes have enabled more transistors to be integrated into an IC chip, achieving ever higher computing performance. However, the semiconductor industry is now reaching a saturation point of Moore’s Law, largely due to soaring power consumption and heat dissipation, among other factors. High chip temperature not only significantly increases packaging/cooling cost and degrades system performance and reliability, but also increases energy consumption and can even damage the chip permanently. Although designing 2D and even 3D multi-core processors helps to lower the power/thermal barrier for single-core architectures by exploring the thread/process level parallelism, the …


On High-Performance Parallel Fixed-Point Decimal Multiplier Designs, Ming Zhu Dec 2013

UNLV Theses, Dissertations, Professional Papers, and Capstones

High-performance, area-efficient hardware implementation of decimal multiplication is preferred to slow software simulations in a number of key scientific and financial application areas, where errors caused by converting decimal numbers into their approximate binary representations are not acceptable.

Multi-digit parallel decimal multipliers involve two major stages: (i) the partial product generation (PPG) stage, where decimal partial products are determined by selecting the right versions of the pre-computed multiples of the multiplicand, followed by (ii) the partial product accumulation (PPA) stage, where all the partial products are shifted and then added together to obtain the final multiplication product. In this thesis, …
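As a rough software illustration of the two stages named above (not the hardware design the thesis proposes, whose details are truncated here), the following Python sketch pre-computes the multiples 0× to 9× of the multiplicand, selects one per multiplier digit (PPG), then shifts and sums them (PPA). The function name `decimal_multiply` and the restriction to non-negative integers are assumptions made for the example.

```python
def decimal_multiply(multiplicand: int, multiplier: int) -> int:
    """Multiply two non-negative integers digit by digit in base 10."""
    if multiplicand < 0 or multiplier < 0:
        raise ValueError("this sketch handles non-negative operands only")

    # Pre-computed multiples 0x .. 9x of the multiplicand.
    multiples = [multiplicand * d for d in range(10)]

    # PPG: pick the matching multiple for each multiplier digit,
    # least significant digit first.
    digits = [int(ch) for ch in reversed(str(multiplier))]
    partial_products = [multiples[d] for d in digits]

    # PPA: shift each partial product by its decimal position and add.
    return sum(pp * 10**pos for pos, pp in enumerate(partial_products))
```

In hardware the partial products are generated in parallel and accumulated by an adder tree; the sequential `sum` here only mirrors the arithmetic, not the timing.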


Evaluating The Presence Of A Victim Cache On An ARM Processor, Lakshmi Vidya Peri Sep 2013

Computer Science Graduate Projects and Theses

A mobile processor is a CPU designed to save power. It is found in mobile computers and cell phones. A CPU chip designed for portable computers is typically housed in a smaller chip package, but more importantly, in order to run cooler, it uses lower voltages than its desktop counterpart and has more "sleep mode" capability. A mobile processor can be throttled down to different power levels, and/or sections of the chip can be turned off entirely when not in use. ARM is a 32-bit reduced instruction set computer (RISC) instruction set architecture (ISA). The relative simplicity of ARM processors makes …


NATO Human View Architecture And Human Networks, Holly A. H. Handley, Nancy P. Houston Mar 2010

Engineering Management & Systems Engineering Faculty Publications

The NATO Human View is a system architectural viewpoint that focuses on the human as part of a system. Its purpose is to capture the human requirements and to inform on how the human impacts the system design. The viewpoint contains seven static models that include different aspects of the human element, such as roles, tasks, constraints, training and metrics. It also includes a Human Dynamics component to perform simulations of the human system under design. One of the static models, termed Human Networks, focuses on the human-to-human communication patterns that occur as a result of ad hoc or deliberate …


Research Poster: Survey Of Environmental Data Portals: Features And Characteristics, David Walker Feb 2010

2010 Annual Nevada NSF EPSCoR Climate Change Conference

Research poster


Tailored Systems Architecture For Design Of Space Science And Technology Missions Using DoDAF V2.0, Nicholas J. Merski Dec 2009

Theses and Dissertations

The use of systems architecture, following a set of integrated descriptions from an architecture framework, has been well codified in Department of Defense acquisition and systems engineering. However, in the Space Science and Technology (S&T) community, this guidance and practice is not commonly adopted. This paper outlines an approach to leverage the changes made in DoD Architecture Framework 2.0 (DoDAF2.0), and the renewed emphasis on data and support to acquisition decision analysis. After decomposing the Space S&T design lifecycle into phases, design milestones and activities using process models, a set of DoDAF prescribed and Fit-for-Purpose views are constructed into a …


Methodology For Value-Driven Enterprise Architecture Development Goals: Application To DoDAF Framework, Justin W. Osgood Mar 2009

Theses and Dissertations

The Department of Defense Architectural Framework (DoDAF) describes 29 distinct views but offers limited guidance on view selection to meet system needs. This research extends the Value-Driven Enterprise Architecture Score (VDEA-Score) from a descriptive, evaluation protocol toward a prescriptive one by evaluating each DoDAF view and its contribution to the overall objective of the completed architecture. This extension of VDEA is referred to as VDEA-Development Goals (VDEA-DG). The program manager or other decision-makers may use this insight to justify the allocation of resources to the development of specific architecture views considered to provide maximum value. This research provides insight into …


A Hardware Framework For Yield And Reliability Enhancement In Chip Multiprocessors, Abhisek Pan Jan 2009

Masters Theses 1911 - February 2014

Device reliability and manufacturability have emerged as dominant concerns in end-of-road CMOS devices. Today an increasing number of hardware failures are attributed to device reliability problems that cause partial system failure or shutdown. Also, maintaining an acceptable manufacturing yield is seen as a challenge because of smaller feature sizes, process variation, and reduced headroom for burn-in tests. In this project we investigate a hardware-based scheme for improving yield and reliability of a homogeneous chip multiprocessor (CMP). The proposed solution involves a hardware framework that enables us to utilize the redundancies inherent in a multi-core system to keep the system operational in …


Dynamic Task Prediction For An SPMT Architecture Based On Control Independence, Komal Jothi Jan 2009

Dissertations and Theses

Extracting better performance from computer programs translates to finding more instructions to execute in parallel. Since most general-purpose programs are written in an imperative, sequential manner, closely lying instructions are always data dependent, making the designer look far ahead into the program for parallelism. This necessitates wider superscalar processors with larger instruction windows. But superscalars suffer from three key limitations: their inability to scale, a sequential fetch bottleneck, and a high branch misprediction penalty. Recent studies indicate that current superscalars have reached the end of the road and designers will have to look for newer ideas to build computer processors.

Speculative …


Exploring Hardware Based Primitives To Enhance Parallel Security Monitoring In A Novel Computing Architecture, Stephen D. Mott Mar 2007

Theses and Dissertations

This research explores how hardware-based primitives can be implemented to perform security-related monitoring in real-time, offer better security, and increase performance compared to software-based approaches. In doing this, we propose a novel computing architecture, derived from a contemporary shared memory architecture, that facilitates efficient security-related monitoring in real-time, while keeping the monitoring hardware itself safe from attack. This architecture is flexible, allowing security to be tailored based on the needs of the system. We have developed a number of hardware-based primitives that fit into this architecture to provide a wide array of monitoring capabilities. A number of these primitives provide …


Fault And Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices, George R. Roelke IV Sep 2006

Theses and Dissertations

This research addresses the design of a reliable computer from unreliable device technologies. A system architecture is developed for a "fault and defect tolerant" (FDT) computer. Trade-offs between different techniques are studied and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 × 10^-6, three orders of magnitude better than non-fault-tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10^-6. The required …


A Reconfigurable Superscalar Architecture, Christopher B. Mayer Dec 1997

Theses and Dissertations

The invention of the Field Programmable Gate Array (FPGA) has led to a number of interesting developments. One is the idea of providing custom hardware support for applications running on a computer. These reconfigurable computers have been shown to decrease the execution time for some applications. Based on past results, attention has subsequently turned to using reconfigurable computing in general-purpose computers (e.g. desktop and workstation environments). This thesis develops a design for just such a computer. The design, FPGADLX, is based on a hypothetical superscalar computer running the DLX instruction set and is generic enough in principle to be adapted …


A Toolkit For Specializing Production Operating System Code, Crispin Cowan, Dylan McNamee, Andrew P. Black, Calton Pu, Jonathan Walpole, Charles Krasic, Perry Wagle, Qian Zhang Jun 1997

Computer Science Faculty Publications and Presentations

Specialization has been recognized as a powerful technique for optimizing operating systems. However, specialization has not been broadly applied beyond the research community because the current techniques, based on manual specialization, are time-consuming and error-prone. This paper describes a specialization toolkit that should help broaden the applicability of specializing operating systems by assisting in the automatic generation of specialized code, and guarding the specialized code to ensure the specialized system continues to be correct. We demonstrate the effectiveness of the toolkit by describing experiences we have had applying it in real, production environments. We report on our experiences with …


Design And Implementation Of High-Radix Arithmetic Systems Based On The SDNR/RNS Data Representation, Paul Whyte Jan 1997

Theses : Honours

This project involved the design and implementation of high-radix arithmetic systems based on the hybrid SDNR/RNS data representation. Some real-time applications require a real-time arithmetic system. An SDNR/RNS arithmetic system provides parallel, real-time processing. The advantages and disadvantages of high-radix SDNR/RNS arithmetic, and the feasibility of implementing SDNR/RNS arithmetic systems in CMOS VLSI technology, were investigated in this project. A common methodological model, which included the stages of analysis, design, implementation, testing, and simulation, was followed. The combination of the SDNR and RNS transforms potentially complex logic networks into simpler logic blocks. It was found that when constructing an SDNR/RNS …
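To illustrate why a residue number system (RNS) lends itself to parallel arithmetic, here is a minimal Python sketch of the RNS half only; the signed-digit (SDNR) encoding of each residue, and the thesis's actual moduli, are not shown in the abstract. The moduli `(3, 5, 7)` and all function names are assumptions chosen for the example. Each residue channel is computed independently, with no carries between channels, and the Chinese Remainder Theorem recovers the integer result.

```python
from math import prod

MODULI = (3, 5, 7)  # hypothetical pairwise-coprime moduli; range is 0..104


def to_rns(x: int) -> tuple:
    """Encode an integer as its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a: tuple, b: tuple) -> tuple:
    """Add channel by channel; no carry crosses between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def rns_mul(a: tuple, b: tuple) -> tuple:
    """Multiply channel by channel, again with independent channels."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues: tuple) -> int:
    """Decode with the Chinese Remainder Theorem (needs Python 3.8+)."""
    full_range = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        partial = full_range // m
        # pow(partial, -1, m) is the modular inverse of partial mod m.
        total += r * partial * pow(partial, -1, m)
    return total % full_range
```

Results are only meaningful while the true value stays below the product of the moduli (105 here); a practical system would pick moduli to cover its required dynamic range.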


A Study Of Dynamic Optimization Techniques: Lessons And Directions In Kernel Design, Calton Pu, Jonathan Walpole Jan 1993

Computer Science Faculty Publications and Presentations

The Synthesis kernel [21,22,23,27,28] showed that dynamic code generation, software feedback, and fine-grain modular kernel organization are useful implementation techniques for improving the performance of operating system kernels. In addition, and perhaps more importantly, we discovered that there are strong interactions between the techniques. Hence, a careful and systematic combination of the techniques can be very powerful even though each one by itself may have serious limitations. By identifying these interactions we illustrate the problems of applying each technique in isolation to existing kernels. We also highlight the important common underpinnings of the Synthesis experience and present our ideas on …


Parallel Architectures For Solving Combinatorial Problems Of Logic Design, Phuong Minh Ho Jan 1989

Dissertations and Theses

This thesis presents a new, practical approach to solving various NP-hard combinatorial problems of logic synthesis, logic programming, graph theory and related areas. A problem to be solved is reduced in polynomial time to one of several generic combinatorial problems which can be expressed in the form of the Generalized Propositional Formula (GPF): a Boolean product of clauses, where each clause is a sum of products of negated or non-negated literals.
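The GPF structure described above can be made concrete with a small Python sketch. The nested-list representation, the names `evaluate_gpf` and `find_satisfying_assignment`, and the exhaustive search are all assumptions for illustration; the brute-force loop is exponential in the number of variables and is not the parallel architecture the thesis develops.

```python
from itertools import product

# Assumed representation:
#   literal = (variable_name, is_positive)
#   term    = list of literals (a Boolean product)
#   clause  = list of terms    (a Boolean sum of products)
#   formula = list of clauses  (a Boolean product of clauses)


def evaluate_gpf(formula, assignment):
    """Return True if every clause has at least one fully satisfied term."""
    return all(
        any(
            all(assignment[var] == is_positive for var, is_positive in term)
            for term in clause
        )
        for clause in formula
    )


def find_satisfying_assignment(formula):
    """Exhaustively search for a satisfying assignment, or return None."""
    variables = sorted(
        {var for clause in formula for term in clause for var, _ in term}
    )
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if evaluate_gpf(formula, assignment):
            return assignment
    return None


# (a AND NOT b  OR  c)  AND  (NOT a)
example = [
    [[("a", True), ("b", False)], [("c", True)]],
    [[("a", False)]],
]
```

Ordinary conjunctive normal form is the special case in which every term contains exactly one literal.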