Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 30 of 31

Full-Text Articles in Computer Engineering

An Efficient Privacy-Preserving Framework For Video Analytics, Tian Zhou Mar 2024

An Efficient Privacy-Preserving Framework For Video Analytics, Tian Zhou

Doctoral Dissertations

With the proliferation of video content from surveillance cameras, social media, and live streaming services, the need for efficient video analytics has grown immensely. In recent years, machine learning based computer vision algorithms have shown great success in various video analytic tasks. Specifically, neural network models have dominated in visual tasks such as image and video classification, object recognition, object detection, and object tracking. However, compared with classic computer vision algorithms, machine learning based methods are usually much more compute-intensive. Powerful servers are required by many state-of-the-art machine learning models. With the development of cloud computing infrastructures, people are able …


Improving The Programmability Of Networked Energy Systems, Noman Bashir Jun 2022

Improving The Programmability Of Networked Energy Systems, Noman Bashir

Doctoral Dissertations

Global warming and climate change have underscored the need for designing sustainable energy systems. Sustainable energy systems, e.g., smart grids, green data centers, differ from the traditional systems in significant ways and present unique challenges to system designers and operators. First, intermittent renewable energy resources power these systems, which break the notion of infinite, reliable, and controllable power supply. Second, these systems come in varying sizes, spanning over large geographical regions. The control of these dispersed and diverse systems raises scalability challenges. Third, the performance modeling and fault detection in sustainable energy systems is still an active research area. Finally, …


Profile-Guided Data Management For Heterogeneous Memory Systems, Matthew B. Olson Dec 2021

Profile-Guided Data Management For Heterogeneous Memory Systems, Matthew B. Olson

Doctoral Dissertations

Market forces and technological constraints have led to a gap between CPU and memory performance that has widened for decades. While processor scaling has plateaued in recent years, this gap persists and is not expected to diminish for the foreseeable future. This discrepancy presents a host of challenges for scaling application performance, which have only been exacerbated in recent years, as increasing demands for fast and effective data analytics are driving memory energy, bandwidth, and capacity requirements to new heights.

To address these trends, hardware architects have introduced a plethora of memory technologies. For example, most modern memory systems include …


Toward Reliable And Efficient Message Passing Software For Hpc Systems: Fault Tolerance And Vector Extension, Dong Zhong Aug 2021

Toward Reliable And Efficient Message Passing Software For Hpc Systems: Fault Tolerance And Vector Extension, Dong Zhong

Doctoral Dissertations

As the scale of High-performance Computing (HPC) systems continues to grow, researchers are devoted themselves to achieve the best performance of running long computing jobs on these systems. My research focus on reliability and efficiency study for HPC software.

First, as systems become larger, mean-time-to-failure (MTTF) of these HPC systems is negatively impacted and tends to decrease. Handling system failures becomes a prime challenge. My research aims to present a general design and implementation of an efficient runtime-level failure detection and propagation strategy targeting large-scale, dynamic systems that is able to detect both node and process failures. Using multiple overlapping …


Human Fatigue Predictions In Complex Aviation Crew Operational Impact Conditions, Suresh Rangan May 2021

Human Fatigue Predictions In Complex Aviation Crew Operational Impact Conditions, Suresh Rangan

Doctoral Dissertations

In this last decade, several regulatory frameworks across the world in all modes of transportation had brought fatigue and its risk management in operations to the forefront. Of all transportation modes air travel has been the safest means of transportation. Still as part of continuous improvement efforts, regulators are insisting the operators to adopt strong fatigue science and its foundational principles to reinforce safety risk assessment and management. Fatigue risk management is a data driven system that finds a realistic balance between safety and productivity in an organization. This work discusses the effects of mathematical modeling of fatigue and its …


Addressing Security Challenges In Embedded Systems And Multi-Tenant Fpgas, Georgios Provelengios Apr 2021

Addressing Security Challenges In Embedded Systems And Multi-Tenant Fpgas, Georgios Provelengios

Doctoral Dissertations

Embedded systems and field-programmable gate arrays (FPGAs) have become crucial parts of the infrastructure that supports our modern technological world. Given the multitude of threats that are present, the need for secure computing systems is undeniably greater than ever. Embedded systems and FPGAs are governed by characteristics that create unique security challenges and vulnerabilities. Despite their array of uses, embedded systems are often built with modest microprocessors that do not support the conventional security solutions used by workstations, such as virus scanners. In the first part of this dissertation, a microprocessor defense mechanism that uses a hardware monitor to protect …


Design And Implementation Of Path Finding And Verification In The Internet, Hao Cai Jul 2020

Design And Implementation Of Path Finding And Verification In The Internet, Hao Cai

Doctoral Dissertations

In the Internet, network traffic between endpoints typically follows one path that is determined by the control plane. Endpoints have little control over the choice of which path their network traffic takes and little ability to verify if the traffic indeed follows a specific path. With the emergence of software-defined networking (SDN), more control over connections can be exercised, and thus the opportunity for novel solutions exists. However, there remain concerns about the attack surface exposed by fine-grained control, which may allow attackers to inject and redirect traffic. To address these opportunities and concerns, we consider two specific challenges: (1) …


Towards Optimized Traffic Provisioning And Adaptive Cache Management For Content Delivery, Aditya Sundarrajan Mar 2020

Towards Optimized Traffic Provisioning And Adaptive Cache Management For Content Delivery, Aditya Sundarrajan

Doctoral Dissertations

Content delivery networks (CDNs) deploy hundreds of thousands of servers around the world to cache and serve trillions of user requests every day for a diverse set of content such as web pages, videos, software downloads and images. In this dissertation, we propose algorithms to provision traffic across cache servers and manage the content they host to achieve performance objectives such as maximizing the cache hit rate, minimizing the bandwidth cost of the network and minimizing the energy consumption of the servers. Traffic provisioning is the process of determining the set of content domains hosted on the servers. We propose …


Trustworthy Systems And Protocols For The Internet Of Things, Arman Pouraghily Mar 2020

Trustworthy Systems And Protocols For The Internet Of Things, Arman Pouraghily

Doctoral Dissertations

Processor-based embedded systems are integrated into many aspects of everyday life such as industrial control, automotive systems, healthcare, the Internet of Things, etc. As Moore’s law progresses, these embedded systems have moved from simple microcontrollers to full-scale embedded computing systems with multiple processor cores and operating systems support. At the same time, the security of these devices has also become a key concern. Our main focus in this work is the security and privacy of the embedded systems used in IoT systems. In the first part of this work, we take a look at the security of embedded systems from …


Qoe-Aware Content Distribution Systems For Adaptive Bitrate Video Streaming, Divyashri Bhat Mar 2020

Qoe-Aware Content Distribution Systems For Adaptive Bitrate Video Streaming, Divyashri Bhat

Doctoral Dissertations

A prodigious increase in video streaming content along with a simultaneous rise in end system capabilities has led to the proliferation of adaptive bit rate video streaming users in the Internet. Today, video streaming services range from Video-on-Demand services like traditional IP TV to more recent technologies such as immersive 3D experiences for live sports events. In order to meet the demands of these services, the multimedia and networking research community continues to strive toward efficiently delivering high quality content across the Internet while also trying to minimize content storage and delivery costs. The introduction of flexible and adaptable technologies …


A Parallel Direct Method For Finite Element Electromagnetic Computations Based On Domain Decomposition, Javad Moshfegh Nov 2019

A Parallel Direct Method For Finite Element Electromagnetic Computations Based On Domain Decomposition, Javad Moshfegh

Doctoral Dissertations

High performance parallel computing and direct (factorization-based) solution methods have been the two main trends in electromagnetic computations in recent years. When time-harmonic (frequency-domain) Maxwell's equation are directly discretized with the Finite Element Method (FEM) or other Partial Differential Equation (PDE) methods, the resulting linear system of equations is sparse and indefinite, thus harder to efficiently factorize serially or in parallel than alternative methods e.g. integral equation solutions, that result in dense linear systems. State-of-the-art sparse matrix direct solvers such as MUMPS and PARDISO don't scale favorably, have low parallel efficiency and high memory footprint. This work introduces a new …


Cmos Compatible Memristor Networks For Brain-Inspired Computing, Can Li Nov 2018

Cmos Compatible Memristor Networks For Brain-Inspired Computing, Can Li

Doctoral Dissertations

In the past decades, the computing capability has shown an exponential growth trend, which is observed as Moore’s law. However, this growth speed is slowing down in recent years mostly because the down-scaled size of transistors is approaching their physical limit. On the other hand, recent advances in software, especially in big data analysis and artificial intelligence, call for a break-through in computing hardware. The memristor, or the resistive switching device, is believed to be a potential building block of the future generation of integrated circuits. The underlying mechanism of this device is different from that of complementary metal-oxide-semiconductor (CMOS) …


Transiency-Driven Resource Management For Cloud Computing Platforms, Prateek Sharma Oct 2018

Transiency-Driven Resource Management For Cloud Computing Platforms, Prateek Sharma

Doctoral Dissertations

Modern distributed server applications are hosted on enterprise or cloud data centers that provide computing, storage, and networking capabilities to these applications. These applications are built using the implicit assumption that the underlying servers will be stable and normally available, barring for occasional faults. In many emerging scenarios, however, data centers and clouds only provide transient, rather than continuous, availability of their servers. Transiency in modern distributed systems arises in many contexts, such as green data centers powered using renewable intermittent sources, and cloud platforms that provide lower-cost transient servers which can be unilaterally revoked by the cloud operator. Transient …


Hybrid Black-Box Solar Analytics And Their Privacy Implications, Dong Chen Oct 2018

Hybrid Black-Box Solar Analytics And Their Privacy Implications, Dong Chen

Doctoral Dissertations

The aggregate solar capacity in the U.S. is rising rapidly due to continuing decreases in the cost of solar modules. For example, the installed cost per Watt (W) for residential photovoltaics (PVs) decreased by 6X from 2009 to 2018 (from $8/W to $1.2/W), resulting in the installed aggregate solar capacity increasing 128X from 2009 to 2018 (from 435 megawatts to 55.9 gigawatts). This increasing solar capacity is imposing operational challenges on utilities in balancing electricity's real-time supply and demand, as solar generation is more stochastic and less predictable than aggregate demand. To address this problem, both academia and utilities have …


An Architecture Evaluation And Implementation Of A Soft Gpgpu For Fpgas, Kevin Andryc Oct 2018

An Architecture Evaluation And Implementation Of A Soft Gpgpu For Fpgas, Kevin Andryc

Doctoral Dissertations

Embedded and mobile systems must be able to execute a variety of different types of code, often with minimal available hardware. Many embedded systems now come with a simple processor and an FPGA, but not more energy-hungry components, such as a GPGPU. In this dissertation we present FlexGrip, a soft architecture which allows for the execution of GPGPU code on an FPGA without the need to recompile the design. The architecture is optimized for FPGA implementation to effectively support the conditional and thread-based execution characteristics of GPGPU execution without FPGA design recompilation. This architecture supports direct CUDA compilation to a …


Adaft: A Resource-Efficient Framework For Adaptive Fault-Tolerance In Cyber-Physical Systems, Ye Xu Nov 2017

Adaft: A Resource-Efficient Framework For Adaptive Fault-Tolerance In Cyber-Physical Systems, Ye Xu

Doctoral Dissertations

Cyber-physical systems frequently have to use massive redundancy to meet application requirements for high reliability. While such redundancy is required, it can be activated adaptively, based on the current state of the controlled plant. Most of the time the physical plant is in a state that allows for a lower level of fault-tolerance. Avoiding the continuous deployment of massive fault-tolerance will greatly reduce the workload of CPSs. In this dissertation, we demonstrate a software simulation framework (AdaFT) that can automatically generate the sub-spaces within which our adaptive fault-tolerance can be applied. We also show the theoretical benefits of AdaFT, and …


Wide-Area Measurement-Driven Approaches For Power System Modeling And Analytics, Hesen Liu Aug 2017

Wide-Area Measurement-Driven Approaches For Power System Modeling And Analytics, Hesen Liu

Doctoral Dissertations

This dissertation presents wide-area measurement-driven approaches for power system modeling and analytics. Accurate power system dynamic models are the very basis of power system analysis, control, and operation. Meanwhile, phasor measurement data provide first-hand knowledge of power system dynamic behaviors. The idea of building out innovative applications with synchrophasor data is promising.

Taking advantage of the real-time wide-area measurements, one of phasor measurements’ novel applications is to develop a synchrophasor-based auto-regressive with exogenous inputs (ARX) model that can be updated online to estimate or predict system dynamic responses.

Furthermore, since auto-regressive models are in a big family, the ARX model …


Achieving High Reliability And Efficiency In Maintaining Large-Scale Storage Systems Through Optimal Resource Provisioning And Data Placement, Lipeng Wan Aug 2016

Achieving High Reliability And Efficiency In Maintaining Large-Scale Storage Systems Through Optimal Resource Provisioning And Data Placement, Lipeng Wan

Doctoral Dissertations

With the explosive increase in the amount of data being generated by various applications, large-scale distributed and parallel storage systems have become common data storage solutions and been widely deployed and utilized in both industry and academia. While these high performance storage systems significantly accelerate the data storage and retrieval, they also bring some critical issues in system maintenance and management. In this dissertation, I propose three methodologies to address three of these critical issues.

First, I develop an optimal resource management and spare provisioning model to minimize the impact brought by component failures and ensure a highly operational experience …


Arithmetic Logic Unit Architectures With Dynamically Defined Precision, Getao Liang Dec 2015

Arithmetic Logic Unit Architectures With Dynamically Defined Precision, Getao Liang

Doctoral Dissertations

Modern central processing units (CPUs) employ arithmetic logic units (ALUs) that support statically defined precisions, often adhering to industry standards. Although CPU manufacturers highly optimize their ALUs, industry standard precisions embody accuracy and performance compromises for general purpose deployment. Hence, optimizing ALU precision holds great potential for improving speed and energy efficiency. Previous research on multiple precision ALUs focused on predefined, static precisions. Little previous work addressed ALU architectures with customized, dynamically defined precision. This dissertation presents approaches for developing dynamic precision ALU architectures for both fixed-point and floating-point to enable better performance, energy efficiency, and numeric accuracy. These new …


Skybridge: A New Nanoscale 3-D Computing Framework For Future Integrated Circuits, Mostafizur Rahman Nov 2015

Skybridge: A New Nanoscale 3-D Computing Framework For Future Integrated Circuits, Mostafizur Rahman

Doctoral Dissertations

Continuous scaling of CMOS has been the major catalyst in miniaturization of integrated circuits (ICs) and crucial for global socio-economic progress. However, continuing the traditional way of scaling to sub-20nm technologies is proving to be very difficult as MOSFETs are reaching their fundamental performance limits [1] and interconnection bottleneck is dominating IC operational power and performance [2]. Migrating to 3-D, as a way to advance scaling, has been elusive due to inherent customization and manufacturing requirements in CMOS architecture that are incompatible with 3-D organization. Partial attempts with die-die [3] and layer-layer [4] stacking have their own limitations [5]. We …


On Thermal Sensor Calibration And Software Techniques For Many-Core Thermal Management, Shiting Lu Nov 2015

On Thermal Sensor Calibration And Software Techniques For Many-Core Thermal Management, Shiting Lu

Doctoral Dissertations

The high power density of a many-core processor results in increased temperature which negatively impacts system reliability and performance. Dynamic thermal management applies thermal-aware techniques at run time to avoid overheating using temperature information collected from on-chip thermal sensors. Temperature sensing and thermal control schemes are two critical technologies for successfully maintaining thermal safety. In this dissertation, on-line thermal sensor calibration schemes are developed to provide accurate temperature information. Software-based dynamic thermal management techniques are proposed using calibrated thermal sensors. Due to process variation and silicon aging, on-chip thermal sensors require periodic calibration before use in DTM. However, the calibration …


Design And Implementation Of An Economy Plane For The Internet, Xinming Chen Nov 2015

Design And Implementation Of An Economy Plane For The Internet, Xinming Chen

Doctoral Dissertations

The Internet has been very successful in supporting many network applications. As the diversity of uses for the Internet has increased, many protocols and services have been developed by the industry and the research community. However, many of them failed to get deployed in the Internet. One challenge of deploying these novel ideas in operational network is that the network providers need to be involved in the process. Many novel network protocols and services, like multicast and end-to-end QoS, need the support from network providers. However, since network providers are typically driven by business reasons, if they can not get …


Physically Equivalent Intelligent Systems For Reasoning Under Uncertainty At Nanoscale, Santosh Khasanvis Nov 2015

Physically Equivalent Intelligent Systems For Reasoning Under Uncertainty At Nanoscale, Santosh Khasanvis

Doctoral Dissertations

Machines today lack the inherent ability to reason and make decisions, or operate in the presence of uncertainty. Machine-learning methods such as Bayesian Networks (BNs) are widely acknowledged for their ability to uncover relationships and generate causal models for complex interactions. However, their massive computational requirement, when implemented on conventional computers, hinders their usefulness in many critical problem areas e.g., genetic basis of diseases, macro finance, text classification, environment monitoring, etc. We propose a new non-von Neumann technology framework purposefully architected across all layers for solving these problems efficiently through physical equivalence, enabled by emerging nanotechnology. The architecture builds …


Reliable And Efficient Multithreading, Tongping Liu Aug 2014

Reliable And Efficient Multithreading, Tongping Liu

Doctoral Dissertations

The advent of multicore architecture has increased the demand for multithreaded programs. It is notoriously far more challenging to write parallel programs correctly and efficiently than sequential ones because of the wide range of concurrency errors and performance problems. In this thesis, I developed a series of runtime systems and tools to combat concurrency errors and performance problems of multithreaded programs. The first system, Dthreads, automatically ensures determinism for unmodified C/C++ applications using the pthreads library without requiring programmer intervention and hardware support. Dthreads greatly simplifies the understanding and debugging of multithreaded programs. Dthreads often matches or even exceeds the …


Parallel Multi-Core Verilog Hdl Simulation, Tariq B. Ahmad Aug 2014

Parallel Multi-Core Verilog Hdl Simulation, Tariq B. Ahmad

Doctoral Dissertations

In the era of multi-core computing, the push for creating true parallel applications that can run on individual CPUs is on the rise. Application of parallel discrete event simulation (PDES) to hardware design verification looks promising, given the complexity of today’s hardware designs. Unfortunately, the challenges imposed by lack of inherent parallelism, suboptimal design partitioning, synchronization and communication overhead, and load balancing, render this approach largely ineffective. This thesis presents three techniques for accelerating simulation at three levels of abstraction namely, RTL, functional gate-level (zero-delay) and gate-level timing. We review contemporary solutions and then propose new ways of speeding up …


Exploring Computational Chemistry On Emerging Architectures, David Dewayne Jenkins Dec 2012

Exploring Computational Chemistry On Emerging Architectures, David Dewayne Jenkins

Doctoral Dissertations

Emerging architectures, such as next generation microprocessors, graphics processing units, and Intel MIC cards, are being used with increased popularity in high performance computing. Each of these architectures has advantages over previous generations of architectures including performance, programmability, and power efficiency. With the ever-increasing performance of these architectures, scientific computing applications are able to attack larger, more complicated problems. However, since applications perform differently on each of the architectures, it is difficult to determine the best tool for the job. This dissertation makes the following contributions to computer engineering and computational science. First, this work implements the computational chemistry variational …


Parallel For Loops On Heterogeneous Resources, Frederick Edward Weber Dec 2012

Parallel For Loops On Heterogeneous Resources, Frederick Edward Weber

Doctoral Dissertations

In recent years, Graphics Processing Units (GPUs) have piqued the interest of researchers in scientific computing. Their immense floating point throughput and massive parallelism make them ideal for not just graphical applications, but many general algorithms as well. Load balancing applications and taking advantage of all computational resources in a machine is a difficult challenge, especially when the resources are heterogeneous. This dissertation presents the clUtil library, which vastly simplifies developing OpenCL applications for heterogeneous systems. The core focus of this dissertation lies in clUtil's ParallelFor construct and our novel PINA scheduler which can efficiently load balance work onto multiple …


Kernel-Assisted And Topology-Aware Mpi Collective Communication Among Multicore Or Many-Core Clusters, Teng Ma Dec 2012

Kernel-Assisted And Topology-Aware Mpi Collective Communication Among Multicore Or Many-Core Clusters, Teng Ma

Doctoral Dissertations

Multicore or many-core clusters have become the most prominent form of High Performance Computing (HPC) systems. Hardware complexity and hierarchies not only exist in the inter-node layer, i.e., hierarchical networks, but also exist in internals of multicore compute nodes, e.g., Non Uniform Memory Accesses (NUMA), network-style interconnect, and memory and shared cache hierarchies.

Message Passing Interface (MPI), the most widely adopted in the HPC communities, suffers from decreased performance and portability due to increased hardware complexity of multiple levels. We identified three critical issues specific to collective communication: The first problem arises from the gap between logical collective topologies and …


Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim Yarkhan Dec 2012

Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim Yarkhan

Doctoral Dissertations

Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. But, standard programming techniques are not well adapted to this change in computer architecture design.

In this work, we study the use of dynamic runtime environments executing data driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability and performance. We demonstrate productivity by …


Performance Controlled Power Optimization For Virtualized Internet Datacenters, Yefu Wang Aug 2011

Performance Controlled Power Optimization For Virtualized Internet Datacenters, Yefu Wang

Doctoral Dissertations

Modern data centers must provide performance assurance for complex system software such as web applications. In addition, the power consumption of data centers needs to be minimized to reduce operating costs and avoid system overheating. In recent years, more and more data centers start to adopt server virtualization strategies for resource sharing to reduce hardware and operating costs by consolidating applications previously running on multiple physical servers onto a single physical server. In this dissertation, several power efficient algorithms are proposed to effectively reduce server power consumption while achieving the required application-level performance for virtualized servers.

First, at the server …