Computer and Systems Architecture Commons


32 Institutions 328 Full-Text Articles 362 Authors 77,757 Downloads

Recent Articles in Computer and Systems Architecture

Dynamic Translation Of Runtime Environments For Heterogeneous Computing, Rodrigo Dominguez, Northeastern University

Computer Engineering Dissertations

The recent move toward heterogeneous computer architectures calls for a global rethinking of current software and hardware paradigms. Researchers are exploring new parallel programming models, advanced compiler designs, and novel resource management techniques to exploit the features of many-core processor architectures. Graphics Processing Units (GPUs) have become the platform of choice in this area for accelerating a large range of data-parallel and task-parallel applications. The rapid adoption of GPU computing has been greatly aided by the introduction of high-level programming environments such as CUDA C and OpenCL. However, each vendor implements these programming models differently and we must analyze the ...


Hypothesis Margin Based Weighting For Feature Selection Using Boosting: Theory, Algorithms And Applications, Malak Alshawabkeh, Northeastern University

Computer Engineering Dissertations

Feature selection (FS) is a preprocessing step aimed at identifying a small subset of highly predictive features out of a large set of raw input variables that are possibly irrelevant or redundant. It plays a fundamental role in the success of many learning tasks where high dimensionality arises as a major challenge. Many attempts have been made to cope with this problem, and various outstanding feature selection methods have been proposed. Recently, there has been a growing line of research in utilizing the concept of hypothesis margins to measure the quality of a set of features. However, most previous feature selection ...


An FPTAS For Total Weighted Earliness Tardiness Problem With Constant Number Of Distinct Due Dates And Polynomially Related Weights, Jingjing Huang, McMaster University

Open Access Dissertations and Theses

We are given a sequence of jobs on a single machine, and each job has a weight, processing time and a due date. A job is early when it finishes before or on its due date and its earliness is the amount of time between its completion time and its due date. A job is tardy when it finishes after its due date and its tardiness is the amount of time between its due date and its completion time. The TWET problem is to find a schedule which minimizes the total weighted earliness and tardiness. We are focusing on the ...
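The definitions above can be made concrete with a small worked example. The sketch below only evaluates the TWET objective for a given job order and, for tiny instances, finds the best order by exhaustive search; it is not the FPTAS from the thesis. The names `Job`, `twet_cost`, and `best_order_bruteforce` are hypothetical, and the sketch assumes jobs run back to back from time 0 with no idle time and that each job's single weight applies to both its earliness and its tardiness.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Job:
    weight: float
    processing_time: float
    due_date: float


def twet_cost(order):
    """Total weighted earliness and tardiness of jobs run in `order`.

    Assumption: jobs run back to back on one machine starting at time 0.
    """
    t = 0.0
    total = 0.0
    for job in order:
        t += job.processing_time                 # completion time of this job
        earliness = max(0.0, job.due_date - t)   # finished before the due date
        tardiness = max(0.0, t - job.due_date)   # finished after the due date
        total += job.weight * (earliness + tardiness)
    return total


def best_order_bruteforce(jobs):
    """Try every permutation; only practical for a handful of jobs."""
    return min(permutations(jobs), key=twet_cost)


a, b, c = Job(2, 3, 5), Job(1, 2, 5), Job(3, 4, 8)
print(twet_cost([a, b, c]))                         # 7.0
print(twet_cost(best_order_bruteforce([a, b, c])))  # 6.0 (order b, a, c)
```

In the order a, b, c, job a finishes at time 3 (two units early, cost 4), job b finishes exactly on its due date, and job c finishes at time 9 (one unit late, cost 3), giving 7. Running b first lets a finish exactly on time and lowers the total to 6.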


A Comprehensive HDL Model Of A Line Associative Register Based Architecture, Matthew A. Sparks, University of Kentucky

Theses and Dissertations--Electrical and Computer Engineering

Modern processor architectures suffer from an ever-increasing gap between processor and memory performance. The current memory-register model attempts to hide this gap with a system of cache memory. Line Associative Registers (LARs) are proposed as a new system to avoid the memory gap by pre-fetching and associative updating of both instructions and data. This thesis presents a fully LAR-based architecture, targeting a previously developed instruction set architecture. This architecture features an execution pipeline supporting SWAR operations, and a memory system supporting the associative behavior of LARs and lazy writeback to memory.


Power-Efficient And Low-Latency Memory Access For CMP Systems With Heterogeneous Scratchpad On-Chip Memory, Zhi Chen, University of Kentucky

Theses and Dissertations--Electrical and Computer Engineering

The gradually widening speed disparity between CPU and memory has become an overwhelming bottleneck for the development of Chip Multiprocessor (CMP) systems. In addition, increasing penalties caused by frequent on-chip memory accesses have raised critical challenges in delivering high memory access performance with tight power and latency budgets. To overcome the daunting memory wall and energy wall issues, this thesis proposes a new heterogeneous scratchpad memory architecture configured from SRAM, MRAM, and Z-RAM. Based on this architecture, we propose two algorithms, a dynamic programming algorithm and a genetic algorithm, to perform data allocation to different memory ...


FPGA-Based Implementation Of Dual-Frequency Pattern Scheme For 3-D Shape Measurement, Brent Bondehagen, University of Kentucky

Theses and Dissertations--Electrical and Computer Engineering

Structured Light Illumination (SLI) is a process in which spatially varied patterns are projected onto a 3-D surface; from the distortion introduced by the surface topology, phase information can be calculated and a 3-D model constructed. Phase Measuring Profilometry (PMP) is a particular type of SLI that requires three or more temporally multiplexed patterns. High speed PMP attempts to scan moving objects whose motion is small so as to have little impact on the 3-D model. Given that practically all machine vision cameras and high speed cameras employ a Field Programmable Gate Array (FPGA) interface directly to the image sensors ...


Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim YarKhan, University of Tennessee, Knoxville

Doctoral Dissertations

Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. However, standard programming techniques are not well adapted to this change in computer architecture design.

In this work, we study the use of dynamic runtime environments executing data driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability and performance. We demonstrate productivity by ...


Parallel For Loops On Heterogeneous Resources, Frederick Edward Weber, University of Tennessee, Knoxville

Doctoral Dissertations

In recent years, Graphics Processing Units (GPUs) have piqued the interest of researchers in scientific computing. Their immense floating point throughput and massive parallelism make them ideal for not just graphical applications, but many general algorithms as well. Load balancing applications and taking advantage of all computational resources in a machine is a difficult challenge, especially when the resources are heterogeneous. This dissertation presents the clUtil library, which vastly simplifies developing OpenCL applications for heterogeneous systems. The core focus of this dissertation lies in clUtil's ParallelFor construct and our novel PINA scheduler which can efficiently load balance work onto ...


Kernel-Assisted And Topology-Aware MPI Collective Communication Among Multicore Or Many-Core Clusters, Teng Ma, University of Tennessee, Knoxville

Doctoral Dissertations

Multicore or many-core clusters have become the most prominent form of High Performance Computing (HPC) systems. Hardware complexity and hierarchies exist not only in the inter-node layer, i.e., hierarchical networks, but also in the internals of multicore compute nodes, e.g., Non-Uniform Memory Access (NUMA), network-style interconnects, and memory and shared cache hierarchies.

The Message Passing Interface (MPI), the most widely adopted programming model in the HPC community, suffers from decreased performance and portability due to increased hardware complexity at multiple levels. We identified three critical issues specific to collective communication: The first problem arises from the gap between logical collective ...


Exploring Computational Chemistry On Emerging Architectures, David Dewayne Jenkins, University of Tennessee, Knoxville

Doctoral Dissertations

Emerging architectures, such as next generation microprocessors, graphics processing units, and Intel MIC cards, are being used with increased popularity in high performance computing. Each of these architectures has advantages over previous generations of architectures including performance, programmability, and power efficiency. With the ever-increasing performance of these architectures, scientific computing applications are able to attack larger, more complicated problems. However, since applications perform differently on each of the architectures, it is difficult to determine the best tool for the job. This dissertation makes the following contributions to computer engineering and computational science. First, this work implements the computational chemistry variational ...


Implementing A MATLAB Based Attitude Determination Algorithm In C Within The PolySat Software Architecture, Dominic Bertolino, California Polytechnic State University

Computer Engineering

This project focuses on one component within a complete attitude determination and control system (ADCS) for a small satellite. The work consists of porting the algorithm, developed by AERO students and team members, that determines the current attitude of the satellite. The original algorithm was developed in MATLAB, and it will be simulated and tested in MATLAB by the AERO team. The porting consisted of integrating the pieces into the custom PolySat software environment in C. Testing was done to verify that the ported component corresponded to the original MATLAB component and to verify its runtime on the PolySat ...


Decentralized Resource Scheduling In Grid/Cloud Computing, Ra'afat O. Abu-Rukba, Western University

University of Western Ontario - Electronic Thesis and Dissertation Repository

In the Grid/Cloud environment, applications or services and resources belong to different organizations with different objectives. Entities in the Grid/Cloud are autonomous and self-interested; however, they are willing to share their resources and services to achieve their individual and collective goals. In such an open environment, the scheduling decision is a challenge given the decentralized nature of the environment. Each entity has specific requirements and objectives that it needs to achieve. In this thesis, we review the Grid/Cloud computing technologies, environment characteristics and structure and indicate the challenges within the resource scheduling. We capture the Grid/Cloud scheduling model ...


Driver Lane Change Intention Recognition By Using Entropy-Based Fusion Techniques And Support Vector Machine Learning Strategy, Xianyi Huang, Northeastern University

Mechanical Engineering Master's Theses

In this thesis, we focus on the analysis of driver lane-changing behavior, because lane changing is a ubiquitous maneuver in common driving environments and is regarded as the most critical driving intention. Lane changing is therefore introduced as a case study for driving intention recognition. Our methodology is to apply a machine learning method, the support vector machine, to the classification of driving intentions using vehicle performance data and driver eye gaze data measured during driving tasks (i.e., lane following and lane changing) in a well-designed simulation environment. To ...


Learning From Imperfect And Related Labels, Yan Yan, Northeastern University

Computer Engineering Dissertations

In supervised learning, a teacher provides labels or target information for given data samples, and the goal is to predict the labels of new or unseen instances. In general, these teachers/labelers may make mistakes. In this thesis, we discuss two "imperfect label" scenarios in supervised learning: (1) multi-label classification and (2) multiple-annotator learning.

Unlike standard classification problems, in multi-label classification, a sample can be assigned to more than one class. Typically, multi-label classifiers are built from multiple binary classifiers, one for each class, trained independently. Here, we propose to take advantage of the information that can be derived from ...
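The typical baseline described above, one independent binary classifier per class, can be sketched as follows. This illustrates only that baseline (often called binary relevance), not the method proposed in the thesis. The nearest-centroid classifier and all function names are assumptions chosen to keep the example self-contained, and the sketch assumes every label has at least one positive and one negative training sample.

```python
def mean(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_binary(X, y):
    """Nearest-centroid binary model: (positive centroid, negative centroid)."""
    pos = mean([x for x, t in zip(X, y) if t == 1])
    neg = mean([x for x, t in zip(X, y) if t == 0])
    return pos, neg


def predict_binary(model, x):
    pos, neg = model
    return 1 if sq_dist(x, pos) < sq_dist(x, neg) else 0


def fit_binary_relevance(X, Y):
    """Train one binary model per label column of Y, each independently."""
    n_labels = len(Y[0])
    return [fit_binary(X, [row[k] for row in Y]) for k in range(n_labels)]


def predict_binary_relevance(models, x):
    """A sample may receive any number of labels, including none."""
    return [predict_binary(m, x) for m in models]


# Toy data: label 0 follows feature 0, label 1 follows feature 1.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
models = fit_binary_relevance(X, Y)
print(predict_binary_relevance(models, [0.9, 0.1]))  # [1, 0]
print(predict_binary_relevance(models, [0.9, 0.8]))  # [1, 1]
```

Because each label is modeled in isolation, this baseline ignores any relationship between labels, which is the kind of information the abstract suggests can be exploited.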


Personal Smart Assistant For Digital Media And Advertisement, Ali Hussain, Western University

University of Western Ontario - Electronic Thesis and Dissertation Repository

The expansion of cyberspace and the enormous progress in computing and software applications have enabled technology to cover every aspect of our lives; as a result, many of our goals are now technology driven. Consequently, the need for intelligent assistance to achieve these goals has increased. However, for this assistance to be beneficial for users, it should be targeted to them based on their needs and preferences. Intelligent software agents have been recognized as a promising approach for the development of user-centric, personalized applications.

In this thesis a generic personal smart assistant agent is proposed that provides relevant assistance to the user ...


Parallel Image Processing For High Content Screening Data, Tamnun-E-Mursalin, McMaster University

Open Access Dissertations and Theses

High-content screening (HCS) produces an immense amount of data, often on the scale of terabytes. This requires considerable processing power, resulting in long analysis times. As a result, HCS with a single-core processor system is an inefficient option because it takes a huge amount of time, storage and processing power. The situation is even worse because most image processing software is developed in high-level languages that make customization, flexibility and multi-processing features very challenging. Therefore, the goal of the project is to develop a multithreading model in the C language. This model will be used to extract subcellular localization ...


An Environment To Support GPU And Multicore Programming For Rapid, High Performance, Application Deployment, James Brock, Northeastern University

Electrical Engineering Dissertations

Homogeneous multicore processors, heterogeneous multicore processors, high performance accelerators, and other heterogeneous architectures have significant computing potential over traditional single core processors. Computer systems composed of these specialized processing elements are increasingly common. Due to the increased complexity of these architectures, programming them has become increasingly complex and error prone. Each of these architectures has different memory systems, programming languages and development environments. This has driven the need for portable programming APIs and tools that allow developers to easily exploit all of the computational power of these platforms and effortlessly move their programs between different computing systems. To deal with ...


Amaethon - A Web Application For Farm Management And An Assessment Of Its Utility, Tyler Yero, California Polytechnic State University

Master's Theses and Project Reports

Amaethon is a web application that is designed for enterprise farm management. It takes a job typically performed with spreadsheets, paper, or custom software and puts it on the web. Farm administration personnel may use it to schedule farm operations and manage their resources and equipment. A survey was conducted to assess Amaethon’s user interface design. Participants in the survey were two groups of students and a small group of agriculture professionals. Among other results, the survey indicated that a calendar interface inside Amaethon was preferred, and statistically no less effective, than a map interface. This is despite ...


Low Cost Neurochairs, Frankie Pike, California Polytechnic State University

Master's Theses and Project Reports

Electroencephalography (EEG) was formerly confined to clinical and research settings, with the necessary hardware costing thousands of dollars. In the last five years a number of companies have produced simple electroencephalographs, priced below $300 and available direct to consumers. These have stirred the imaginations of enthusiasts and brought the prospects of "thought-controlled" devices ever closer to reality. While these new devices were largely targeted at video games and toys, active research on enabling people suffering from debilitating diseases to control wheelchairs was being pursued. A number of neurochairs have come to fruition offering a truly hands-free mobility solution, but whether ...


Spatiotemporal Capacity Management For The Last Level Caches Of Chip Multiprocessors, Dongyuan Zhan, University of Nebraska - Lincoln

Computer Science and Engineering: Theses, Dissertations, and Student Research

Judicious management of on-chip last-level caches (LLC) is critical to alleviating the memory wall of chip multiprocessors (CMP). Although there already exist many LLC management proposals, belonging to either the spatial or temporal dimension, they fail to capture and utilize the inherent interplays between the two dimensions in capacity management. Therefore, this dissertation is targeted at exploring and exploiting the spatiotemporal interactions in LLC capacity management to improve CMPs' performance. Based on this general idea, we address four specific research problems in the dissertation.

For the private LLC organization, prior-art proposals can improve the efficacy of inter-core cooperative caching at ...