Open Access. Powered by Scholars. Published by Universities.®

Computer and Systems Architecture Commons

Full-Text Articles in Computer and Systems Architecture

Automated Program Profiling And Analysis For Managing Heterogeneous Memory Systems, Adam Palmer Howard Dec 2017

Masters Theses

Many promising memory technologies, such as non-volatile, storage-class memories and high-bandwidth, on-chip RAMs, are beginning to emerge. Since each of these new technologies presents tradeoffs distinct from those of conventional DRAM, next-generation systems are likely to include multiple tiers of memory storage, each with its own type of device. To use the available hardware efficiently, such systems will need to alter their data management strategies to account for the performance and capabilities of each tier.

This work explores a variety of cross-layer strategies for managing application data in heterogeneous memory systems. We propose different program profiling-based techniques to automatically partition program allocation …


Programming Dense Linear Algebra Kernels On Vectorized Architectures, Jonathan Lawrence Peyton May 2013

Masters Theses

The high-performance computing (HPC) community is obsessed with the general matrix-matrix multiply (GEMM) routine. This obsession is not without reason. Most, if not all, Level 3 Basic Linear Algebra Subprograms (BLAS) can be written in terms of GEMM, and the performance of many higher-level linear algebra solvers (e.g., LU, Cholesky) depends on GEMM's performance. Achieving high GEMM performance is highly architecture-dependent, so for each new architecture that comes out, GEMM has to be programmed and tested to achieve maximal performance. Also, with emergent computer architectures featuring more vector-based and multi- to many-core processors, GEMM performance …