Open Access. Powered by Scholars. Published by Universities.®

Computer and Systems Architecture Commons

Masters Theses

Numerical Analysis and Scientific Computing

Full-Text Articles in Computer and Systems Architecture

Programming Dense Linear Algebra Kernels On Vectorized Architectures, Jonathan Lawrence Peyton May 2013

The high performance computing (HPC) community is obsessed with the general matrix-matrix multiply (GEMM) routine, and not without reason. Most, if not all, Level 3 Basic Linear Algebra Subprograms (BLAS) routines can be written in terms of GEMM, and the performance of many higher-level linear algebra solvers (e.g., LU, Cholesky) depends on GEMM's performance. Achieving high performance on GEMM is highly architecture dependent, so for each new architecture GEMM has to be programmed and tested to achieve maximal performance. Also, with emergent computer architectures featuring more vector-based and multi- to many-core processors, GEMM performance …