Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Computer Sciences

Washington University in St. Louis

OpenCL

Articles 1 - 2 of 2

Full-Text Articles in Engineering

Domain Specific Computing In Tightly-Coupled Heterogeneous Systems, Anthony Michael Cabrera Aug 2020

McKelvey School of Engineering Theses & Dissertations

Over the past several decades, researchers and programmers across many disciplines have relied on Moore's law and Dennard scaling for increases in compute capability in modern processors. However, recent data suggest that the number of transistors per square inch on integrated circuits is losing pace with the projection of Moore's law due to the breakdown of Dennard scaling at smaller semiconductor process nodes. This has signaled the beginning of a new “golden age in computer architecture” in which the paradigm will be shifted from improving traditional processor performance for general tasks to architecting hardware that executes a class of applications in a …


Investigating Single Precision Floating General Matrix Multiply In Heterogeneous Hardware, Steven Harris Aug 2020

McKelvey School of Engineering Theses & Dissertations

The fundamental operation of matrix multiplication is ubiquitous across a myriad of disciplines. Yet, the identification of new optimizations for matrix multiplication remains relevant for emerging hardware architectures and heterogeneous systems. Frameworks such as OpenCL enable computation orchestration on existing systems, and OpenCL's availability through the Intel High Level Synthesis compiler allows users to architect new designs for reconfigurable hardware using C/C++. Using the HARPv2 as a vehicle for exploration, we investigate the utility of several of the most notable matrix multiplication optimizations to better understand the performance portability of OpenCL and the implications for such optimizations on this and …