Open Access. Powered by Scholars. Published by Universities.®

Systems Architecture Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 3 of 3

Full-Text Articles in Systems Architecture

Optimizing Collective Communication For Scalable Scientific Computing And Deep Learning, Jiali Li Aug 2023

Optimizing Collective Communication For Scalable Scientific Computing And Deep Learning, Jiali Li

Doctoral Dissertations

In the realm of distributed computing, collective operations involve coordinated communication and synchronization among multiple processing units, enabling efficient data exchange and collaboration. Scientific applications, such as simulations, computational fluid dynamics, and scalable deep learning, require complex computations that can be parallelized across multiple nodes in a distributed system. These applications often involve data-dependent communication patterns, where collective operations are critical for achieving high performance in data exchange. Optimizing collective operations for scientific applications and deep learning involves improving the algorithms, communication patterns, and data distribution strategies to minimize communication overhead and maximize computational efficiency.

Within the context of this …


Programming Models' Support For Heterogeneous Architecture, Wei Wu May 2017

Programming Models' Support For Heterogeneous Architecture, Wei Wu

Doctoral Dissertations

Accelerator-enhanced computing platforms have drawn a lot of attention due to their massive peak computational capacity. Heterogeneous systems equipped with accelerators such as GPUs have become the most prominent components of High Performance Computing (HPC) systems. Even at the node level the significant heterogeneity of CPU and GPU, i.e. hardware and memory space differences, leads to challenges for fully exploiting such complex architectures. Extending outside the node scope, only escalate such challenges.

Conventional programming models such as data- ow and message passing have been widely adopted in HPC communities. When moving towards heterogeneous systems, the lack of GPU integration causes …


Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim Yarkhan Dec 2012

Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim Yarkhan

Doctoral Dissertations

Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. But, standard programming techniques are not well adapted to this change in computer architecture design.

In this work, we study the use of dynamic runtime environments executing data driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability and performance. We demonstrate productivity by …