Open Access. Powered by Scholars. Published by Universities.®

Systems Architecture Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 3 of 3

Full-Text Articles in Systems Architecture

Optimizing Collective Communication For Scalable Scientific Computing And Deep Learning, Jiali Li Aug 2023

Optimizing Collective Communication For Scalable Scientific Computing And Deep Learning, Jiali Li

Doctoral Dissertations

In the realm of distributed computing, collective operations involve coordinated communication and synchronization among multiple processing units, enabling efficient data exchange and collaboration. Scientific applications, such as simulations, computational fluid dynamics, and scalable deep learning, require complex computations that can be parallelized across multiple nodes in a distributed system. These applications often involve data-dependent communication patterns, where collective operations are critical for achieving high performance in data exchange. Optimizing collective operations for scientific applications and deep learning involves improving the algorithms, communication patterns, and data distribution strategies to minimize communication overhead and maximize computational efficiency.

Within the context of this …


Adaptive Parallelism For Coupled, Multithreaded Message-Passing Programs, Samuel K. Gutiérrez Dec 2018

Adaptive Parallelism For Coupled, Multithreaded Message-Passing Programs, Samuel K. Gutiérrez

Computer Science ETDs

Hybrid parallel programming models that combine message passing (MP) and shared- memory multithreading (MT) are becoming more popular, especially with applications requiring higher degrees of parallelism and scalability. Consequently, coupled parallel programs, those built via the integration of independently developed and optimized software libraries linked into a single application, increasingly comprise message-passing libraries with differing preferred degrees of threading, resulting in thread-level heterogeneity. Retroactively matching threading levels between independently developed and maintained libraries is difficult, and the challenge is exacerbated because contemporary middleware services provide only static scheduling policies over entire program executions, necessitating suboptimal, over-subscribed or under-subscribed, configurations. In …


Programming Models' Support For Heterogeneous Architecture, Wei Wu May 2017

Programming Models' Support For Heterogeneous Architecture, Wei Wu

Doctoral Dissertations

Accelerator-enhanced computing platforms have drawn a lot of attention due to their massive peak computational capacity. Heterogeneous systems equipped with accelerators such as GPUs have become the most prominent components of High Performance Computing (HPC) systems. Even at the node level the significant heterogeneity of CPU and GPU, i.e. hardware and memory space differences, leads to challenges for fully exploiting such complex architectures. Extending outside the node scope, only escalate such challenges.

Conventional programming models such as data- ow and message passing have been widely adopted in HPC communities. When moving towards heterogeneous systems, the lack of GPU integration causes …