Open Access. Powered by Scholars. Published by Universities.®
Programming Languages and Compilers Commons™
- Keyword
- HPC (2)
- Cholesky factorization (1)
- Data redistribution (1)
- Dynamic runtime system (1)
- GPU (1)
- GPUs (1)
- Heterogeneous computing (1)
- Heterogeneous system (1)
- Linear algebra (1)
- Loop optimization (1)
- Low-rank approximations (1)
- MPI (1)
- Mixed-precision (1)
- Multi-core (1)
- PaRSEC (1)
- Parallel (1)
- Parallel for (1)
- Programming Model (1)
- Task Runtime (1)
- Task based runtime system (1)
- Task-based programming model (1)
Articles 1 - 4 of 4
Full-Text Articles in Programming Languages and Compilers
Evaluation Of Distributed Programming Models And Extensions To Task-Based Runtime Systems, Yu Pei
Doctoral Dissertations
High Performance Computing (HPC) has always been a key foundation for scientific simulation and discovery. More recently, the training of deep learning models has further accelerated the demand for computational power and lower-precision arithmetic. In this era following the end of Dennard scaling, and with Moore's Law seemingly still holding true to a lesser extent, it is no coincidence that HPC systems are equipped with multi-core CPUs and a variety of hardware accelerators that are all massively parallel. Coupling this with improvements in interconnect network speed lagging behind increases in computational power, the current state of HPC systems is …
Task-Based Runtime Optimizations Towards High Performance Computing Applications, Qinglei Cao
Doctoral Dissertations
The last decades have witnessed a rapid improvement of computational capabilities in high-performance computing (HPC) platforms thanks to hardware technology scaling. HPC architectures benefit from mainstream hardware advances, with many-core systems, deep hierarchical memory subsystems, non-uniform memory access, and an ever-increasing gap between computational power and memory bandwidth. This has necessitated continuous adaptations across the software stack to maintain high hardware utilization. In this HPC landscape of potentially million-way parallelism, task-based programming models associated with dynamic runtime systems are becoming more popular, fostering developer productivity at extreme scale by abstracting the underlying hardware complexity.
In this context, …
Programming Models' Support For Heterogeneous Architecture, Wei Wu
Doctoral Dissertations
Accelerator-enhanced computing platforms have drawn a lot of attention due to their massive peak computational capacity. Heterogeneous systems equipped with accelerators such as GPUs have become the most prominent components of High Performance Computing (HPC) systems. Even at the node level, the significant heterogeneity of CPU and GPU, i.e., hardware and memory space differences, leads to challenges in fully exploiting such complex architectures. Extending beyond the node scope only escalates such challenges.
Conventional programming models such as dataflow and message passing have been widely adopted in HPC communities. When moving towards heterogeneous systems, the lack of GPU integration causes …
Parallel For Loops On Heterogeneous Resources, Frederick Edward Weber
Doctoral Dissertations
In recent years, Graphics Processing Units (GPUs) have piqued the interest of researchers in scientific computing. Their immense floating-point throughput and massive parallelism make them ideal not just for graphical applications, but for many general algorithms as well. Load balancing applications and taking advantage of all computational resources in a machine are difficult challenges, especially when the resources are heterogeneous. This dissertation presents the clUtil library, which vastly simplifies developing OpenCL applications for heterogeneous systems. The core focus of this dissertation lies in clUtil's ParallelFor construct and our novel PINA scheduler, which can efficiently load balance work onto multiple …