Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences

PDF

Purdue University

Open Access Theses

2016

High performance computing

Articles 1 - 2 of 2

Full-Text Articles in Physical Sciences and Mathematics

A Small-Scale Testbed For Large-Scale Reliable Computing, Jason R. St. John Dec 2016

A Small-Scale Testbed For Large-Scale Reliable Computing, Jason R. St. John

Open Access Theses

High performance computing (HPC) systems frequently suffer errors and failures from hardware components that negatively impact the performance of jobs run on these systems. We analyzed system logs from two HPC systems at Purdue University and created statistical models for memory and hard disk errors. We created a small-scale error injection testbed—using a customized QEMU build, libvirt, and Python—for HPC application programmers to test and debug their programs in a faulty environment so that programmers can write more robust and resilient programs before deploying them on an actual HPC system. The deliverables for this project are the fault injection program, …


User-Centric Workload Analytics: Towards Better Cluster Management, Suhas Raveesh Javagal Apr 2016

User-Centric Workload Analytics: Towards Better Cluster Management, Suhas Raveesh Javagal

Open Access Theses

Effective management of computing clusters and providing a high quality customer support is not a trivial task. Due to rise of community clusters there is an increase in the diversity of workloads and the user demographic. Owing to this and privacy concerns of the user, it is difficult to identify performance issues, reduce resource wastage and understand implicit user demands. In this thesis, we perform in-depth analysis of user behavior, performance issues, resource usage patterns and failures in the workloads collected from a university-wide community cluster and two clusters maintained by a government lab. We also introduce a set of …