Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Applied Mathematics

Doctoral Dissertations

High-performance computing

Full-Text Articles in Physical Sciences and Mathematics

Information Metrics For Predictive Modeling And Machine Learning, Kostantinos Gourgoulias Jul 2017

Doctoral Dissertations

The ever-increasing complexity of the models used in predictive modeling and data science, together with their use for prediction and inference, has made the development of tools for uncertainty quantification and model selection especially important. In this work, we seek to understand the various trade-offs associated with the simulation of stochastic systems. Some trade-offs are computational, e.g., the execution time of an algorithm versus the accuracy of the simulation. Others are analytical: whether we can find tractable substitutes for quantities of interest, e.g., distributions, ergodic averages, etc. The first two chapters of this thesis deal with the study of the …


Near-Optimal Scheduling And Decision-Making Models For Reactive And Proactive Fault Tolerance Mechanisms, Nichamon Naksinehaboon Apr 2012

Doctoral Dissertations

As High Performance Computing (HPC) systems grow in size to meet the demand for computational power, the likelihood of failures increases dramatically, resulting in potentially large amounts of lost computing time. Fault Tolerance (FT) mechanisms aim to mitigate the impact of failures on running applications. However, the overhead of FT mechanisms grows in proportion to the size of the HPC system. The challenge, therefore, is to manage the expensive overhead of FT mechanisms while minimizing the computing time lost to failures.

In this dissertation, a near-optimal scheduling model is built to determine when to invoke a hybrid checkpoint …