Full-Text Articles in Operations Research, Systems Engineering and Industrial Engineering
Modelling Supercomputer Maintenance Interrupts: Maintenance Policy Recommendations, Jagadish Cherukuri
A supercomputer is a repairable system with a large number of compute nodes interconnected to work in harmony to achieve superior computational performance. The reliability of such a complex system depends on an effective maintenance strategy that involves both emergency and preventive maintenance. This thesis analyzes the maintenance records of four supercomputers operational at The National Institute of Computational Science located at Oak Ridge National Laboratory. We propose to use the generalized proportional intensities model (GPIM) to model the maintenance interrupts, as it can capture both the reliability parameters and maintenance parameters and allows the inclusion of both emergency and preventive maintenance ...
A New Stochastic Model For Systems Under General Repairs, Haitao Liao
Faculty Publications and Other Works -- Industrial & Information Engineering
Numerous stochastic models for repairable systems have been developed by assuming different time trends and repair effects. In this paper, a new general repair model based on the repair history is presented. Unlike the existing models, the closed-form solutions of the reliability metrics can be derived analytically by solving a set of differential equations. Consequently, the confidence bounds of these metrics can be easily estimated. The proposed model, as well as the estimation approach, overcomes the drawbacks of the existing models. The practical use of the proposed model is demonstrated by a much-discussed set of data ...