Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Theses/Dissertations

Computer Engineering

University of Kentucky

Theses and Dissertations--Computer Science

2013

Routing

Full-Text Articles in Engineering

Algorithms For Fault Tolerance In Distributed Systems And Routing In Ad Hoc Networks, Qiangfeng Jiang, Jan 2013

Theses and Dissertations--Computer Science

Checkpointing and rollback recovery are well-known techniques for coping with failures in distributed systems. Future-generation supercomputers will be message-passing distributed systems consisting of millions of processors. As the number of processors grows, the failure rate also grows. Designing efficient checkpointing and recovery algorithms is therefore important if such large systems are to be fully utilized. We present a novel communication-induced checkpointing algorithm that helps reduce contention for access to stable storage when storing checkpoints. Under our algorithm, a process involved in a distributed computation can independently initiate consistent global checkpointing by saving its …