Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2006

Computer Sciences

University of Nebraska - Lincoln

CEFT

Full-Text Articles in Physical Sciences and Mathematics

Ceft: A Cost-Effective, Fault-Tolerant Parallel Virtual File System, Yifeng Zhu, Hong Jiang Feb 2006

School of Computing: Faculty Publications

The vulnerability of computer nodes to component failures is a critical issue for cluster-based file systems. This paper studies the development and deployment of mirroring in cluster-based parallel virtual file systems to provide fault tolerance, and analyzes the tradeoff between performance and reliability in the mirroring scheme. It presents the design and implementation of CEFT, a scalable RAID-10 style file system based on PVFS, and proposes four novel mirroring protocols, distinguished by whether the mirroring operations are server-driven or client-driven and whether they are asynchronous or synchronous. The comparisons of their write performance, measured in a real cluster, …


Exploiting Redundancy To Boost Performance In A Raid-10 Style Cluster-Based File System, Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng, David Swanson Jan 2006

School of Computing: Faculty Publications

While aggregating the throughput of existing disks on cluster nodes is a cost-effective way to alleviate the I/O bottleneck in cluster computing, this approach can suffer performance degradation due to contention for shared resources on the same node between storage data processing and user task computation. This paper proposes to judiciously use the storage redundancy, in the form of mirroring, that exists in a RAID-10 style file system to alleviate this performance degradation. More specifically, a heuristic scheduling algorithm is developed, motivated by observations of a simple cluster configuration, to spatially schedule write operations on the nodes with less …