Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Keyword
- CEFT (2)
- Cluster computing (2)
- PVFS (2)
- Algorithm (1)
- Bloom filter (1)
- Cluster file systems (1)
- Data storage (1)
- Distributed file systems (1)
- File system management (1)
- File systems management (1)
- Large-scale File Systems (1)
- Linux clusters (1)
- Markov process (1)
- Memory management (1)
- Metadata (1)
- Metadata management (1)
- Namespace Management (1)
- Operating systems (1)
- Parallel I/O (1)
- Prefetch (1)
- RAID (1)
- Redundancy (1)
- Reliability analysis (1)
- Replacement algorithms (1)
- Semantic Sensitivity (1)
- Storage (1)
Articles 1 - 10 of 10
Full-Text Articles in Physical Sciences and Mathematics
CEFT: A Cost-Effective, Fault-Tolerant Parallel Virtual File System, Yifeng Zhu, Hong Jiang
Yifeng Zhu
The vulnerability of computer nodes to component failures is a critical issue for cluster-based file systems. This paper studies the development and deployment of mirroring in cluster-based parallel virtual file systems to provide fault tolerance and analyzes the tradeoffs between performance and reliability in the mirroring scheme. It presents the design and implementation of CEFT, a scalable RAID-10 style file system based on PVFS, and proposes four novel mirroring protocols, distinguished by whether the mirroring operations are server-driven or client-driven and whether they are asynchronous or synchronous. The comparisons of their write performances, measured in a real cluster, …
A Novel Weighted-Graph-Based Grouping Algorithm For Metadata Prefetching, Peng Gu, Jun Wang, Yifeng Zhu, Hong Jiang, Pengju Shang
Yifeng Zhu
Although data prefetching algorithms have been extensively studied for years, no comparable research has been done on metadata access performance. Existing data prefetching algorithms, which either lack emphasis on group prefetching or bear a high level of computational complexity, do not work well for metadata prefetching. Therefore, an efficient, accurate, and distributed metadata-oriented prefetching scheme is critical to improving the overall performance of large distributed storage systems. In this paper, we present a novel weighted-graph-based prefetching technique, built on both direct and indirect successor relationships, to reap performance benefits from prefetching specifically for clustered metadata servers, an arrangement envisioned …
RACE: A Robust Adaptive Caching Strategy For Buffer Cache, Yifeng Zhu, Hong Jiang
Yifeng Zhu
While many block replacement algorithms for buffer caches have been proposed to address the well-known drawbacks of the LRU algorithm, they are not robust and cannot maintain a consistent performance improvement over all workloads. This paper proposes a novel and simple replacement scheme, called RACE (Robust Adaptive buffer Cache management schemE), which differentiates the locality of I/O streams by actively detecting access patterns inherently exhibited in two correlated spaces: the discrete block space of program contexts from which I/O requests are issued and the continuous block space within files to which I/O requests are addressed. This scheme combines global I/O …
Rapport: Semantic-Sensitive Namespace Management In Large-Scale File Systems, Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng
Yifeng Zhu
Explosive growth in the volume and complexity of data exacerbates the key challenge of managing data effectively and efficiently in a way that fundamentally improves the ease and efficacy of their use. Existing large-scale file systems rely on a hierarchically structured namespace that leads to severe performance bottlenecks and renders it impossible to support real-time queries on multi-dimensional attributes. This paper proposes a novel semantic-sensitive scheme, called Rapport, to provide dynamic and adaptive namespace management and support complex queries. The basic idea is to build files’ namespace by utilizing their semantic correlation and exploiting dynamic evolution of attributes to support namespace management. …
Exploiting Redundancy To Boost Performance In A RAID-10 Style Cluster-Based File System, Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng, David Swanson
Yifeng Zhu
While aggregating the throughput of existing disks on cluster nodes is a cost-effective approach to alleviating the I/O bottleneck in cluster computing, this approach suffers from potential performance degradation due to contention for shared resources on the same node between storage data processing and user task computation. This paper proposes to judiciously utilize the storage redundancy, in the form of the mirroring that exists in a RAID-10 style file system, to alleviate this performance degradation. More specifically, a heuristic scheduling algorithm is developed, motivated by observations of a simple cluster configuration, to spatially schedule write operations on the nodes with less …
SmartStore: A New Metadata Organization Paradigm With Metadata Semantic-Awareness For Next-Generation File Systems, Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng, Lei Tian
Yifeng Zhu
Existing data storage systems based on hierarchical directory trees do not meet scalability and functionality requirements for exponentially growing datasets and increasingly complex metadata queries in large-scale file systems with billions of files and Exabytes of data. This paper proposes a novel decentralized semantic-aware metadata organization, called SmartStore, which exploits metadata semantics of files to judiciously aggregate correlated files into semantic-aware groups by using information retrieval tools. The decentralized design of SmartStore can improve system scalability and reduce query latency for both complex queries (including range and top-k queries), which is helpful for constructing semantic-aware caching, and conventional filename-based point …
Scalable And Adaptive Metadata Management In Ultra Large-Scale File Systems, Yu Hua, Yifeng Zhu, Hong Jiang
Yifeng Zhu
This paper presents a scalable and adaptive decentralized metadata lookup scheme for ultra large-scale file systems (≥ Petabytes or even Exabytes). Our scheme logically organizes metadata servers (MDSs) into a multi-layered query hierarchy and exploits grouped Bloom filters to efficiently route metadata requests to the desired MDSs through the hierarchy. This metadata lookup scheme can be executed at network or memory speed, without being bounded by the performance of slow disks. An effective workload balancing algorithm is also developed in this paper for server reconfigurations. This scheme is evaluated through extensive trace-driven simulations and a prototype implementation in Linux. Experimental results …
AMP: An Affinity-Based Metadata Prefetching Scheme In Large-Scale Distributed Storage Systems, Lin Li, Xuemin Li, Hong Jiang, Yifeng Zhu
Yifeng Zhu
Prefetching is an effective technique for improving file access performance, which can reduce access latency for I/O systems. In distributed storage systems, prefetching for metadata files is critical for the overall system performance. In this paper, an Affinity-based Metadata Prefetching (AMP) scheme is proposed for metadata servers in large-scale distributed storage systems to provide aggressive metadata prefetching. Through mining useful information about metadata accesses from past history, AMP can discover metadata file affinities accurately and intelligently for prefetching. Compared with LRU and some of the latest file prefetching algorithms such as NEXUS and C-miner, trace-driven simulations show that AMP can …
HBA: Distributed Metadata Management For Large Cluster-Based Storage Systems, Yifeng Zhu, Hong Jiang, Jun Wang, Feng Xian
Yifeng Zhu
An efficient and distributed scheme for file mapping or file lookup is critical in decentralizing metadata management within a group of metadata servers. This paper presents a novel technique called Hierarchical Bloom Filter Arrays (HBA) to map filenames to the metadata servers holding their metadata. Two levels of probabilistic arrays, namely, Bloom filter arrays with different levels of accuracy, are used on each metadata server. One array, with lower accuracy and representing the distribution of the entire metadata, trades accuracy for significantly reduced memory overhead, whereas the other array, with higher accuracy, caches partial distribution information and exploits the …
SmartStore: A New Metadata Organization Paradigm With Semantic-Awareness For Next-Generation File Systems, Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng, Lei Tian
Yifeng Zhu
Existing storage systems using hierarchical directory trees do not meet scalability and functionality requirements for exponentially growing datasets and increasingly complex queries in Exabyte-level systems with billions of files. This paper proposes a semantic-aware organization, called SmartStore, which exploits metadata semantics of files to judiciously aggregate correlated files into semantic-aware groups by using information retrieval tools. Its decentralized design improves system scalability and reduces query latency for complex queries (range and top-k queries), which is conducive to constructing semantic-aware caching, and conventional filename-based queries. SmartStore limits the search scope of a complex query to a single or a minimal number of semantically related groups …