Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

University of Wollongong

Faculty of Informatics - Papers (Archive)

Series

2013

Systems

Full-Text Articles in Entire DC Network

A Novel Approach To Data Deduplication Over The Engineering-Oriented Cloud Systems, Zhe Sun, Jun Shen, Jianming Young, Jan 2013

Faculty of Informatics - Papers (Archive)

This paper presents a duplication-free storage system for engineering-oriented cloud computing platforms. Our deduplication storage system, which manages both data and duplicates across the cloud, consists of two major components: a front-end deduplication application and a back-end mass storage system. The Hadoop Distributed File System (HDFS) is a common distributed file system on the cloud and is used together with the Hadoop database (HBase). We use HDFS to build the mass storage system and HBase to build a fast indexing system. Together with the deduplication application, these allow a scalable, parallel deduplicated cloud storage system to be built effectively. …
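The two-component design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes file-level deduplication keyed by a SHA-256 content fingerprint, which the abstract does not specify, and it uses plain in-memory dictionaries as stand-ins for the HBase index and the HDFS storage layer. The class name `DedupStore` and all of its methods are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store (illustrative only).

    `index` stands in for a fast index such as an HBase table,
    `blobs` stands in for mass storage such as HDFS.
    """

    def __init__(self):
        self.index = {}  # file name -> content fingerprint
        self.blobs = {}  # fingerprint -> stored bytes
        self.refs = {}   # fingerprint -> number of names referencing it

    def put(self, name, data):
        """Store data under name; return True only if new bytes were written."""
        # Assumption: SHA-256 over the whole file identifies duplicates.
        fp = hashlib.sha256(data).hexdigest()
        old = self.index.get(name)
        if old == fp:
            return False  # same name, same content: nothing to do
        if old is not None:
            self._release(old)  # name is being overwritten
        is_new = fp not in self.blobs
        if is_new:
            self.blobs[fp] = data  # first copy of this content
        self.refs[fp] = self.refs.get(fp, 0) + 1
        self.index[name] = fp
        return is_new

    def get(self, name):
        """Return the bytes stored under name."""
        return self.blobs[self.index[name]]

    def delete(self, name):
        """Remove name; free the bytes once no name references them."""
        self._release(self.index.pop(name))

    def _release(self, fp):
        self.refs[fp] -= 1
        if self.refs[fp] == 0:
            del self.refs[fp]
            del self.blobs[fp]
```

In this sketch, storing the same bytes under two names writes them once and records two index entries; the bytes are freed only when the last name is deleted. A real deployment would replace the dictionaries with HBase and HDFS client calls and would need to handle concurrent writers, which this sketch ignores.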