Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Full-Text Articles in Entire DC Network

Data-Intensive Computing For Bioinformatics Using Virtualization Technologies And HPC Infrastructures, Pengfei Xuan, Dec 2011

All Theses

Bioinformatics applications often involve many computational components and massive data sets, which makes them very difficult to deploy on a single computing machine. In this thesis, we designed a data-intensive computing platform for bioinformatics applications using virtualization technologies and high performance computing (HPC) infrastructures. The platform follows a multi-tier architecture that seamlessly integrates the web user interface (presentation tier), scientific workflow (logic tier), and computing infrastructure (data/computing tier). We demonstrated our platform on two bioinformatics projects. First, we redesigned and deployed the cotton marker database (CMD) (http://www.cottonmarker.org), a centralized web portal in the cotton research community, using the …


Designing Reliable High-Performance Storage Systems For HPC Environments, Lucas Scott Hindman, Aug 2011

Boise State University Theses and Dissertations

Advances in processing capability have far outpaced advances in I/O throughput and latency. Storage systems based on distributed file systems help to address this performance discrepancy in high performance computing (HPC) environments; however, they can be difficult to deploy and challenging to maintain. This thesis explores the design considerations as well as the pitfalls faced when deploying high performance storage systems. It includes best practices in identifying system requirements, techniques for generating I/O profiles of applications, and recommendations for disk subsystem configuration and maintenance based upon a number of recent papers addressing latent sector and unrecoverable read errors.


Efficient Replica-Exchange Across Distributed Production Infrastructure, Abhinav S. Thota, Jan 2011

LSU Master's Theses

Replica-Exchange (RE) methods represent a class of algorithms that involve a large number of loosely coupled ensembles and are used to understand physical phenomena ranging from protein folding dynamics to binding affinity calculations. We develop a framework for RE that supports different replica pairing and coordination mechanisms and that can use a wide range of production cyberinfrastructure concurrently. Additionally, our framework uses a flexible pilot-job implementation, which enables effective resource allocation for multiple replicas. We characterize the performance of two different RE algorithms, synchronous and asynchronous, at unprecedented scales on production distributed infrastructure (TeraGrid and LONI). The synchronous RE …