Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 2 of 2
Full-Text Articles in Other Computer Engineering
Large Scale Processing And Storage Solution: DNA Safeguard Project, Eric Copp
Eric Copp
The DNA Safeguard project involves processing DNA sequence data in order to find nullomer sequences (short DNA sequences that do not occur in the data). While the fundamental algorithm for finding nullomer sequences is simple, it is complicated by the amount of data that must be handled. Four methods for handling terabytes of data are investigated: a single instance of a MySQL database, PVFS (Parallel Virtual File System), Hadoop, and a custom MPI (Message Passing Interface) program.
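As an illustration of why the fundamental algorithm is considered simple, here is a minimal in-memory sketch in Python. It is not the project's implementation: the function name `find_nullomers`, the parameter `k`, and the choice to skip windows containing characters other than A, C, G, and T are assumptions made for this example, and reverse complements are not considered.

```python
from itertools import product


def find_nullomers(sequences, k):
    """Return the sorted k-mers over A, C, G, T that occur in none of the sequences.

    Hypothetical helper for illustration only; it holds every observed
    k-mer in memory, so it does not scale to terabytes of input.
    """
    alphabet = set("ACGT")
    observed = set()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of length k across the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: skip windows with ambiguous bases such as N.
            if set(kmer) <= alphabet:
                observed.add(kmer)
    # Every possible k-mer that was never observed is a nullomer.
    all_kmers = {"".join(p) for p in product("ACGT", repeat=k)}
    return sorted(all_kmers - observed)
```

The set of candidate k-mers grows as 4^k and the input must be scanned in full, which is where the storage and processing approaches listed in the abstract come in.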
CPLOP - Cal Poly's Library Of Pyroprints, Kevin Webb
Computer Engineering
California Polytechnic Library of Pyroprints (CPLOP) is a web-driven database application that stores data from the Biology Department's E. coli pyrosequencing project. Some of this data was stored in Excel spreadsheets, while data from the pyrosequencing machines was stored as an unorganized collection of .xml files. There was no useful way to organize and store the massive amounts of data from multiple file sources in one location, nor to perform the complicated searches and comparisons that the project requires. CPLOP's primary goal is to store such data in three organized tables that relate to one another. It was …