Open Access. Powered by Scholars. Published by Universities.®

Other Computer Engineering Commons

Full-Text Articles in Other Computer Engineering

Large Scale Processing And Storage Solution: Dna Safeguard Project, Eric Copp Oct 2012

Eric Copp

The DNA Safeguard project involves processing DNA sequence data in order to find nullomer sequences (short DNA sequences that do not occur in the data). While the fundamental algorithm for finding nullomer sequences is simple, it is complicated by the amount of data that must be handled. Four methods for handling terabytes of data are investigated: a single instance of a MySQL database, PVFS (Parallel Virtual File System), Hadoop, and a custom MPI (Message Passing Interface) program.
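
As a rough illustration of why the abstract calls the core algorithm simple, a nullomer search can be sketched as a set difference: enumerate every possible sequence of length k, then remove the ones observed in the input. The sketch below is an assumption-laden toy, not the project's implementation. The function name `find_nullomers`, the in-memory set, and the choice to scan only the given strand and skip windows with ambiguous bases are all hypothetical; at terabyte scale the observed set would not fit in memory on one machine, which is the problem the four investigated methods address.

```python
from itertools import product

BASES = "ACGT"

def find_nullomers(sequences, k):
    """Return the sorted length-k sequences over ACGT absent from all inputs.

    Hypothetical helper for illustration only: it scans just the given
    strand and holds every observed k-mer in memory.
    """
    observed = set()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Ignore windows containing ambiguous bases such as N.
            if set(kmer) <= set(BASES):
                observed.add(kmer)
    # All 4**k candidate k-mers, minus the ones that were seen.
    all_kmers = {"".join(p) for p in product(BASES, repeat=k)}
    return sorted(all_kmers - observed)
```

Calling `find_nullomers(["ACGT"], 2)` would scan the windows `AC`, `CG`, and `GT` and report the remaining 2-mers as absent. The candidate space grows as 4^k, so both k and the input size drive the storage cost.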


Cplop - Cal Poly's Library Of Pyroprints, Kevin Webb Dec 2011

Computer Engineering

California Polytechnic Library of Pyroprints (CPLOP) is a web-driven database application that stores data from the Biology Department’s E. coli pyrosequencing project. Some of this data was stored in Excel spreadsheets, while data from the pyrosequencing machines was stored as a loose collection of .xml files. There was no useful way to organize and store the massive amounts of data from multiple file sources in one location, nor to perform the complicated searches and comparisons that the project requires. CPLOP’s primary goal is to store such data in three organized tables that relate to one another. It was …