Full-Text Articles in Other Computer Engineering
Large Scale Processing And Storage Solution: Dna Safeguard Project, Eric Copp
Eric Copp
The DNA Safeguard project involves processing DNA sequence data in order to find nullomer sequences (non-existent short DNA sequences). While the fundamental algorithm for finding nullomer sequences is simple, it is complicated by the amount of data that must be handled. Four methods for handling terabytes of data are investigated: a single instance of a MySQL database, PVFS (Parallel Virtual File System), Hadoop, and a custom MPI (Message Passing Interface) program.
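
To illustrate why the fundamental algorithm is described as simple, here is a minimal in-memory sketch in Python. It is not the project's implementation: the function name `find_nullomers`, the choice to scan only the forward strand, and the rule of skipping windows that contain characters other than A, C, G, T are all assumptions made for this example.

```python
from itertools import product


def find_nullomers(sequences, k):
    """Return the sorted length-k DNA strings that occur in none of the sequences.

    Hypothetical helper for illustration; only the forward strand is scanned.
    """
    alphabet = "ACGT"
    observed = set()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: windows with ambiguous bases (e.g. N) are ignored.
            if all(base in alphabet for base in kmer):
                observed.add(kmer)
    # Every possible k-mer minus the observed ones gives the nullomers.
    all_kmers = {"".join(p) for p in product(alphabet, repeat=k)}
    return sorted(all_kmers - observed)


if __name__ == "__main__":
    # Toy input; real inputs would be terabytes of sequence data.
    print(find_nullomers(["ACGT"], 2))
```

This sketch holds every observed k-mer in memory, and the number of candidate k-mers grows as 4^k, so it is workable only for small inputs. That limitation is the reason the amount of data, rather than the algorithm itself, is the hard part of the problem.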