Full-Text Articles in Engineering
Scalable Data Structure To Compress Next-Generation Sequencing Files And Its Application To Compressive Genomics, Sandino Vargas-Perez, Fahad Saeed
Parallel Computing and Data Science Lab Technical Reports
It is now possible to compress and decompress large-scale Next-Generation Sequencing (NGS) files by taking advantage of high-performance computing techniques. To this end, we recently introduced a scalable hybrid parallel algorithm, phyNGSC, which enables fast compression and decompression of large FASTQ datasets using distributed- and shared-memory programming models via MPI and OpenMP. In this paper, we present the design and implementation of a novel parallel data structure that lessens the dependency on decompression and facilitates handling DNA sequences in their compressed state, using fine-grained decompression in a technique that is identified as in …
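The idea of fine-grained decompression can be illustrated with a minimal sketch. This is not the phyNGSC format or the authors' data structure: it assumes a simple block layout, uses zlib as a stand-in compressor, runs serially rather than with MPI or OpenMP, and the names `compress_blocks`, `fetch_record`, and `records_per_block` are hypothetical. It shows only how a per-block index lets a reader decompress the one block that holds a requested FASTQ record instead of the whole file.

```python
import zlib

RECORD_LINES = 4  # a FASTQ record spans four lines: header, bases, '+', qualities


def compress_blocks(fastq_text, records_per_block=2):
    """Compress FASTQ records in independent blocks and build a block index.

    Hypothetical layout for illustration: each block is a separate zlib
    stream, and the index stores (byte offset, compressed length) per block.
    """
    lines = fastq_text.strip().split("\n")
    records = ["\n".join(lines[i:i + RECORD_LINES])
               for i in range(0, len(lines), RECORD_LINES)]
    payload = bytearray()
    index = []
    for start in range(0, len(records), records_per_block):
        block = "\n".join(records[start:start + records_per_block]).encode("ascii")
        compressed = zlib.compress(block)
        index.append((len(payload), len(compressed)))
        payload += compressed
    return bytes(payload), index


def fetch_record(payload, index, record_no, records_per_block=2):
    """Return one record (as four lines), decompressing only its block."""
    offset, length = index[record_no // records_per_block]
    block = zlib.decompress(payload[offset:offset + length]).decode("ascii")
    lines = block.split("\n")
    first = (record_no % records_per_block) * RECORD_LINES
    return lines[first:first + RECORD_LINES]


# Made-up sample data: three short reads.
SAMPLE = "@r0\nACGT\n+\nIIII\n@r1\nTTGA\n+\nHHHH\n@r2\nGGCC\n+\nFFFF\n"
payload, index = compress_blocks(SAMPLE)
record = fetch_record(payload, index, 2)
```

Because each block is an independent stream, blocks could in principle be compressed or decompressed by different processes or threads, which is the kind of property a hybrid parallel design would rely on; how the actual data structure achieves this is not specified in the abstract.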