Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Theory and Algorithms

University of Massachusetts Amherst

Mathematics and Statistics Department Faculty Publication Series

Full-Text Articles in Entire DC Network

Gemini: A Computationally-Efficient Search Engine For Large Gene Expression Datasets, Timothy Defreitas, Hachem Saddiki, Patrick Flaherty, Jan 2016

Background

Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number, or meta-data tag. However, in this search paradigm the form of the query – a text-based string – is mismatched with the form of the target – a genomic profile.

Results

To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI …