Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theory and Algorithms

Bucknell University

Honors Theses

Theses/Dissertations

Full-Text Articles in Physical Sciences and Mathematics

Extensions Of The Morse-Hedlund Theorem, Eben Blaisdell Jan 2018

Honors Theses

Bi-infinite words are sequences of characters that extend infinitely in both directions; for example, "...ababababab...". The Morse-Hedlund theorem says that a bi-infinite word f repeats itself in at most n letters if and only if the number of distinct subwords of length n is at most n. In the example "...ababababab...", there are 2 distinct subwords of length 3, namely "aba" and "bab". Since 2 is less than 3, "...ababababab..." must repeat itself after at most 3 letters. In fact, it repeats itself every two letters. …
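
The subword count in the example can be checked with a minimal Python sketch. It assumes the bi-infinite word is periodic and is given by one repeating block (here "ab"); the function name `distinct_subwords` is hypothetical and is not taken from the thesis.

```python
def distinct_subwords(period: str, n: int) -> set[str]:
    """Distinct length-n subwords of the bi-infinite word ...period period period...

    Assumption: the word is periodic and fully described by `period`.
    """
    p = len(period)
    # Repeat the block often enough that every window starting
    # inside the first copy of the block fits in the string.
    repeated = period * (n // p + 2)
    # Windows starting at offsets 0..p-1 cover every subword of a periodic word.
    return {repeated[i:i + n] for i in range(p)}


subwords = distinct_subwords("ab", 3)
print(sorted(subwords))      # the subwords "aba" and "bab"
print(len(subwords) <= 3)    # the count is at most n = 3
```

Because a word with period p has at most p distinct subwords of any length, the count here can never exceed the length of the repeating block.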


Utilization Of Probabilistic Models In Short Read Assembly From Second-Generation Sequencing, Matthew W. Segar May 2012

Honors Theses

With the advent of cheaper and faster DNA sequencing technologies, assembly methods have greatly changed. Instead of outputting reads that are thousands of base pairs long, new sequencers parallelize the task by producing reads between 35 and 400 base pairs long. Reconstructing an organism’s genome from these millions of reads is a computationally expensive task. Our algorithm solves this problem by organizing and indexing the reads using n-grams, which are short, fixed-length DNA sequences of length n. These n-grams are used to efficiently locate putative read joins, thereby eliminating the need to perform an exhaustive search over all possible read …
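
The idea of indexing reads by n-grams to find candidate joins can be illustrated with a rough Python sketch. This is not the thesis's algorithm: it omits the probabilistic model entirely, and the function names `build_ngram_index` and `candidate_joins`, as well as the rule of looking up only each read's final n-gram, are assumptions made for illustration.

```python
from collections import defaultdict


def build_ngram_index(reads: list[str], n: int) -> dict[str, set[int]]:
    """Map every length-n substring (n-gram) to the ids of the reads containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for read_id, read in enumerate(reads):
        for i in range(len(read) - n + 1):
            index[read[i:i + n]].add(read_id)
    return index


def candidate_joins(reads: list[str], n: int) -> set[tuple[int, int]]:
    """Pairs (a, b) where the last n-gram of read a also occurs in read b.

    Illustrative rule only: a real assembler would verify and score each overlap.
    """
    index = build_ngram_index(reads, n)
    candidates: set[tuple[int, int]] = set()
    for read_id, read in enumerate(reads):
        if len(read) < n:
            continue  # read too short to contain an n-gram
        suffix = read[-n:]
        # A dictionary lookup replaces a scan over all other reads.
        for other_id in index.get(suffix, ()):
            if other_id != read_id:
                candidates.add((read_id, other_id))
    return candidates


reads = ["ACGTAC", "TACGGA", "GGATTT"]
print(sorted(candidate_joins(reads, 3)))
```

Each lookup touches only the reads that share the queried n-gram, which is what lets an index of this kind avoid comparing every read against every other read.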