Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons


Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Bioinformatics


Full-Text Articles in Genetics and Genomics

An Investigation Of Information Structures In DNA, Joel Mohrmann, May 2024


Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

The information-containing nature of the DNA molecule has long been known and observed. One technique for quantifying the relationships within the information contained in DNA sequences is the average mutual information (AMI) profile, a measure from information theory. This investigation used principally the AMI profile, along with a few other metrics, to explore the structure of the information contained in DNA sequences.
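As a rough illustration of what an AMI profile measures, the sketch below estimates the mutual information, in bits, between bases that lie k positions apart, for k = 1 up to a chosen maximum lag. The function name `ami_profile` and the use of simple frequency counts as probability estimates are assumptions made for this example; they are not taken from the thesis.

```python
from collections import Counter
from math import log2


def ami_profile(seq, max_lag):
    """Return [I(1), ..., I(max_lag)] in bits for a DNA string.

    I(k) is the mutual information between the base at position i
    and the base at position i + k, estimated from pair counts.
    """
    profile = []
    for k in range(1, max_lag + 1):
        # Count every pair of bases separated by exactly k positions.
        pairs = Counter(zip(seq, seq[k:]))
        n = sum(pairs.values())
        if n == 0:
            # The sequence is too short for this lag.
            profile.append(0.0)
            continue
        # Marginal counts of the first and second base in each pair.
        left, right = Counter(), Counter()
        for (x, y), c in pairs.items():
            left[x] += c
            right[y] += c
        mi = sum(
            (c / n) * log2((c / n) / ((left[x] / n) * (right[y] / n)))
            for (x, y), c in pairs.items()
        )
        profile.append(mi)
    return profile
```

For the strictly periodic sequence `"ACGT" * 50`, the value at lag 4 is 2 bits, because the base four positions back fully determines the current base and each of the four bases is equally frequent.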

With DNA sequences treated as an information source, several computational methods were employed to model their information structure. Maximum likelihood and maximum a posteriori estimators were used to predict missing bases in DNA sequences. …
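The difference between the two estimators can be sketched with a deliberately small model. The example below assumes that the only evidence about a missing base is its right-hand neighbour, and that probabilities come from smoothed counts in a training sequence. This setup, and the names `fit`, `predict_ml`, and `predict_map`, are hypothetical choices for illustration; the abstract does not describe the models actually used.

```python
from collections import Counter

BASES = "ACGT"


def fit(seq):
    """Count single bases and adjacent base pairs in a training sequence."""
    return Counter(seq), Counter(zip(seq, seq[1:]))


def likelihood(pair_counts, base, right):
    """Add-one smoothed estimate of P(right neighbour | base)."""
    total = sum(pair_counts[(base, y)] for y in BASES)
    return (pair_counts[(base, right)] + 1) / (total + len(BASES))


def prior(base_counts, base):
    """Add-one smoothed estimate of P(base)."""
    total = sum(base_counts[b] for b in BASES)
    return (base_counts[base] + 1) / (total + len(BASES))


def predict_ml(model, right):
    """Maximum likelihood: pick the base that best explains the neighbour."""
    _, pair_counts = model
    return max(BASES, key=lambda b: likelihood(pair_counts, b, right))


def predict_map(model, right):
    """Maximum a posteriori: weight the likelihood by the base frequency."""
    base_counts, pair_counts = model
    return max(
        BASES,
        key=lambda b: likelihood(pair_counts, b, right) * prior(base_counts, b),
    )
```

With the toy training sequence `"AAAT" * 10 + "CT"`, a missing base followed by `T` is predicted as `C` by `predict_ml`, since the single observed `C` is always followed by `T`, and as `A` by `predict_map`, since `A` is far more common overall.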


Classification Of Genomic Sequences By Latent Semantic Analysis, Samuel F. Way, Aug 2012


Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Evolutionary distance measures provide a means of identifying and organizing related organisms by comparing their genomic sequences. As such, techniques that quantify the level of similarity between DNA sequences are essential in our efforts to decipher the genetic code in which they are written.

Traditional methods for estimating the evolutionary distance separating two genomic sequences often require that the sequences be aligned before they are compared. Unfortunately, this preliminary step imposes a great computational burden, making this class of techniques impractical for applications involving a large number of sequences. Instead, we desire new methods for differentiating genomic sequences that eliminate …
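To make the idea of comparing sequences without first aligning them concrete, the sketch below scores two sequences by the cosine similarity of their k-mer count vectors. This is a generic illustration and not the latent semantic analysis method named in the thesis title; the function names and the default `k=3` are assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # At least one sequence is shorter than k.
        return 0.0
    return dot / (norm_a * norm_b)


def kmer_similarity(seq1, seq2, k=3):
    """Alignment-free similarity score in the range [0, 1]."""
    return cosine_similarity(kmer_counts(seq1, k), kmer_counts(seq2, k))
```

Each sequence is scanned once to build its counts, so the cost grows linearly with sequence length, which is the practical appeal of this family of methods when many sequences must be compared.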