Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

Theses/Dissertations

University of Montana

Articles 1 - 16 of 16

Full-Text Articles in Bioinformatics

VIBES: A Workflow For Annotating And Visualizing Viral Sequences Integrated Into Bacterial Genomes, Conner J. Copeland Jan 2023

Graduate Student Theses, Dissertations, & Professional Papers

Bacteriophages are viruses that infect bacteria. Many bacteriophages integrate their genomes into the bacterial chromosome and become prophages. Prophages may substantially burden or benefit host bacteria fitness, acting in some cases as parasites and in others as mutualists, and have been demonstrated to increase host virulence. The increasing ease of bacterial genome sequencing provides an opportunity to deeply explore prophage prevalence and insertion sites. Here we present VIBES, a workflow intended to automate prophage annotation in complete bacterial genome sequences. VIBES provides additional context to prophage annotations by annotating bacterial genes and viral proteins in user-provided bacterial and …


An Integrative Investigation Of The Synechococcus A/B Clade During Adaptive Radiation At The Upper Thermal Limit Of Phototrophy, Christopher L. Pierpont Jan 2022

Graduate Student Theses, Dissertations, & Professional Papers

Thermophilic microorganisms have been scientifically observed since the early nineteenth century and have spurred many questions about the limits of life and the capacity of organisms to survive extreme conditions. Decades of research on thermophile proteins and genomes have yielded several proposed correlates of temperature that may contribute to adaptation of bacteria and archaea to high temperature. However, many of the generalizations reported are drawn from analyses of deeply divergent taxa or from individual case studies in isolation from mesophilic relatives. Members of the Synechococcus A/B (SynAB) group are the only cyanobacteria with members able to grow above 65 °C …


FATHMM: Frameshift Aware Translated Hidden Markov Models, Genevieve Krause Jan 2022

Graduate Student Theses, Dissertations, & Professional Papers

No abstract provided.


Subfamily Clustering Using Label Uncertainty (For Transposable Element Families), Audrey M. Shingleton Jan 2022

Graduate Student Theses, Dissertations, & Professional Papers

Biological sequence annotation is typically performed by aligning a sequence to a database of known sequence elements. For transposable elements, these known sequences represent subfamily consensus sequences. When many of the subfamily models in the database are highly similar to each other, a sequence belonging to one subfamily can easily be mistaken for a member of another, causing non-reproducible subfamily annotation. Because annotation with subfamilies is expected to give some amount of insight into a sequence’s evolutionary history, it is important that such annotation be reproducible. Here, we present our software tool, SCULU, which builds upon our previously described methods for computing …


Sparse Forward-Backward Alignment For Sensitive Database Search With Small Memory And Time Requirements, David H. Rich Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Sequence annotation is typically performed by aligning an unlabeled sequence to a collection of known sequences, with the aim of identifying non-random similarities. Given the broad diversity of new sequences and the considerable scale of modern sequence databases, there is significant tension between the competing needs for sensitivity and speed, with multiple tools displacing the venerable BLAST software suite on one axis or another. In recent years, alignment based on profile hidden Markov models (pHMMs) and associated probabilistic inference methods has demonstrated increased sensitivity due in part to consideration of the ensemble of all possible alignments between a query and …


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and analysis of complex protein samples. In bottom-up shotgun proteomics experiments, the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …


PolyA: A Tool For Adjudicating Competing Annotations Of Biological Sequences, Kaitlin Carey Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Annotation of a biological sequence is usually performed by aligning that sequence to a database of known sequence elements. When that database contains elements that are highly similar to each other, the proper annotation may be ambiguous, because several entries in the database produce high-scoring alignments. Typical annotation methods work by assigning a label based on the candidate annotation with the highest alignment score; this can overstate annotation certainty, mislabel boundaries, and fail to identify large-scale rearrangements or insertions within the annotated sequence. Here, I present a new software tool, PolyA, that adjudicates between competing alignment-based annotations by computing …


SODA: An Open-Source Library For Visualizing Biological Sequence Annotation, Jack W. Roddy, Travis J. Wheeler Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Genome annotation is the process of identifying and labeling known genetic sequences or features within a genome. Across the various subfields within modern molecular biology, there is a common need for the visualization of such annotations. Genomic data is often visualized on web browser platforms, providing users with easy access to visualization tools without the need for installing any software or, in many cases, underlying datasets. While there exists a broad range of web-based visualization tools, there is, to my knowledge, no lightweight, modern library tailored towards the visualization of genomic data. Instead, developers charged with the task of producing …


Genomic Inference Of Inbreeding In Alexander Archipelago Wolves (Canis Lupus Ligoni) On Prince Of Wales Island, Southeast Alaska, Katherine Emily Zarn Jan 2019

Graduate Student Theses, Dissertations, & Professional Papers

Habitat loss and climate change are increasingly resulting in reduction and fragmentation of wildlife populations. Populations that have experienced fragmentation and decreases in abundance are at heightened risk of inbreeding due to reduced opportunities to mate with unrelated conspecifics. Prolonged or extensive inbreeding can result in inbreeding depression via the exposure of deleterious alleles in long runs of homozygosity. Alexander Archipelago wolves (Canis lupus ligoni) on Prince of Wales Island (POW) in Southeast Alaska are a small, isolated population of conservation concern that have experienced habitat loss and high harvest rates, and present an ideal system in which …


An Interdisciplinary Approach To The Target Elucidation Of Novel Antibiotic 31G12, Larissa A. Walker Jan 2018

Graduate Student Theses, Dissertations, & Professional Papers

Staphylococcus aureus is a Gram-positive bacterial pathogen responsible for nosocomial and community-acquired infections that can quickly acquire antibiotic resistance. We have identified a novel triazole antimicrobial, 31G12, based on the natural product core of nonactin isolated from the fermentation of Streptomyces griseus, that is active against many Gram-positive bacteria as well as the antibiotic-resistant pathogens methicillin-resistant S. aureus and vancomycin-resistant Enterococcus. The synthesis and characterization indicate that 31G12 exists as a mixture of two rotamers at room temperature and displays bacteriostatic activity against S. aureus with moderate mammalian cell toxicity. We have currently identified potential protein targets of 31G12 in …


Molecular Diversity Of Foliar Fungal Endophytes In Relation To Defense Strategies And Disease In Whitebark Pine, Lorinda Bullington Jan 2017

Graduate Student Theses, Dissertations, & Professional Papers

An invasive fungal pathogen, Cronartium ribicola (the causative agent of white pine blister rust), infects and kills whitebark pine (Pinus albicaulis) throughout the western US. Blister rust has decreased whitebark pine populations by over 90% in some areas. Whitebark pine, a keystone species, has been proposed for listing under the Endangered Species Act in the U.S., and the loss of this conifer is predicted to have severe impacts on forest composition and function in high elevations. Hundreds of asymptomatic fungal species live inside whitebark pine tissue, and recent studies suggest that these fungi can influence the frequency and …


K-Mer Analysis Pipeline For Classification Of DNA Sequences From Metagenomic Samples, Russell Kaehler Jan 2017

Graduate Student Theses, Dissertations, & Professional Papers

Biological sequence datasets are increasing at a prodigious rate. The volume of data in these datasets surpasses what is observed in many other fields of science. New developments wherein metagenomic DNA from complex bacterial communities is recovered and sequenced are producing a new kind of data known as metagenomic data, which is composed of DNA fragments from many genomes. Developing a utility to analyze such metagenomic data and predict the sample class from which it originated has many possible implications for ecological and medical applications. Within this document is a description of a series of analytical techniques used to process …


High-Resolution Mapping Of Hierarchical Greater Sage-Grouse Nesting Habitat: A Grain-Spectrum Approach In Northwestern Wyoming, Robert T. Haynam III Jan 2017

Graduate Student Theses, Dissertations, & Professional Papers

Our overall objective was to create a probabilistic nesting-habitat map for the Jackson Hole sage-grouse population that would have utility as a tool for future research, conservation, and management. The models that we developed for this purpose were specified to evaluate whether sage-grouse may be selecting nesting-habitat characteristics simultaneously at various spatial scales. Our spatially explicit landscape-scale research was implemented primarily with readily available National Agriculture Imagery Program (NAIP) data. All nesting data was collected from 2007 to 2010. We tested how a broad range of grain sizes (spatial resolution) of covariate values affected the fit to logistic regression models used to estimate …


Discordant Classification Of Transposable Elements In Segmental Duplications Raise Concerns About Subfamily Definitions, Gilia R. Patterson Jan 2016

Undergraduate Theses, Professional Papers, and Capstone Artifacts

Most of the human genome comes from transposable elements (TEs), sequences of DNA that can move and insert copies of themselves throughout the genome. TE sequences both inform and complicate analyses of genomes, so it is important that TEs are annotated completely and accurately. Remnants of TEs are annotated and classified into subfamilies based on their DNA sequences. A subfamily represents all the copies generated in a burst of replication by a few closely related TEs. Wacholder et al. (2014) suggested that the current methods for representing subfamilies are not accurate and should be reevaluated. We expand on this discussion …


A Case Study Tested Framework For Multivariate Analyses Of Microbiomes: Software For Microbial Community Comparisons, Eric M. Spaulding Jan 2015

Graduate Student Theses, Dissertations, & Professional Papers

The study of microbiomes is important because our understanding of microbial communities is providing insight into human health and many other areas of interest. Researchers often use genomic data to study microbial organisms, demonstrating differences from one organism to the next. Metagenomic data is utilized to study communities of microbial organisms. The research described herein involved the development of a collection of computational methods.

This suite of computational methods and tools (written in the R and Perl languages) has become a framework used for metagenomic data analysis and result visualization. Multivariate analyses such as Linear Discriminant Analysis (LDA) are used …


Developing Microbial Biomarkers To Non-Invasively Assess Health In Wild Elk (Cervus Canadensis) Populations, Samuel B. Pannoni Jan 2015

Undergraduate Theses, Professional Papers, and Capstone Artifacts

The composition of the intestinal bacterial community (intestinal microbiome) of mammals is associated with changes in diet, stress, disease and physical condition of the animal. The relationship between health and the microbiome has been extensively demonstrated in studies of humans and mice; this provides strong support for its potential utility in wildlife. When managing elk (Cervus canadensis), federal and state agencies currently must rely on invasive sampling and coarse demographic data on which to base their decisions. By developing microbiome-based biomarkers that vary as a function of elk body condition and disease (i.e. microbial biomarkers), we hope to …