Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

Articles 1 - 3 of 3

Full-Text Articles in Computational Biology

Creation Of A Digital Storage System For Genome Sequencing Metadata, Jacquelin W. Olexa Jan 2024

Undergraduate Theses, Professional Papers, and Capstone Artifacts

As computational genomics continues to expand in both potential and application, it is more imperative than ever to ensure that massive genetic sequencing datasets are stored properly and accessibly. This project sought to establish a practical, user-friendly, secure storage system for a genomics research lab (the Good Lab; thegoodlab.org) at the University of Montana. A MySQL database with a connected web application was judged the best configuration to maximize utility and accessibility for the lab’s researchers. Building the logical framework for the database, creating the server, and sourcing data took several months. The dataset ranged …


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and the analysis of complex protein samples. In bottom-up shotgun proteomics experiments, the metrics for evaluation (such as AUC and calibration error) are based on an often imperfect target-decoy database. These metrics inherently assume that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case: the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …


K-Mer Analysis Pipeline For Classification Of DNA Sequences From Metagenomic Samples, Russell Kaehler Jan 2017

Graduate Student Theses, Dissertations, & Professional Papers

Biological sequence datasets are growing at a prodigious rate, and the volume of data they contain surpasses what is observed in many other fields of science. New developments in which metagenomic DNA from complex bacterial communities is recovered and sequenced are producing a new kind of data, known as metagenomic data, which is composed of DNA fragments from many genomes. Developing a utility to analyze such metagenomic data and predict the sample class from which it originated has many possible implications for ecological and medical applications. This document describes a series of analytical techniques used to process …