Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Articles 1 - 8 of 8

Full-Text Articles in Life Sciences

An Investigation Of Information Structures In DNA, Joel Mohrmann May 2024

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

The information-containing nature of the DNA molecule has long been known and observed. One technique for quantifying the relationships within the information contained in DNA sequences is a measure from information theory known as the average mutual information (AMI) profile. This investigation sought principally to use the AMI profile, along with a few other metrics, to explore the structure of the information contained in DNA sequences.
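As a rough illustration of what an AMI profile measures, the sketch below computes, for each lag k, the mutual information in bits between the base at position i and the base at position i + k, using the empirical pair frequencies at that lag. This is a minimal sketch under common textbook assumptions, not the thesis's implementation; the function name `ami_profile` and its parameters are hypothetical.

```python
from collections import Counter
from math import log2


def ami_profile(seq, max_lag):
    """Return [I(1), ..., I(max_lag)] in bits for a DNA string.

    I(k) = sum over (x, y) of p_k(x, y) * log2(p_k(x, y) / (p(x) * p(y))),
    where p_k(x, y) is the empirical frequency of base x followed by
    base y exactly k positions later. (Hypothetical helper; the
    estimator used in the thesis is not stated in the abstract.)
    """
    seq = seq.upper()
    profile = []
    for k in range(1, max_lag + 1):
        # Count every pair of bases that sit k positions apart.
        pairs = Counter(zip(seq, seq[k:]))
        total = sum(pairs.values())
        # Marginal counts are taken from the same pairs, so each I(k)
        # is a proper mutual information and cannot be negative.
        left, right = Counter(), Counter()
        for (x, y), count in pairs.items():
            left[x] += count
            right[y] += count
        info = 0.0
        for (x, y), count in pairs.items():
            p_xy = count / total
            info += p_xy * log2(p_xy * total * total / (left[x] * right[y]))
        profile.append(info)
    return profile


if __name__ == "__main__":
    # A strictly periodic toy sequence: the base 4 positions ahead is
    # always known, so the profile peaks at lag 4.
    print(ami_profile("ACGT" * 50, 4))
```

A sequence with no dependence between positions gives values near zero at every lag, so peaks in the profile point to structure at particular distances.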

Treating DNA sequences as an information source, several computational methods were employed to model their information structure. Maximum likelihood and maximum a posteriori estimators were used to predict missing bases in DNA sequences. …


Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone Nov 2023

Complex Biosystems PhD Program: Dissertations

Gene prediction has seen little algorithmic improvement since the first gene prediction algorithms were developed thirty years ago. Rather than iteratively improving the underlying algorithms in gene prediction tools by adopting better-performing models, most current approaches update existing tools by incorporating increasing amounts of extrinsic data to improve gene prediction performance. Genes are traditionally predicted using Hidden Markov Models (HMMs). These HMMs are constrained by strict assumptions about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) …


A Review Of How Bioinformatics And Genome Sequencing Are Affecting Precision Medicine, Taylor S. Hickey May 2023

Honors Theses

Advances in genomic sequencing and bioinformatics methods have been affecting biomedical research through precision medicine, especially in the area of cancer. Vaccine therapies can be developed using neoantigens that target specific mutations in tumors. The goals of this research are to identify mutations that lead to cancer and then define subpopulations in which patients can easily be identified. The future goal is to have targeted vaccines that are specific to each subpopulation ready to be used in the treatment of their cancer. Limitations to reaching these goals have been due to tumor heterogeneity, cancer location, and difficulty in creating neoantigens for …


Comparative Analyses Of De Novo Transcriptome Assembly Pipelines For Diploid Wheat, Natasha Pavlovikj May 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Gene expression and transcriptome analysis are currently among the main research focuses of a great number of scientists. However, the assembly of raw sequence data to obtain a draft transcriptome of an organism is a complex multi-stage process usually composed of pre-processing, assembling, and post-processing. Each of these stages includes multiple steps such as data cleaning, error correction, and assembly validation. Different combinations of steps, as well as different computational methods for the same step, generate transcriptome assemblies with different accuracy. Thus, using a combination that generates more accurate assemblies is crucial for novel biological discoveries. Implementing …


Polerovirus Genomic Variation And Mechanisms Of Silencing Suppression By P0 Protein, Natalie Holste Nov 2020

School of Biological Sciences: Dissertations, Theses, and Student Research

The family Luteoviridae consists of three genera: Luteovirus, Enamovirus, and Polerovirus. The genus Polerovirus contains 32 virus species. All are transmitted by aphids and can infect a wide variety of crops from cereals and wheat to cucurbits and peppers. However, little is known about how this wide range of hosts and vectors developed. In poleroviruses, aphid transmission and virion formation are mediated by the coat protein read-through domain (CPRT), while silencing suppression and phloem limitation are mediated by Protein 0 (P0), a protein unique to poleroviruses. P0 gives poleroviruses a great advantage amongst plant viruses and diversifies polerovirus species, but the …


The N-Glycan Structures Of The Antigenic Variants Of Chlorovirus PBCV-1 Major Capsid Protein Help To Identify The Virus-Encoded Glycosyltransferases, Immacolata Speciale, Garry A. Duncan, Luca Unione, Irina Agarkova, Domenico Garozzo, Jesus Jimenez-Barbero, Sicheng Lin, Todd L. Lowary, Antonio Molinaro, Eric Noel, Maria Elena Laugieri, Michela Tonetti, James L. Van Etten, Cristina De Castro Jan 2019

James Van Etten Publications

The chlorovirus Paramecium bursaria chlorella virus 1 (PBCV-1) is a large dsDNA virus that infects the microalga Chlorella variabilis NC64A. Unlike most other viruses, PBCV-1 encodes most, if not all, of the machinery required to glycosylate its major capsid protein (MCP). The structures of the four N-linked glycans from the PBCV-1 MCP consist of nonasaccharides, and similar glycans are not found elsewhere in the three domains of life. Here, we identified the roles of three virus-encoded glycosyltransferases (GTs) that have four distinct GT activities in glycan synthesis. Two of the three GTs were previously annotated as GTs but the third …


Copy Number Variation In The Porcine Genome Detected From Whole-Genome Sequence, Rebecca Anderson Mar 2018

Honors Theses

Copy number variations (CNVs) are large insertions, deletions, and duplications in the genome that vary between individuals in a species. These variations are known to impact a broad range of phenotypes from molecular-level traits to higher-order clinical phenotypes. CNVs have been linked to complex traits in humans such as autism, attention deficit hyperactivity disorder, nervous system disorders, and early-onset extreme obesity. In this study, whole-genome sequence was obtained from 72 founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC) in Clay Center, Nebraska. This included 24 boars (12 Duroc and 12 Landrace) and …


Classification Of Genomic Sequences By Latent Semantic Analysis, Samuel F. Way Aug 2012

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Evolutionary distance measures provide a means of identifying and organizing related organisms by comparing their genomic sequences. As such, techniques that quantify the level of similarity between DNA sequences are essential in our efforts to decipher the genetic code in which they are written.

Traditional methods for estimating the evolutionary distance separating two genomic sequences often require that the sequences first be aligned before they are compared. Unfortunately, this preliminary step imposes a great computational burden, making this class of techniques impractical for applications involving a large number of sequences. Instead, we desire new methods for differentiating genomic sequences that eliminate …
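To make the idea of comparing sequences without first aligning them concrete, the sketch below represents each sequence by its counts of overlapping k-mers and compares two sequences by the cosine distance between those count vectors. This is a generic alignment-free baseline chosen for illustration, not the latent semantic analysis method of the thesis, whose details the truncated abstract does not give; the function names and the default k = 3 are assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count the overlapping k-mers in a DNA string (hypothetical helper)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two k-mer count vectors.

    0.0 means identical k-mer composition; 1.0 means no shared k-mers.
    """
    # Only k-mers present in both sequences contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # A sequence shorter than k has no k-mers; treat it as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    x = kmer_counts("ACGTACGTACGTTTGA")
    y = kmer_counts("ACGTACGTACGATTGA")
    z = kmer_counts("GGGGCCCCGGGGCCCC")
    print(cosine_distance(x, y), cosine_distance(x, z))
```

Each sequence is scanned once and no pairwise alignment is computed, which is what makes this family of methods attractive when many sequences must be compared.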