Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

Applied Mathematics

Articles 1 - 30 of 32

Full-Text Articles in Computational Biology

Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone Nov 2023

Complex Biosystems PhD Program: Dissertations

Gene prediction has seen little algorithmic improvement since algorithms for predicting genes were first developed thirty years ago. Rather than iteratively improving the underlying algorithms in gene prediction tools by utilizing better performing models, most current approaches update existing tools by incorporating increasing amounts of extrinsic data to improve gene prediction performance. Genes are traditionally predicted using Hidden Markov Models (HMMs), which are constrained by strict assumptions about the independence of genes that do not always hold. To address this, a Convolutional Neural Network (CNN) …


Modeling Nonsegmented Negative-Strand RNA Virus (NNSV) Transcription With Ejective Polymerase Collisions And Biased Diffusion, Felipe-Andres Piedra Sep 2023

Research Symposium

Background: The textbook model of NNSV transcription predicts a gene expression gradient. However, multiple studies show non-gradient gene expression patterns or data inconsistent with a simple gradient. Regarding the latter, several studies show a dramatic decrease in gene expression over the last two genes of the respiratory syncytial virus (RSV) genome (a highly studied NNSV). The textbook model cannot explain these phenomena.

Methods: Computational models of RSV and vesicular stomatitis virus (VSV – another highly studied NNSV) transcription were written in the Python programming language using the Scientific Python Development Environment. The model code is freely available on GitHub: …


An Implementation Of The Method Of Moments On Chemical Systems With Constant And Time-Dependent Rates, Emmanuel O. Adara, Roger B. Sidje Sep 2023

Northeast Journal of Complex Systems (NEJCS)

Among numerical techniques used to facilitate the analysis of biochemical reactions, we can use the method of moments to directly approximate statistics such as the mean numbers of molecules. The method is computationally viable in time and memory, compared to solving the chemical master equation (CME) which is notoriously expensive. In this study, we apply the method of moments to a chemical system with a constant rate representing a vascular endothelial growth factor (VEGF) model, as well as another system with time-dependent propensities representing the susceptible, infected, and recovered (SIR) model with periodic contact rate. We assess the accuracy of …
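As a rough illustration of the idea of tracking moments instead of the full distribution, the sketch below integrates only the mean of an SIR system with a periodic contact rate. It is not the authors' implementation: it assumes a first-order moment closure (E[SI] ≈ E[S]E[I], the mean-field approximation), a hypothetical contact rate of the form beta0 * (1 + eps * cos(2πt)), forward Euler time stepping, and made-up parameter values.

```python
import math


def simulate_sir_mean(s0, i0, r0, beta0, eps, gamma, t_end, dt=0.001):
    """Integrate first-order (mean) moment equations of an SIR model.

    Assumptions (illustrative, not from the abstract):
    - contact rate beta(t) = beta0 * (1 + eps * cos(2*pi*t))
    - first-order closure: E[S*I] is replaced by E[S] * E[I]
    - forward Euler time stepping with step dt
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, conserved by the equations
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        beta = beta0 * (1.0 + eps * math.cos(2.0 * math.pi * t))
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        t += dt
    return s, i, r


# Hypothetical parameter values chosen only for demonstration.
s, i, r = simulate_sir_mean(990, 10, 0, beta0=0.5, eps=0.3, gamma=0.1, t_end=50.0)
```

Higher-order closures would add equations for variances and covariances; this sketch stops at the means to keep the system to three equations.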


Symmetry-Inspired Analysis Of Biological Networks, Ian Leifer Jun 2022

Dissertations, Theses, and Capstone Projects

The description of a complex system like gene regulation of a cell or a brain of an animal in terms of the dynamics of each individual element is an insurmountable task due to the complexity of interactions and the scores of associated parameters. Recent decades have brought descriptions of these systems that employ network models. In such models the entire system is represented by a graph encapsulating a set of independently functioning objects and their interactions. This creates a level of abstraction that makes the analysis of such large-scale systems possible. Common practice is to draw conclusions about …


Statistical Methods For Resolving Intratumor Heterogeneity With Single-Cell DNA Sequencing, Alexander Davis Aug 2020

Dissertations & Theses (Open Access)

Tumor cells have heterogeneous genotypes, which drives progression and treatment resistance. Such genetic intratumor heterogeneity plays a role in the process of clonal evolution that underlies tumor progression and treatment resistance. Single-cell DNA sequencing is a promising experimental method for studying intratumor heterogeneity, but brings unique statistical challenges in interpreting the resulting data. Researchers lack methods to determine whether sufficiently many cells have been sampled from a tumor. In addition, there are no proven computational methods for determining the ploidy of a cell, a necessary step in the determination of copy number. In this work, software for calculating probabilities from …


Network Structure And Dynamics Of Biological Systems, Deena R. Schmidt Oct 2019

Annual Symposium on Biomathematics and Ecology Education and Research

No abstract provided.


Topology And Dynamics Of Gene Regulatory Networks: A Meta-Analysis, Claus Kadelka May 2019

Biology and Medicine Through Mathematics Conference

No abstract provided.


Do Metabolic Networks Follow A Power Law? A PSAMM Analysis, Ryan Geib, Lubos Thoma, Ying Zhang May 2019

Senior Honors Projects

Inspired by the landmark paper “Emergence of Scaling in Random Networks” by Barabási and Albert, the field of network science has focused heavily on the power law distribution in recent years. This distribution has been used to model everything from the popularity of sites on the World Wide Web to the number of citations received on a scientific paper. The feature of this distribution is highlighted by the fact that many nodes (websites or papers) have few connections (internet links or citations) while few “hubs” are connected to many nodes. These properties lead to two very important observed effects: the …


Recurrent Neural Networks And Their Applications To RNA Secondary Structure Inference, Devin Willmott Jan 2018

Theses and Dissertations--Mathematics

Recurrent neural networks (RNNs) are state-of-the-art sequential machine learning tools, but have difficulty learning sequences with long-range dependencies due to the exponential growth or decay of gradients backpropagated through the RNN. Some methods overcome this problem by modifying the standard RNN architecture to force the recurrent weight matrix W to remain orthogonal throughout training. The first half of this thesis presents a novel orthogonal RNN architecture that enforces orthogonality of W by parametrizing with a skew-symmetric matrix via the Cayley transform. We present rules for backpropagation through the Cayley transform, show how to deal with the Cayley …
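The parametrization mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common convention W = (I + A)^{-1}(I - A) for the Cayley transform of a skew-symmetric matrix A; the thesis may use a different sign convention or an additional scaling, and the function name here is hypothetical.

```python
import numpy as np


def cayley_orthogonal(A):
    """Map a skew-symmetric matrix A to W = (I + A)^{-1} (I - A).

    For skew-symmetric A, I + A is always invertible (the eigenvalues
    of A are purely imaginary) and the resulting W is orthogonal.
    """
    identity = np.eye(A.shape[0])
    # Solve (I + A) W = (I - A) instead of forming an explicit inverse.
    return np.linalg.solve(identity + A, identity - A)


rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M - M.T  # skew-symmetric by construction: A.T == -A
W = cayley_orthogonal(A)

# Largest deviation of W^T W from the identity; should be near machine precision.
orthogonality_error = np.abs(W.T @ W - np.eye(4)).max()
```

Because any skew-symmetric A yields an orthogonal W, a training procedure can update the free entries of A without ever leaving the set of orthogonal matrices.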


Linking Taxonomic Diversity And Trophic Function: A Graph-Based Theoretical Approach, Marcella M. Jurotich, Kaitlyn Dougherty, Barbara Hayford, Sally Clark Nov 2017

Transactions of the Nebraska Academy of Sciences and Affiliated Societies

The purpose of this study is to develop a novel, visual method for analyzing complex functional trait data in freshwater ecology. We focus on macroinvertebrates in stream ecosystems under a gradient of habitat degradation and employ a combination of taxonomic and functional trait diversity analyses. Then we use graph theory to link changes in functional trait diversity to taxonomic richness and habitat degradation. We test the hypotheses that: 1) taxonomic diversity and trophic functional trait diversity both decrease with increased habitat degradation; 2) loss of taxa leads to a decrease in trophic function as visualized using a bipartite graph; and …


Network Analytics For The miRNA Regulome And miRNA-Disease Interactions, Joseph Jayakar Nalluri Jan 2017

Theses and Dissertations

miRNAs are non-coding RNAs of approx. 22 nucleotides in length that inhibit gene expression at the post-transcriptional level. By virtue of this gene regulation mechanism, miRNAs play a critical role in several biological processes and patho-physiological conditions, including cancers. miRNA behavior is a result of a multi-level complex interaction network involving miRNA-mRNA, TF-miRNA-gene, and miRNA-chemical interactions; hence the precise patterns through which a miRNA regulates a certain disease(s) are still elusive. Herein, I have developed an integrative genomics method/pipeline to (i) build a miRNA regulomics and data analytics repository, (ii) create/model these interactions into networks and use optimization techniques, motif …


Dynamics Of Gene Networks In Cancer Research, Paul Scott Jan 2017

Electronic Theses and Dissertations

Cancer prevention treatments are being researched to see if an optimized treatment schedule would decrease the likelihood of a person being diagnosed with cancer. To do this we are looking at genes involved in the cell cycle and how they interact with one another. Through each gene expression during the life of a normal cell we get an understanding of the gene interactions and test these against those of a cancerous cell. First we construct a simplified network model of the normal gene network. Once we have this model we translate it into a transition matrix and force changes on …


hpcNMF: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang Feb 2016

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for …
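For readers unfamiliar with NMF, the following is a minimal sketch of the factorization V ≈ WH using the classic multiplicative update rules for the Frobenius-norm objective. It is a generic textbook scheme, not the toolbox's implementation; the function name, iteration count, and test matrix are assumptions chosen for illustration.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W (n x rank) @ H (rank x m).

    Uses multiplicative updates for the squared Frobenius error.
    eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors


# Synthetic non-negative matrix with exact rank 3, for demonstration only.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 15))
W, H, errors = nmf_multiplicative(V, rank=3)
```

Each iteration costs several dense matrix products, which is one reason the computational cost grows quickly with data size.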


Evolution Of Mobile Promoters In Prokaryotic Genomes, Mahnaz Rabbani Oct 2015

Electronic Thesis and Dissertation Repository

Mobile genetic elements are important factors in evolution, and greatly influence the structure of genomes, facilitating the development of new adaptive characteristics. The dynamics of these mobile elements can be described using various mathematical and statistical models. In this thesis, we focus on a specific category of mobile genetic elements, i.e. mobile promoters, which are mobile regions of DNA that initiate the transcription of genes. We present a class of mathematical models for the evolution of mobile promoters in prokaryotic genomes, based on data obtained from available sequenced genomes. Our novel location-based model incorporates two biologically meaningful regions of the …


Bioinformatic Game Theory And Its Application To Cluster Multi-Domain Proteins, Brittney Keel May 2015

Department of Mathematics: Dissertations, Theses, and Student Research

The exact evolutionary history of any set of biological sequences is unknown, and all phylogenetic reconstructions are approximations. The problem becomes harder when one must consider a mix of vertical and lateral phylogenetic signals. In this dissertation we propose a game-theoretic approach to clustering biological sequences and analyzing their evolutionary histories. In this context we use the term evolution as a broad descriptor for the entire set of mechanisms driving the inherited characteristics of a population. The key assumption in our development is that evolution tries to accommodate the competing forces of selection, of which the conservation force seeks to …


Modeling Neurovascular Coupling From Clustered Parameter Sets For Multimodal EEG-NIRS, M. Tanveer Talukdar, H. Robert Frost, Solomon G. G. Diamond Feb 2015

Dartmouth Scholarship

Despite significant improvements in neuroimaging technologies and analysis methods, the fundamental relationship between local changes in cerebral hemodynamics and the underlying neural activity remains largely unknown. In this study, a data driven approach is proposed for modeling this neurovascular coupling relationship from simultaneously acquired electroencephalographic (EEG) and near-infrared spectroscopic (NIRS) data. The approach uses gamma transfer functions to map EEG spectral envelopes that reflect time-varying power variations in neural rhythms to hemodynamics measured with NIRS during median nerve stimulation. The approach is evaluated first with simulated EEG-NIRS data and then by applying the method to experimental EEG-NIRS data measured from …


Epistasis In Predator-Prey Relationships, Iuliia Inozemtseva Jan 2014

Electronic Theses and Dissertations

Epistasis is the interaction between two or more genes to control a single phenotype. We model epistasis of the prey in a two-locus two-allele problem in a basic predator-prey relationship. The resulting model allows us to examine both population sizes as well as genotypic and phenotypic frequencies. In the context of several numerical examples, we show that if epistasis results in an undesirable or desirable phenotype in the prey by making the particular genotype more or less susceptible to the predator or dangerous to the predator, elimination of undesirable phenotypes and then genotypes occurs.


On The Global Stability Of A Generalized Cholera Epidemiological Model, Yuanji Cheng, Jin Wang, Xiuxiang Yang Jan 2012

Mathematics & Statistics Faculty Publications

In this paper, we conduct a careful global stability analysis for a generalized cholera epidemiological model originally proposed in [J. Wang and S. Liao, A generalized cholera model and epidemic/endemic analysis, J. Biol. Dyn. 6 (2012), pp. 568-589]. Cholera is a water- and food-borne infectious disease whose dynamics are complicated by the multiple interactions between the human host, the pathogen, and the environment. Using the geometric approach, we rigorously prove the endemic global stability for the cholera model in three-dimensional (when the pathogen component is a scalar) and four-dimensional (when the pathogen component is a vector) systems. This work unifies the …


Stability Analysis And Application Of A Mathematical Cholera Model, Shu Liao, Jim Wang Jul 2011

Mathematics & Statistics Faculty Publications

In this paper, we conduct a dynamical analysis of the deterministic cholera model proposed in [9]. We study the stability of both the disease-free and endemic equilibria so as to explore the complex epidemic and endemic dynamics of the disease. We demonstrate a real-world application of this model by investigating the recent cholera outbreak in Zimbabwe. Meanwhile, we present numerical simulation results to verify the analytical predictions.


Preliminary Analysis Of An Agent-Based Model For A Tick-Borne Disease, Holly Gaff Apr 2011

Biological Sciences Faculty Publications

Ticks have a unique life history including a distinct set of life stages and a single blood meal per life stage. This makes tick-host interactions more complex from a mathematical perspective. In addition, any model of these interactions must involve a significant degree of stochasticity on the individual tick level. In an attempt to quantify these relationships, I have developed an individual-based model of the interactions between ticks and their hosts as well as the transmission of tick-borne disease between the two populations. The results from this model are compared with those from previously published differential equation based population models. …


Computational Biology, Harvey Greenberg, Allen Holder Nov 2010

Mathematical Sciences Technical Reports (MSTR)

Computational biology is an interdisciplinary field that applies the techniques of computer science, applied mathematics, and statistics to address biological questions. Operations research (OR) is also interdisciplinary and applies the same mathematical and computational sciences, but to decision-making problems. Both focus on developing mathematical models and designing algorithms to solve them. Models in computational biology vary in their biological domain and can range from the interactions of genes and proteins to the relationships among organisms and species.


G-Lattices For An Unrooted Perfect Phylogeny, Monica Grigg Aug 2010

Mathematical Sciences Technical Reports (MSTR)

We look at the Pure Parsimony problem and the Perfect Phylogeny Haplotyping problem. From the Pure Parsimony problem we consider structures of genotypes called g-lattices. These structures either provide solutions or give bounds to the pure parsimony problem. In particular, we investigate which of these structures supports an unrooted perfect phylogeny, a condition that adds biological interpretation. By understanding which g-lattices support an unrooted perfect phylogeny, we connect two of the standard biological inference rules used to recreate how genetic diversity propagates across generations.


Model-Based Clustering Of Methylation Array Data: A Recursive-Partitioning Algorithm For High-Dimensional Data Arising As A Mixture Of Beta Distributions, E. Andres Houseman, Brock C. Christensen, Ru-Fang Yeh, Carmen J. Marsit, Margaret R. Karagas, Margaret Wrensch, Heather H. Nelson, Joseph Wiemels, Shichun Zheng, John K. Wiencke, Karl T. Kelsey Jun 2008

Harvard University Biostatistics Working Paper Series

No abstract provided.


Survival Analysis With Large Dimensional Covariates: An Application In Microarray Studies, David A. Engler, Yi Li Jul 2007

Harvard University Biostatistics Working Paper Series

Use of microarray technology often leads to high-dimensional, low-sample-size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time …


Cluster Analysis Of Genomic Data With Applications In R, Katherine S. Pollard, Mark J. Van Der Laan Jan 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in R. We discuss statistical issues and methods in choosing the number of clusters, the choice of clustering algorithm, and the choice of dissimilarity matrix. In particular, we illustrate how the bootstrap can be employed as a statistical method in cluster analysis to establish the reproducibility of the clusters and the overall variability of the followed procedure. We also show how to visualize a clustering result by plotting ordered dissimilarity matrices in R. We present a new R package, hopach, which implements the hybrid clustering method, …


Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh Oct 2004

The University of Michigan Department of Biostatistics Working Paper Series

One of the benefits of profiling of cancer samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Such subgroups have typically been found in microarray data using hierarchical clustering. A major problem in interpretation of the output is determining the number of clusters. We approach the problem of determining disease subtypes using mixture models. A novel estimation procedure of the parameters in the mixture model is developed based on a combination of random projections and the expectation-maximization algorithm. Because the approach is probabilistic, our approach provides a measure for the number of true clusters …


Differential Expression With The Bioconductor Project, Anja Von Heydebreck, Wolfgang Huber, Robert Gentleman Jun 2004

Bioconductor Project Working Papers

A basic yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. We discuss different approaches to this task and illustrate how they can be applied using software from the Bioconductor Project. A central problem is the high dimensionality of gene expression space, which prohibits a comprehensive statistical analysis without focusing on particular aspects of the joint distribution of the genes' expression levels. Possible strategies are to do univariate gene-by-gene analysis, and to perform data-driven nonspecific filtering of genes before the actual statistical analysis. …


Statistical Analyses And Reproducible Research, Robert Gentleman, Duncan Temple Lang May 2004

Bioconductor Project Working Papers

For various reasons, it is important, if not essential, to integrate the computations and code used in data analyses, methodological descriptions, simulations, etc. with the documents that describe and rely on them. This integration allows readers to both verify and adapt the statements in the documents. Authors can easily reproduce them in the future, and they can present the document's contents in a different medium, e.g. with interactive controls. This paper describes a software framework for authoring and distributing these integrated, dynamic documents that contain text, code, data, and any auxiliary content needed to recreate the computations. The documents are …


Bioconductor: Open Software Development For Computational Biology And Bioinformatics, Robert C. Gentleman, Vincent J. Carey, Douglas J. Bates, Benjamin M. Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Guenther Sawitzki, Colin Smith, Gordon K. Smyth, Luke Tierney, Yee Hwa Yang, Jianhua Zhang Jan 2004

Bioconductor Project Working Papers

The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. We detail some of the design decisions, software paradigms and operational strategies that have allowed a small number of researchers to provide a wide variety of innovative, extensible software solutions in a relatively short time. The use of an object-oriented programming paradigm, the adoption and development of a software package system, designing by contract, distributed development and collaboration with other projects are elements of this project's success. Individually, each of these concepts is useful and important, but when combined they have …


Computational Protein Biomarker Prediction: A Case Study For Prostate Cancer, Michael Wagner, Dayanand N. Naik, Alex Pothen, Srinivas Kasukurti, Raghu Ram Devineni, Bao-Ling Adam, O. John Semmes, George L. Wright Jr. Jan 2004

Mathematics & Statistics Faculty Publications

Background: Recent technological advances in mass spectrometry pose challenges in computational mathematics and statistics to process the mass spectral data into predictive models with clinical and biological significance. We discuss several classification-based approaches to finding protein biomarker candidates using protein profiles obtained via mass spectrometry, and we assess their statistical significance. Our overall goal is to implicate peaks that have a high likelihood of being biologically linked to a given disease state, and thus to narrow the search for biomarker candidates.

Results: Thorough cross-validation studies and randomization tests are performed on a prostate cancer dataset with over 300 patients, obtained …