Articles 1 - 30 of 32
Full-Text Articles in Computational Biology
Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone
Complex Biosystems PhD Program: Dissertations
Gene prediction has seen little algorithmic improvement since the first gene-prediction algorithms were developed thirty years ago. Rather than iteratively improving the underlying algorithms in gene prediction tools by adopting better-performing models, most current approaches update existing tools by incorporating increasing amounts of extrinsic data. Genes are traditionally predicted with Hidden Markov Models (HMMs). These HMMs are constrained by strict assumptions about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) …
Modeling Nonsegmented Negative-Strand RNA Virus (NNSV) Transcription With Ejective Polymerase Collisions And Biased Diffusion, Felipe-Andres Piedra
Research Symposium
Background: The textbook model of NNSV transcription predicts a gene expression gradient. However, multiple studies show non-gradient gene expression patterns or data inconsistent with a simple gradient. Regarding the latter, several studies show a dramatic decrease in gene expression over the last two genes of the respiratory syncytial virus (RSV) genome (a highly studied NNSV). The textbook model cannot explain these phenomena.
Methods: Computational models of RSV and vesicular stomatitis virus (VSV – another highly studied NNSV) transcription were written in the Python programming language using the Scientific Python Development Environment. The model code is freely available on GitHub: …
An Implementation Of The Method Of Moments On Chemical Systems With Constant And Time-Dependent Rates, Emmanuel O. Adara, Roger B. Sidje
Northeast Journal of Complex Systems (NEJCS)
Among numerical techniques used to facilitate the analysis of biochemical reactions, we can use the method of moments to directly approximate statistics such as the mean numbers of molecules. The method is computationally viable in time and memory, compared to solving the chemical master equation (CME) which is notoriously expensive. In this study, we apply the method of moments to a chemical system with a constant rate representing a vascular endothelial growth factor (VEGF) model, as well as another system with time-dependent propensities representing the susceptible, infected, and recovered (SIR) model with periodic contact rate. We assess the accuracy of …
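The idea behind the method of moments can be illustrated on a much simpler system than the ones studied in the paper. The sketch below assumes a hypothetical birth-death process (production at rate k(t), first-order decay at rate gamma), for which the equation for the mean, dm/dt = k(t) - gamma * m, is closed and can be integrated directly instead of solving the CME. The function and parameter names are illustrative only; this is not the VEGF or SIR model from the abstract.

```python
import math


def mean_trajectory(birth_rate, decay_rate, m0, t_end, steps=1000):
    """Integrate the first-moment equation dm/dt = birth_rate(t) - decay_rate * m.

    birth_rate is a function of time, so constant and time-dependent
    rates are handled the same way. Classical RK4 with a fixed step.
    Returns the mean molecule count at time t_end.
    """
    def rhs(t, m):
        return birth_rate(t) - decay_rate * m

    h = t_end / steps
    t, m = 0.0, float(m0)
    for _ in range(steps):
        k1 = rhs(t, m)
        k2 = rhs(t + h / 2, m + h * k1 / 2)
        k3 = rhs(t + h / 2, m + h * k2 / 2)
        k4 = rhs(t + h, m + h * k3)
        m += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return m


# Constant rate: compare with the closed form (k/gamma) * (1 - exp(-gamma * t)).
constant_mean = mean_trajectory(lambda t: 10.0, 0.5, 0.0, 4.0)

# Time-dependent (periodic) rate, chosen arbitrarily for illustration.
periodic_mean = mean_trajectory(
    lambda t: 10.0 * (1.0 + 0.5 * math.sin(2.0 * math.pi * t)), 0.5, 0.0, 4.0
)
```

For reactions with nonlinear propensities the moment equations do not close on their own, and some approximation (a closure) is needed; the sketch does not cover that case.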
Symmetry-Inspired Analysis Of Biological Networks, Ian Leifer
Dissertations, Theses, and Capstone Projects
The description of a complex system like gene regulation of a cell or a brain of an animal in terms of the dynamics of each individual element is an insurmountable task due to the complexity of interactions and the scores of associated parameters. Recent decades brought about descriptions of these systems that employ network models. In such models the entire system is represented by a graph encapsulating a set of independently functioning objects and their interactions. This creates a level of abstraction that makes the analysis of such large-scale systems possible. Common practice is to draw conclusions about …
Statistical Methods For Resolving Intratumor Heterogeneity With Single-Cell DNA Sequencing, Alexander Davis
Dissertations & Theses (Open Access)
Tumor cells have heterogeneous genotypes, which drives progression and treatment resistance. Such genetic intratumor heterogeneity plays a role in the process of clonal evolution that underlies tumor progression and treatment resistance. Single-cell DNA sequencing is a promising experimental method for studying intratumor heterogeneity, but brings unique statistical challenges in interpreting the resulting data. Researchers lack methods to determine whether sufficiently many cells have been sampled from a tumor. In addition, there are no proven computational methods for determining the ploidy of a cell, a necessary step in the determination of copy number. In this work, software for calculating probabilities from …
Network Structure And Dynamics Of Biological Systems, Deena R. Schmidt
Annual Symposium on Biomathematics and Ecology Education and Research
No abstract provided.
Topology And Dynamics Of Gene Regulatory Networks: A Meta-Analysis, Claus Kadelka
Biology and Medicine Through Mathematics Conference
No abstract provided.
Do Metabolic Networks Follow A Power Law? A PSAMM Analysis, Ryan Geib, Lubos Thoma, Ying Zhang
Senior Honors Projects
Inspired by the landmark paper “Emergence of Scaling in Random Networks” by Barabási and Albert, the field of network science has focused heavily on the power law distribution in recent years. This distribution has been used to model everything from the popularity of sites on the World Wide Web to the number of citations received on a scientific paper. The feature of this distribution is highlighted by the fact that many nodes (websites or papers) have few connections (internet links or citations) while few “hubs” are connected to many nodes. These properties lead to two very important observed effects: the …
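The "many nodes with few connections, few hubs with many" pattern described above can be reproduced with a small preferential-attachment simulation in the spirit of Barabási and Albert. The sketch below is a simplified illustration in plain Python; the function names and the choice of a complete seed graph are assumptions, and it does not use PSAMM or any metabolic network data.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, m_links, seed=0):
    """Grow a graph in which each new node links to m_links existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Seed graph: complete graph on m_links + 1 nodes.
    edges = [(i, j) for i in range(m_links + 1) for j in range(i)]
    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` is a degree-proportional draw over nodes.
    stubs = [v for edge in edges for v in edge]
    for new in range(m_links + 1, n_nodes):
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choice(stubs))
        for target in sorted(targets):
            edges.append((new, target))
            stubs.extend((new, target))
    return edges


def degree_histogram(edges):
    """Map each degree value to the number of nodes that have it."""
    degree = Counter(v for edge in edges for v in edge)
    return Counter(degree.values())


edges = preferential_attachment(n_nodes=200, m_links=2)
hist = degree_histogram(edges)
```

Plotting `hist` on log-log axes is the usual first check for a heavy tail, although a straight-looking line alone is weak evidence for a power law.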
Recurrent Neural Networks And Their Applications To RNA Secondary Structure Inference, Devin Willmott
Theses and Dissertations--Mathematics
Recurrent neural networks (RNNs) are state of the art sequential machine learning tools, but have difficulty learning sequences with long-range dependencies due to the exponential growth or decay of gradients backpropagated through the RNN. Some methods overcome this problem by modifying the standard RNN architecture to force the recurrent weight matrix W to remain orthogonal throughout training. The first half of this thesis presents a novel orthogonal RNN architecture that enforces orthogonality of W by parametrizing with a skew-symmetric matrix via the Cayley transform. We present rules for backpropagation through the Cayley transform, show how to deal with the Cayley …
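As a short worked check of why a skew-symmetric parametrization yields an orthogonal matrix, the derivation below uses one common sign convention for the Cayley transform; the thesis may use a different but equivalent form. Here A is a real skew-symmetric matrix, so I + A is invertible, and the last step uses the fact that I + A and I - A commute.

```latex
\begin{aligned}
A^{\top} &= -A, \qquad W = (I + A)^{-1}(I - A), \\
W^{\top} W &= (I - A)^{\top}\,(I + A)^{-\top}\,(I + A)^{-1}(I - A) \\
           &= (I + A)\,(I - A)^{-1}\,(I + A)^{-1}(I - A) \\
           &= (I + A)\,(I + A)^{-1}\,(I - A)^{-1}(I - A) = I .
\end{aligned}
```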
Linking Taxonomic Diversity And Trophic Function: A Graph-Based Theoretical Approach, Marcella M. Jurotich, Kaitlyn Dougherty, Barbara Hayford, Sally Clark
Transactions of the Nebraska Academy of Sciences and Affiliated Societies
The purpose of this study is to develop a novel, visual method for analyzing complex functional trait data in freshwater ecology. We focus on macroinvertebrates in stream ecosystems under a gradient of habitat degradation and employ a combination of taxonomic and functional trait diversity analyses. Then we use graph theory to link changes in functional trait diversity to taxonomic richness and habitat degradation. We test the hypotheses that: 1) taxonomic diversity and trophic functional trait diversity both decrease with increased habitat degradation; 2) loss of taxa leads to a decrease in trophic function as visualized using a bipartite graph; and …
Network Analytics For The miRNA Regulome And miRNA-Disease Interactions, Joseph Jayakar Nalluri
Theses and Dissertations
miRNAs are non-coding RNAs of approx. 22 nucleotides in length that inhibit gene expression at the post-transcriptional level. By virtue of this gene regulation mechanism, miRNAs play a critical role in several biological processes and patho-physiological conditions, including cancers. miRNA behavior is a result of a multi-level complex interaction network involving miRNA-mRNA, TF-miRNA-gene, and miRNA-chemical interactions; hence the precise patterns through which a miRNA regulates a given disease are still elusive. Herein, I have developed an integrative genomics method/pipeline to (i) build a miRNA regulomics and data analytics repository, (ii) create/model these interactions into networks and use optimization techniques, motif …
Dynamics Of Gene Networks In Cancer Research, Paul Scott
Electronic Theses and Dissertations
Cancer prevention treatments are being researched to see if an optimized treatment schedule would decrease the likelihood of a person being diagnosed with cancer. To do this we are looking at genes involved in the cell cycle and how they interact with one another. Through each gene expression during the life of a normal cell we get an understanding of the gene interactions and test these against those of a cancerous cell. First we construct a simplified network model of the normal gene network. Once we have this model we translate it into a transition matrix and force changes on …
hpcNMF: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang
COBRA Preprint Series
Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for …
Evolution Of Mobile Promoters In Prokaryotic Genomes, Mahnaz Rabbani
Electronic Thesis and Dissertation Repository
Mobile genetic elements are important factors in evolution, and greatly influence the structure of genomes, facilitating the development of new adaptive characteristics. The dynamics of these mobile elements can be described using various mathematical and statistical models. In this thesis, we focus on a specific category of mobile genetic elements, i.e. mobile promoters, which are mobile regions of DNA that initiate the transcription of genes. We present a class of mathematical models for the evolution of mobile promoters in prokaryotic genomes, based on data obtained from available sequenced genomes. Our novel location-based model incorporates two biologically meaningful regions of the …
Bioinformatic Game Theory And Its Application To Cluster Multi-Domain Proteins, Brittney Keel
Department of Mathematics: Dissertations, Theses, and Student Research
The exact evolutionary history of any set of biological sequences is unknown, and all phylogenetic reconstructions are approximations. The problem becomes harder when one must consider a mix of vertical and lateral phylogenetic signals. In this dissertation we propose a game-theoretic approach to clustering biological sequences and analyzing their evolutionary histories. In this context we use the term evolution as a broad descriptor for the entire set of mechanisms driving the inherited characteristics of a population. The key assumption in our development is that evolution tries to accommodate the competing forces of selection, of which the conservation force seeks to …
Modeling Neurovascular Coupling From Clustered Parameter Sets For Multimodal EEG-NIRS, M. Tanveer Talukdar, H. Robert Frost, Solomon G. G. Diamond
Dartmouth Scholarship
Despite significant improvements in neuroimaging technologies and analysis methods, the fundamental relationship between local changes in cerebral hemodynamics and the underlying neural activity remains largely unknown. In this study, a data driven approach is proposed for modeling this neurovascular coupling relationship from simultaneously acquired electroencephalographic (EEG) and near-infrared spectroscopic (NIRS) data. The approach uses gamma transfer functions to map EEG spectral envelopes that reflect time-varying power variations in neural rhythms to hemodynamics measured with NIRS during median nerve stimulation. The approach is evaluated first with simulated EEG-NIRS data and then by applying the method to experimental EEG-NIRS data measured from …
Epistasis In Predator-Prey Relationships, Iuliia Inozemtseva
Electronic Theses and Dissertations
Epistasis is the interaction between two or more genes to control a single phenotype. We model epistasis of the prey in a two-locus two-allele problem in a basic predator-prey relationship. The resulting model allows us to examine both population sizes as well as genotypic and phenotypic frequencies. In the context of several numerical examples, we show that if epistasis results in an undesirable or desirable phenotype in the prey by making the particular genotype more or less susceptible to the predator or dangerous to the predator, elimination of undesirable phenotypes and then genotypes occurs.
On The Global Stability Of A Generalized Cholera Epidemiological Model, Yuanji Cheng, Jin Wang, Xiuxiang Yang
Mathematics & Statistics Faculty Publications
In this paper, we conduct a careful global stability analysis for a generalized cholera epidemiological model originally proposed in [J. Wang and S. Liao, A generalized cholera model and epidemic/endemic analysis, J. Biol. Dyn. 6 (2012), pp. 568-589]. Cholera is a water- and food-borne infectious disease whose dynamics are complicated by the multiple interactions between the human host, the pathogen, and the environment. Using the geometric approach, we rigorously prove the endemic global stability for the cholera model in three-dimensional (when the pathogen component is a scalar) and four-dimensional (when the pathogen component is a vector) systems. This work unifies the …
Stability Analysis And Application Of A Mathematical Cholera Model, Shu Liao, Jin Wang
Mathematics & Statistics Faculty Publications
In this paper, we conduct a dynamical analysis of the deterministic cholera model proposed in [9]. We study the stability of both the disease-free and endemic equilibria so as to explore the complex epidemic and endemic dynamics of the disease. We demonstrate a real-world application of this model by investigating the recent cholera outbreak in Zimbabwe. Meanwhile, we present numerical simulation results to verify the analytical predictions.
Preliminary Analysis Of An Agent-Based Model For A Tick-Borne Disease, Holly Gaff
Biological Sciences Faculty Publications
Ticks have a unique life history including a distinct set of life stages and a single blood meal per life stage. This makes tick-host interactions more complex from a mathematical perspective. In addition, any model of these interactions must involve a significant degree of stochasticity on the individual tick level. In an attempt to quantify these relationships, I have developed an individual-based model of the interactions between ticks and their hosts as well as the transmission of tick-borne disease between the two populations. The results from this model are compared with those from previously published differential equation based population models. …
Computational Biology, Harvey Greenberg, Allen Holder
Mathematical Sciences Technical Reports (MSTR)
Computational biology is an interdisciplinary field that applies the techniques of computer science, applied mathematics, and statistics to address biological questions. Operations research (OR) is also interdisciplinary and applies the same mathematical and computational sciences, but to decision-making problems. Both focus on developing mathematical models and designing algorithms to solve them. Models in computational biology vary in their biological domain and can range from the interactions of genes and proteins to the relationships among organisms and species.
G-Lattices For An Unrooted Perfect Phylogeny, Monica Grigg
Mathematical Sciences Technical Reports (MSTR)
We look at the Pure Parsimony problem and the Perfect Phylogeny Haplotyping problem. From the Pure Parsimony problem we consider structures of genotypes called g-lattices. These structures either provide solutions or give bounds to the pure parsimony problem. In particular, we investigate which of these structures supports an unrooted perfect phylogeny, a condition that adds biological interpretation. By understanding which g-lattices support an unrooted perfect phylogeny, we connect two of the standard biological inference rules used to recreate how genetic diversity propagates across generations.
Model-Based Clustering Of Methylation Array Data: A Recursive-Partitioning Algorithm For High-Dimensional Data Arising As A Mixture Of Beta Distributions, E. Andres Houseman, Brock C. Christensen, Ru-Fang Yeh, Carmen J. Marsit, Margaret R. Karagas, Margaret Wrensch, Heather H. Nelson, Joseph Wiemels, Shichun Zheng, John K. Wiencke, Karl T. Kelsey
Harvard University Biostatistics Working Paper Series
No abstract provided.
Survival Analysis With Large Dimensional Covariates: An Application In Microarray Studies, David A. Engler, Yi Li
Harvard University Biostatistics Working Paper Series
Use of microarray technology often leads to high-dimensional and low-sample-size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time …
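For orientation, a generic elastic net estimator under the Cox model can be written as a penalized partial log-likelihood. The notation below is an assumption made for illustration (x_i are covariates, delta_i the event indicator, R(t_i) the risk set at the i-th observed time, no tied event times), and the paper's exact parameterization of the two penalties may differ.

```latex
\hat{\beta} = \arg\max_{\beta}\;\Bigl\{\, \ell(\beta)
  - \lambda_1 \sum_{j=1}^{p} |\beta_j|
  - \lambda_2 \sum_{j=1}^{p} \beta_j^{2} \Bigr\},
\qquad
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Bigl[\, x_i^{\top}\beta - \log \sum_{k \in R(t_i)} \exp\bigl(x_k^{\top}\beta\bigr) \Bigr]
```

The L1 term drives some coefficients exactly to zero, which performs the variable selection, while the squared L2 term stabilizes the fit when covariates are highly correlated, as gene expression measurements often are.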
Cluster Analysis Of Genomic Data With Applications In R, Katherine S. Pollard, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in R. We discuss statistical issues and methods in choosing the number of clusters, the choice of clustering algorithm, and the choice of dissimilarity matrix. In particular, we illustrate how the bootstrap can be employed as a statistical method in cluster analysis to establish the reproducibility of the clusters and the overall variability of the followed procedure. We also show how to visualize a clustering result by plotting ordered dissimilarity matrices in R. We present a new R package, hopach, which implements the hybrid clustering method, …
Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
One of the benefits of profiling of cancer samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Such subgroups have typically been found in microarray data using hierarchical clustering. A major problem in interpretation of the output is determining the number of clusters. We approach the problem of determining disease subtypes using mixture models. A novel estimation procedure of the parameters in the mixture model is developed based on a combination of random projections and the expectation-maximization algorithm. Because the approach is probabilistic, our approach provides a measure for the number of true clusters …
Differential Expression With The Bioconductor Project, Anja Von Heydebreck, Wolfgang Huber, Robert Gentleman
Bioconductor Project Working Papers
A basic, yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. We discuss different approaches to this task and illustrate how they can be applied using software from the Bioconductor Project. A central problem is the high dimensionality of gene expression space, which prohibits a comprehensive statistical analysis without focusing on particular aspects of the joint distribution of the genes' expression levels. Possible strategies are to do univariate gene-by-gene analysis, and to perform data-driven nonspecific filtering of genes before the actual statistical analysis. …
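The two strategies named at the end of the abstract can be sketched in a few lines. The example below is a plain-Python illustration on made-up numbers, not Bioconductor code (Bioconductor is an R project): the filter ranks genes by overall variance without looking at the sample labels, and the gene-by-gene step computes a Welch t-statistic per remaining gene. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean, variance


def nonspecific_filter(expr, keep):
    """Keep the `keep` genes with the largest overall variance.

    The sample labels are not used, which is what makes the filter nonspecific.
    """
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return {gene: expr[gene] for gene in ranked[:keep]}


def welch_t(x, y):
    """Welch two-sample t-statistic (unequal variances).

    Assumes at least two values per group and nonzero pooled standard error.
    """
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def gene_by_gene(expr, is_case):
    """Compute one t-statistic per gene, treating each gene independently."""
    stats = {}
    for gene, values in expr.items():
        case = [v for v, flag in zip(values, is_case) if flag]
        ctrl = [v for v, flag in zip(values, is_case) if not flag]
        stats[gene] = welch_t(case, ctrl)
    return stats


# Toy expression values: three genes, three controls followed by three cases.
expr = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "g2": [2.0, 2.0, 2.1, 2.0, 2.1, 2.0],
    "g3": [0.5, 1.5, 1.0, 1.2, 0.4, 1.4],
}
is_case = [False, False, False, True, True, True]

filtered = nonspecific_filter(expr, keep=2)
stats = gene_by_gene(filtered, is_case)
```

With thousands of genes, the per-gene statistics would still need a multiple-testing adjustment before any gene is declared differentially expressed; the sketch stops before that step.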
Statistical Analyses And Reproducible Research, Robert Gentleman, Duncan Temple Lang
Bioconductor Project Working Papers
For various reasons, it is important, if not essential, to integrate the computations and code used in data analyses, methodological descriptions, simulations, etc. with the documents that describe and rely on them. This integration allows readers to both verify and adapt the statements in the documents. Authors can easily reproduce them in the future, and they can present the document's contents in a different medium, e.g. with interactive controls. This paper describes a software framework for authoring and distributing these integrated, dynamic documents that contain text, code, data, and any auxiliary content needed to recreate the computations. The documents are …
Bioconductor: Open Software Development For Computational Biology And Bioinformatics, Robert C. Gentleman, Vincent J. Carey, Douglas J. Bates, Benjamin M. Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Guenther Sawitzki, Colin Smith, Gordon K. Smyth, Luke Tierney, Yee Hwa Yang, Jianhua Zhang
Bioconductor Project Working Papers
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. We detail some of the design decisions, software paradigms and operational strategies that have allowed a small number of researchers to provide a wide variety of innovative, extensible software solutions in a relatively short time. The use of an object oriented programming paradigm, the adoption and development of a software package system, designing by contract, distributed development and collaboration with other projects are elements of this project's success. Individually, each of these concepts is useful and important but when combined they have …
Computational Protein Biomarker Prediction: A Case Study For Prostate Cancer, Michael Wagner, Dayanand N. Naik, Alex Pothen, Srinivas Kasukurti, Raghu Ram Devineni, Bao-Ling Adam, O. John Semmes, George L. Wright Jr.
Mathematics & Statistics Faculty Publications
Background: Recent technological advances in mass spectrometry pose challenges in computational mathematics and statistics to process the mass spectral data into predictive models with clinical and biological significance. We discuss several classification-based approaches to finding protein biomarker candidates using protein profiles obtained via mass spectrometry, and we assess their statistical significance. Our overall goal is to implicate peaks that have a high likelihood of being biologically linked to a given disease state, and thus to narrow the search for biomarker candidates.
Results: Thorough cross-validation studies and randomization tests are performed on a prostate cancer dataset with over 300 patients, obtained …