Open Access. Powered by Scholars. Published by Universities.®

Molecular Biology Commons


Articles 1 - 8 of 8

Full-Text Articles in Molecular Biology

Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do Jan 2012

Jeffrey S. Morris

Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current data integration approaches are limited in that they do not consider the fundamental biological relationships that exist among the data from different platforms.

Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …


Alternative Probeset Definitions For Combining Microarray Data Across Studies Using Different Versions Of Affymetrix Oligonucleotide Arrays, Jeffrey S. Morris, Chunlei Wu, Kevin R. Coombes, Keith A. Baggerly, Jing Wang, Li Zhang Dec 2006

Jeffrey S. Morris

Many published microarray studies have small to moderate sample sizes, and thus have low statistical power to detect significant relationships between gene expression levels and outcomes of interest. By pooling data across multiple studies, however, we can gain power, enabling us to detect new relationships. This type of pooling is complicated by the fact that gene expression measurements from different microarray platforms are not directly comparable. In this chapter, we discuss two methods for combining information across different versions of Affymetrix oligonucleotide arrays. Each involves a new approach for combining probes on the array into probesets. The first approach involves …


Some Statistical Issues In Microarray Gene Expression Data, Matthew S. Mayo, Byron J. Gajewski, Jeffrey S. Morris Jun 2006

Jeffrey S. Morris

In this paper we discuss some of the statistical issues that should be considered when conducting experiments involving microarray gene expression data. We discuss statistical issues related to preprocessing the data as well as the analysis of the data. Analysis of the data is discussed in three contexts: class comparison, class prediction and class discovery. We also review the methods used in two studies that are using microarray gene expression to assess the effect of exposure to radiofrequency (RF) fields on gene expression. Our intent is to provide a guide for radiation researchers when conducting studies involving microarray gene expression …


Shrinkage Estimation For SAGE Data Using A Mixture Dirichlet Prior, Jeffrey S. Morris, Keith A. Baggerly, Kevin R. Coombes Mar 2006

Jeffrey S. Morris

Serial Analysis of Gene Expression (SAGE) is a technique for estimating the gene expression profile of a biological sample. Any efficient inference in SAGE must be based upon efficient estimates of these gene expression profiles, which consist of the estimated relative abundances for each mRNA species present in the sample. The data from SAGE experiments are counts for each observed mRNA species, and can be modeled using a multinomial distribution with two characteristics: skewness in the distribution of relative abundances and small sample size relative to the dimension. As a result of these characteristics, a given SAGE sample will fail …


An Introduction To High-Throughput Bioinformatics Data, Keith A. Baggerly, Kevin R. Coombes, Jeffrey S. Morris Mar 2006

Jeffrey S. Morris

High-throughput biological assays supply thousands of measurements per sample, and the sheer amount of related data increases the need for better models to enhance inference. Such models, however, are more effective if they take into account the idiosyncrasies associated with the specific methods of measurement: where the numbers come from. We illustrate this point by describing three different measurement platforms: microarrays, serial analysis of gene expression (SAGE), and proteomic mass spectrometry.


Bayesian Mixture Models For Gene Expression And Protein Profiles, Michele Guindani, Kim-Anh Do, Peter Mueller, Jeffrey S. Morris Mar 2006

Jeffrey S. Morris

We review the use of semi-parametric mixture models for Bayesian inference in high throughput genomic data. We discuss three specific approaches for microarray data, for protein mass spectrometry experiments, and for SAGE data. For the microarray data and the protein mass spectrometry we assume group comparison experiments, i.e., experiments that seek to identify genes and proteins that are differentially expressed across two biologic conditions of interest. For the SAGE data example we consider inference for a single biologic sample.


Pooling Information Across Different Studies And Oligonucleotide Microarray Chip Types To Identify Prognostic Genes For Lung Cancer, Jeffrey S. Morris, Guosheng Yin, Keith A. Baggerly, Chunlei Wu, Li Zhang Dec 2005

Jeffrey S. Morris

Our goal in this work is to pool information across microarray studies conducted at different institutions using two different versions of Affymetrix chips to identify genes whose expression levels offer information on lung cancer patients’ survival above and beyond the information provided by readily available clinical covariates. We combine information across chip types by identifying “matching probes” present on both chips, and then assembling them into new probesets based on Unigene clusters. This method yields comparable expression level quantifications across chips without sacrificing much precision or significantly altering the relative ordering of the samples. We fit a series of multivariable …
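The matching-probe idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data layout (a dictionary from probe sequence to UniGene cluster ID), the function name `matching_probesets`, the toy sequences and cluster IDs, and the rule of keeping only probes annotated to the same cluster on both chips are all assumptions made for illustration.

```python
from collections import defaultdict


def matching_probesets(chip_a, chip_b):
    """Group probes shared by two chips into probesets keyed by cluster ID.

    chip_a, chip_b: dicts mapping probe sequence -> UniGene cluster ID
    (a hypothetical layout chosen for this sketch).
    """
    # "Matching probes" are taken here to be sequences present on both chips.
    shared = chip_a.keys() & chip_b.keys()
    probesets = defaultdict(list)
    for seq in sorted(shared):
        # Assumption: keep a probe only if both chips agree on its cluster.
        if chip_a[seq] == chip_b[seq]:
            probesets[chip_a[seq]].append(seq)
    return dict(probesets)


# Toy, made-up annotations for two chip versions.
chip_a = {"AAAC": "Hs.1", "CCGT": "Hs.1", "GGTA": "Hs.2", "TTAG": "Hs.3"}
chip_b = {"AAAC": "Hs.1", "CCGT": "Hs.1", "GGTA": "Hs.9", "CATG": "Hs.3"}

probesets = matching_probesets(chip_a, chip_b)
print(probesets)
```

Because each new probeset contains only probes found on both chips, the same probes feed the expression summary on either chip version, which is what makes the resulting quantifications comparable in this sketch.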


Bayesian Shrinkage Estimation Of The Relative Abundance Of mRNA Transcripts Using SAGE, Jeffrey S. Morris, Keith A. Baggerly, Kevin R. Coombes Mar 2003

Jeffrey S. Morris

Serial analysis of gene expression (SAGE) is a technology for quantifying gene expression in biological tissue that yields count data that can be modeled by a multinomial distribution with two characteristics: skewness in the relative frequencies and small sample size relative to the dimension. As a result of these characteristics, a given SAGE sample may fail to capture a large number of expressed mRNA species present in the tissue. Empirical estimators of mRNA species’ relative abundance effectively ignore these missing species, and as a result tend to overestimate the abundance of the scarce observed species comprising a vast majority of …
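The contrast between empirical and shrinkage estimates of relative abundance can be sketched with a toy multinomial example. This is a minimal illustration under stated assumptions, not the estimator developed in the paper: it uses a plain symmetric Dirichlet prior, and the number of species (1000), the counts, and the prior weight `alpha` are all made up.

```python
import numpy as np


def empirical_abundance(counts):
    """Observed proportions; unobserved species get exactly zero."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()


def dirichlet_posterior_mean(counts, alpha):
    """Posterior mean under a symmetric Dirichlet(alpha) prior (an assumption)."""
    counts = np.asarray(counts, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha * counts.size)


# Hypothetical sample: 1000 possible mRNA species, only 5 observed,
# so the sample size (83 tags) is small relative to the dimension.
counts = np.zeros(1000)
counts[:5] = [50, 30, 1, 1, 1]

emp = empirical_abundance(counts)
shrunk = dirichlet_posterior_mean(counts, alpha=0.01)

print("scarce observed species:", emp[2], "->", shrunk[2])
print("unobserved species:     ", emp[5], "->", shrunk[5])
```

In this toy setting the shrinkage estimate moves some probability mass from the observed species to the unobserved ones, so a species seen once receives a smaller estimate than its empirical proportion, while species never seen receive a small positive estimate instead of zero.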