Articles 1 - 19 of 19
Full-Text Articles in Microarrays
Cross-Study Validation And Combined Analysis Of Gene Expression Microarray Data, Elizabeth Garrett-Mayer, Giovanni Parmigiani, Xiaogang Zhong, Leslie Cope, Edward Gabrielson
Johns Hopkins University, Dept. of Biostatistics Working Papers
Investigations of transcript levels on a genomic scale using hybridization-based arrays have led to formidable advances in our understanding of the biology of many human illnesses. At the same time, these investigations have generated controversy because of the probabilistic nature of the conclusions and the surfacing of noticeable discrepancies between the results of studies addressing the same biological question. In this article we present simple and effective data analysis and visualization tools for gauging the degree to which the findings of one study are reproduced by others, and for integrating multiple studies in a single analysis. We describe these approaches in …
Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
One of the benefits of profiling of cancer samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Such subgroups have typically been found in microarray data using hierarchical clustering. A major problem in interpretation of the output is determining the number of clusters. We approach the problem of determining disease subtypes using mixture models. A novel estimation procedure of the parameters in the mixture model is developed based on a combination of random projections and the expectation-maximization algorithm. Because the approach is probabilistic, our approach provides a measure for the number of true clusters …
Program Of Gene Transcription For A Single Differentiating Cell Type During Sporulation In Bacillus Subtilis, Patrick Eichenberger, Masaya Fujita, Shane T. Jensen, Erin M. Conlon, David Z. Rudner, Stephanie T. Wang, Caitlin Ferguson, Koki Haga, Tsutomu Sato, Jun S. Liu, Richard Losick
Erin M. Conlon
Asymmetric division during sporulation by Bacillus subtilis generates a mother cell that undergoes a 5-h program of differentiation. The program is governed by a hierarchical cascade consisting of the transcription factors: σE, σK, GerE, GerR, and SpoIIID. The program consists of the activation and repression of 383 genes. The σE factor turns on 262 genes, including those for GerR and SpoIIID. These DNA-binding proteins downregulate almost half of the genes in the σE regulon. In addition, SpoIIID turns on ten genes, including genes involved in the appearance of σK. Next, σK activates 75 additional genes, including that for GerE. This …
The Program Of Gene Transcription For A Single Differentiating Cell Type During Sporulation In Bacillus Subtilis, Patrick Eichenberger, Masaya Fujita, Shane T. Jensen, Erin M. Conlon, David Z. Rudner, Stephanie T. Wang, Caitlin Ferguson, Koki Haga, Tsutomu Sato, Jun S. Liu, Richard Losick
Erin M. Conlon
Asymmetric division during sporulation by Bacillus subtilis generates a mother cell that undergoes a 5-h program of differentiation. The program is governed by a hierarchical cascade consisting of the transcription factors: σE, σK, GerE, GerR, and SpoIIID. The program consists of the activation and repression of 383 genes. The σE factor turns on 262 genes, including those for GerR and SpoIIID. These DNA-binding proteins downregulate almost half of the genes in the σE regulon. In addition, SpoIIID turns on ten genes, including genes involved in the appearance of σK. Next, σK activates 75 additional genes, including that for GerE. This …
Significance Analysis Of Time Course Microarray Experiments, John D. Storey, Wenzhong Xiao, Jeffrey T. Leek, Ronald G. Tompkins, Ron W. Davis
UW Biostatistics Working Paper Series
Characterizing the genome-wide dynamic regulation of gene expression is important and will be of much interest in the future. However, there is currently no established method for identifying differentially expressed genes in a time course study. Here we propose a significance method for analyzing time course microarray studies that can be applied to the typical types of comparisons and sampling schemes. This method is applied to two studies on humans. In one study, genes are identified that show differential expression over time in response to in vivo endotoxin administration. Using our method 7409 genes are called significant at a 1% …
The Optimal Confidence Region For A Random Parameter, Hajime Uno, Lu Tian, L.J. Wei
Harvard University Biostatistics Working Paper Series
Under a two-level hierarchical model, suppose that the distribution of the random parameter is known or can be estimated well. Data are generated via a fixed, but unobservable realization of this parameter. In this paper, we derive the smallest confidence region of the random parameter under a joint Bayesian/frequentist paradigm. On average this optimal region can be much smaller than the corresponding Bayesian highest posterior density region. The new estimation procedure is appealing when one deals with data generated under a highly parallel structure, for example, data from a trial with a large number of clinical centers involved or genome-wide …
Differential Expression With The Bioconductor Project, Anja Von Heydebreck, Wolfgang Huber, Robert Gentleman
Bioconductor Project Working Papers
A basic, yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. We discuss different approaches to this task and illustrate how they can be applied using software from the Bioconductor Project. A central problem is the high dimensionality of gene expression space, which prohibits a comprehensive statistical analysis without focusing on particular aspects of the joint distribution of the genes' expression levels. Possible strategies are to do univariate gene-by-gene analysis, and to perform data-driven nonspecific filtering of genes before the actual statistical analysis. …
Nonparametric Methods For Analyzing Replication Origins In Genomewide Data, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
Due to the advent of high-throughput genomic technology, it has become possible to globally monitor cellular activities on a genomewide basis. With these new methods, scientists can begin to address important biological questions. One such question involves the identification of replication origins, which are regions in chromosomes where DNA replication is initiated. In addition, one hypothesis regarding replication origins is that their locations are non-random throughout the genome. In this article, we develop methods for identification of and cluster inference regarding replication origins involving genomewide expression data. We compare several nonparametric regression methods for the identification of replication origin locations. …
Semiparametric Methods For Identification Of Tumor Progression Genes From Microarray Data, Debashis Ghosh, Arul Chinnaiyan
The University of Michigan Department of Biostatistics Working Paper Series
The use of microarray data has become quite commonplace in medical and scientific experiments. We focus here on microarray data generated from cancer studies. It is potentially important for the discovery of biomarkers to identify genes whose expression levels correlate with tumor progression. In this article, we develop statistical procedures for the identification of such genes, which we term tumor progression genes. Two methods are considered in this paper. The first is use of a proportional odds procedure, combined with false discovery rate estimation techniques to adjust for the multiple testing problem. The second method is based on order-restricted estimation …
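The abstract above mentions false discovery rate techniques for the multiple testing problem but does not say which one is used. As a minimal sketch only, the following shows the Benjamini-Hochberg step-up procedure, chosen here as a familiar example rather than as the authors' method. The function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the BH step-up rule.

    Illustrative sketch: assumes p-values from independent (or positively
    dependent) tests; not taken from the paper listed above.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= (k / m) * alpha.
        if p_values[idx] <= rank / m * alpha:
            largest_passing_rank = rank
    # Reject every hypothesis ranked at or below that largest passing rank.
    return sorted(order[:largest_passing_rank])


# Hypothetical per-gene p-values, for illustration only.
example_p = [0.02, 0.03, 0.035, 0.9]
print(benjamini_hochberg(example_p, alpha=0.05))
```

Note that the first two p-values fail their own thresholds yet are still rejected, because a larger p-value further down the ranking passes; this is what distinguishes a step-up rule from simple per-rank thresholding.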
A Model Based Background Adjustment For Oligonucleotide Expression Arrays, Zhijin Wu, Rafael A. Irizarry, Robert Gentleman, Francisco Martinez Murillo, Forrest Spencer
Johns Hopkins University, Dept. of Biostatistics Working Papers
High density oligonucleotide expression arrays are widely used in many areas of biomedical research. Affymetrix GeneChip arrays are the most popular. In the Affymetrix system, a fair amount of further pre-processing and data reduction occurs following the image processing step. Statistical procedures developed by academic groups have been successful at improving the default algorithms provided by the Affymetrix system. In this paper we present a solution to one of the pre-processing steps, background adjustment, based on a formal statistical framework. Our solution greatly improves the performance of the technology in various practical applications.
Affymetrix GeneChip arrays use short oligonucleotides to …
Classification Using Generalized Partial Least Squares, Beiying Ding, Robert Gentleman
Bioconductor Project Working Papers
The advances in computational biology have made simultaneous monitoring of thousands of features possible. The high throughput technologies not only bring about a much richer information context in which to study various aspects of gene functions but they also present the challenge of analyzing data with a large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, in …
Covariate Adjustment In The Analysis Of Microarray Data From Clinical Studies, Debashis Ghosh, Arul Chinnaiyan
The University of Michigan Department of Biostatistics Working Paper Series
There is tremendous scientific interest in the analysis of gene expression data in clinical settings, such as oncology. In this paper, we describe the importance of adjusting for confounders and other prognostic factors in order to select for differentially expressed genes for followup validation studies. We develop two approaches to the analysis of microarray data in nonrandomized clinical settings. The first is an extension of the current significance analysis of microarray procedures, where other covariates are taken into account. The second is a novel covariate-adjusted regression modelling based on the receiver operating characteristic curve for the analysis of gene expression …
Regulatory Motif Finding By Logic Regression, Sunduz Keles, Mark J. Van Der Laan, Chris Vulpe
U.C. Berkeley Division of Biostatistics Working Paper Series
Multiple transcription factors coordinately control transcriptional regulation of genes in eukaryotes. Although multiple computational methods consider the identification of individual transcription factor binding sites (TFBSs), very few focus on the interactions between these sites. We consider finding transcription factor binding sites and their context specific interactions using microarray gene expression data. We devise a hybrid approach called LogicMotif composed of a TFBS identification method combined with the new regression methodology logic regression of Ruczinski et al. (2003). LogicMotif has two steps: First potential binding sites are identified from transcription control regions of genes of interest. Various available methods can be …
A Statistical Method For Constructing Transcriptional Regulatory Networks Using Gene Expression And Sequence Data, Biao Xing, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Transcriptional regulation is one of the most important means of gene regulation. Uncovering the transcriptional regulatory network helps us to understand complex cellular processes. In this paper, we describe a comprehensive statistical approach for constructing the transcriptional regulatory network using data of gene expression, promoter sequence, and transcription factor binding sites. Our simulation studies show that the overall and false positive error rates in the estimated transcriptional regulatory network are expected to be small if the systematic noise in the constructed feature matrix is small. Our analysis based on 658 microarray experiments on yeast gene expression programs and 46 transcription …
Error Models For Microarray Intensities, Wolfgang Huber, Anja Von Heydebreck, Martin Vingron
Bioconductor Project Working Papers
We derive the additive-multiplicative error model for microarray intensities, and describe two applications. For the detection of differentially expressed genes, we obtain a statistic whose variance is approximately independent of the mean intensity. For the post hoc calibration (normalization) of data with respect to experimental factors, we describe a method for parameter estimation.
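The abstract above does not give the model formula. As a hedged sketch, suppose an additive-multiplicative model implies that the variance of a measured intensity is approximately c²(mean − a)² + s², where the offset `a`, the multiplicative noise scale `c`, and the additive noise scale `s` are all assumed names. Under that assumption, an arsinh transform has derivative 1 / SD, so by the delta method the transformed values have a standard deviation of roughly 1 at every mean intensity. The code below checks this numerically; the parameter values are hypothetical.

```python
import math


def model_sd(mean, a, c, s):
    """SD implied by the assumed variance function c^2 (mean - a)^2 + s^2."""
    return math.sqrt((c * (mean - a)) ** 2 + s ** 2)


def stabilize(y, a, c, s):
    """Arsinh transform h(y); its derivative equals 1 / model_sd(y)."""
    return math.asinh(c * (y - a) / s) / c


def transformed_sd(mean, a, c, s, step=1e-2):
    """Delta-method SD after transformation: |h'(mean)| * model_sd(mean)."""
    # Central finite difference approximates the derivative of the transform.
    slope = (stabilize(mean + step, a, c, s) - stabilize(mean - step, a, c, s)) / (2 * step)
    return abs(slope) * model_sd(mean, a, c, s)


# Hypothetical parameter values, chosen only for illustration.
A, C, S = 50.0, 0.2, 20.0
for mean in (60.0, 500.0, 20000.0):
    raw = model_sd(mean, A, C, S)
    stabilized = transformed_sd(mean, A, C, S)
    print(f"mean={mean:>8.1f}  raw SD={raw:>8.1f}  transformed SD={stabilized:.4f}")
```

On the raw scale the standard deviation grows with the mean, while on the transformed scale it stays near 1, which is the sense in which a statistic can have variance approximately independent of the mean intensity.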
Mixture Models For Assessing Differential Expression In Complex Tissues Using Microarray Data, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
The use of DNA microarrays has become quite popular in many scientific and medical disciplines, such as in cancer research. One common goal of these studies is to determine which genes are differentially expressed between cancer and healthy tissue, or more generally, between two experimental conditions. A major complication in the molecular profiling of tumors using gene expression data is that the data represent a combination of tumor and normal cells. Much of the methodology developed for assessing differential expression with microarray data has assumed that tissue samples are homogeneous. In this article, we outline a general framework for determining …
Optimal Sample Size For Multiple Testing: The Case Of Gene Expression Microarrays, Peter Muller, Giovanni Parmigiani, Christian Robert, Judith Rousseau
Johns Hopkins University, Dept. of Biostatistics Working Papers
We consider the choice of an optimal sample size for multiple comparison problems. The motivating application is the choice of the number of microarray experiments to be carried out when learning about differential gene expression. However, the approach is valid in any application that involves multiple comparisons in a large number of hypothesis tests. We discuss two decision problems in the context of this setup: the sample size selection and the decision about the multiple comparisons. We adopt a decision theoretic approach, using loss functions that combine the competing goals of discovering as many differentially expressed genes as possible, while keeping …
Calibrating Observed Differential Gene Expression For The Multiplicity Of Genes On The Array, Yingye Zheng, Margaret S. Pepe
UW Biostatistics Working Paper Series
In a gene expression array study, the expression levels of thousands of genes are monitored simultaneously across various biological conditions on a small set of subjects. One goal of such studies is to explore a large pool of genes in order to select a subset of genes that appear to be differently expressed for further investigation. Of particular interest here is how to select the top k genes once genes are ranked based on their evidence for differential expression in two tissue types. We consider statistical methods that provide a more rigorous and intuitively appealing selection process for k. We …
Evaluation Of Multiple Models To Distinguish Closely Related Forms Of Disease Using DNA Microarray Data: An Application To Multiple Myeloma, Johanna S. Hardin, Michael Waddell, C. David Page, Fenghuang Zhan, Bart Barlogie, John Shaughnessy, John J. Crowley
Pomona Faculty Publications and Research
Motivation: Standard laboratory classification of the plasma cell dyscrasia monoclonal gammopathy of undetermined significance (MGUS) and the overt plasma cell neoplasm multiple myeloma (MM) is quite accurate, yet, for the most part, biologically uninformative. Most, if not all, cancers are caused by inherited or acquired genetic mutations that manifest themselves in altered gene expression patterns in the clonally related cancer cells. Microarray technology allows for qualitative and quantitative measurements of the expression levels of thousands of genes simultaneously, and it has now been used both to classify cancers that are morphologically indistinguishable and to predict response to therapy. It is …