Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Differential Expression With The Bioconductor Project, Anja Von Heydebreck, Wolfgang Huber, Robert Gentleman Jun 2004

Differential Expression With The Bioconductor Project, Anja Von Heydebreck, Wolfgang Huber, Robert Gentleman

Bioconductor Project Working Papers

A basic, yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. We discuss different approaches to this task and illustrate how they can be applied using software from the Bioconductor Project. A central problem is the high dimensionality of gene expression space, which prohibits a comprehensive statistical analysis without focusing on particular aspects of the joint distribution of the genes expression levels. Possible strategies are to do univariate gene-by-gene analysis, and to perform data-driven nonspecific filtering of genes before the actual statistical analysis. …


A Graph Theoretic Approach To Testing Associations Between Disparate Sources Of Functional Genomic Data, Raji Balasubramanian, Thomas Laframboise, Denise Scholtens, Robert Gentleman Jun 2004

A Graph Theoretic Approach To Testing Associations Between Disparate Sources Of Functional Genomic Data, Raji Balasubramanian, Thomas Laframboise, Denise Scholtens, Robert Gentleman

Bioconductor Project Working Papers

The last few years have seen the advent of high-throughput technologies to analyze various properties of the transcriptome and proteome of several organisms. The congruency of these different data sources, or lack thereof, can shed light on the mechanisms that govern cellular function. A central challenge for bioinformatics research is to develop a unified framework for combining the multiple sources of functional genomics information and testing associations between them, thus obtaining a robust and integrated view of the underlying biology.

We present a graph theoretic approach to test the significance of the association between multiple disparate sources of functional genomics …


Statistical Analyses And Reproducible Research, Robert Gentleman, Duncan Temple Lang May 2004

Statistical Analyses And Reproducible Research, Robert Gentleman, Duncan Temple Lang

Bioconductor Project Working Papers

For various reasons, it is important, if not essential, to integrate the computations and code used in data analyses, methodological descriptions, simulations, etc. with the documents that describe and rely on them. This integration allows readers to both verify and adapt the statements in the documents. Authors can easily reproduce them in the future, and they can present the document's contents in a different medium, e.g. with interactive controls. This paper describes a software framework for authoring and distributing these integrated, dynamic documents that contain text, code, data, and any auxiliary content needed to recreate the computations. The documents are …


Classification Using Generalized Partial Least Squares, Beiying Ding, Robert Gentleman May 2004

Classification Using Generalized Partial Least Squares, Beiying Ding, Robert Gentleman

Bioconductor Project Working Papers

The advances in computational biology have made simultaneous monitoring of thousands of features possible. The high throughput technologies not only bring about a much richer information context in which to study various aspects of gene functions but they also present challenge of analyzing data with large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, in …


Bioconductor: Open Software Development For Computational Biology And Bioinformatics, Robert C. Gentleman, Vincent J. Carey, Douglas J. Bates, Benjamin M. Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Guenther Sawitzki, Colin Smith, Gordon K. Smyth, Luke Tierney, Yee Hwa Yang, Jianhua Zhang Jan 2004

Bioconductor: Open Software Development For Computational Biology And Bioinformatics, Robert C. Gentleman, Vincent J. Carey, Douglas J. Bates, Benjamin M. Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Guenther Sawitzki, Colin Smith, Gordon K. Smyth, Luke Tierney, Yee Hwa Yang, Jianhua Zhang

Bioconductor Project Working Papers

The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. We detail some of the design decisions, software paradigms and operational strategies that have allowed a small number of researchers to provide a wide variety of innovative, extensible, software solutions in a relatively short time. The use of an object oriented programming paradigm, the adoption and development of a software package system, designing by contract, distributed development and collaboration with other projects are elements of this project's success. Individually, each of these concepts are useful and important but when combined they have …