Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons



Full-Text Articles in Physical Sciences and Mathematics

Multiple Testing And Data Adaptive Regression: An Application To HIV-1 Sequence Data, Merrill D. Birkner, Sandra E. Sinisi, Mark J. van der Laan, Oct 2004


U.C. Berkeley Division of Biostatistics Working Paper Series

Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological insights regarding the replication ability of HIV-1. Determining specific target codons on the viral strand will facilitate the manufacturing of target-specific antiretrovirals. Various algorithmic and analytic techniques can be applied to this problem. We propose using multiple testing to find codons which have significant univariate associations with replication capacity of the virus. We also propose using a data adaptive multiple regression algorithm to obtain multiple predictions of viral replication capacity based on an entire mutant/non-mutant sequence profile. The data set to which these techniques …
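As a rough illustration of what "multiple testing over codons" can look like in practice, the sketch below adjusts a set of per-codon p-values for multiplicity. It uses Holm's step-down adjustment as a simple stand-in; the abstract does not say which procedure the authors apply, and the codon names and p-values are hypothetical placeholders, not data from the paper.

```python
def holm_adjust(pvalues):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: adjusted p-values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical per-codon p-values for association with replication capacity.
codon_pvalues = {"codon_A": 0.001, "codon_B": 0.04, "codon_C": 0.03, "codon_D": 0.20}

names = list(codon_pvalues)
adjusted = dict(zip(names, holm_adjust([codon_pvalues[n] for n in names])))

# Codons declared significant at family-wise error rate 0.05.
significant = [n for n in names if adjusted[n] <= 0.05]
print(adjusted)
print(significant)
```

Holm's method is chosen here only because it needs nothing beyond the marginal p-values; a resampling-based procedure would require the raw sequence and replication capacity data.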


Differential Expression With The Bioconductor Project, Anja von Heydebreck, Wolfgang Huber, Robert Gentleman, Jun 2004


Bioconductor Project Working Papers

A basic yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. We discuss different approaches to this task and illustrate how they can be applied using software from the Bioconductor Project. A central problem is the high dimensionality of gene expression space, which prohibits a comprehensive statistical analysis without focusing on particular aspects of the joint distribution of the genes' expression levels. Possible strategies are to do univariate gene-by-gene analysis, and to perform data-driven nonspecific filtering of genes before the actual statistical analysis. …
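The two strategies named in this abstract can be sketched in a few lines. The example below is in plain Python rather than Bioconductor's R, and it makes several assumptions of its own: the filter keeps the genes with the largest overall variance (one common nonspecific criterion that ignores the group labels), the per-gene statistic is Welch's t, and the expression values, gene names, and the helper names `nonspecific_filter` and `welch_t` are all hypothetical.

```python
from statistics import mean, variance


def nonspecific_filter(expr, keep):
    """Keep the `keep` genes with the largest overall variance.

    The group labels are not consulted, which is what makes the filter
    "nonspecific".
    """
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:keep]


def welch_t(x, y):
    """Welch's two-sample t-statistic (assumes neither group is constant)."""
    standard_error = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return (mean(x) - mean(y)) / standard_error


# Hypothetical log-expression values: three control arrays, then three treated.
expr = {
    "gene1": [5.0, 5.2, 4.8, 8.1, 7.9, 8.3],
    "gene2": [6.0, 6.1, 5.9, 6.0, 6.1, 5.9],
    "gene3": [3.0, 4.0, 5.0, 3.5, 4.5, 5.5],
}
is_treated = [False, False, False, True, True, True]

# Step 1: nonspecific filtering, before any group comparison.
kept = nonspecific_filter(expr, keep=2)

# Step 2: univariate gene-by-gene analysis on the genes that survive.
t_stats = {}
for gene in kept:
    treated = [v for v, flag in zip(expr[gene], is_treated) if flag]
    control = [v for v, flag in zip(expr[gene], is_treated) if not flag]
    t_stats[gene] = welch_t(treated, control)

print(kept)
print(t_stats)
```

Filtering first reduces the number of hypotheses carried into the testing step, which matters once a multiplicity adjustment is applied to the resulting statistics.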


Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data, Sunduz Keles, Mark J. van der Laan, Sandrine Dudoit, Simon E. Cawley, Jun 2004


U.C. Berkeley Division of Biostatistics Working Paper Series

Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription factors along human chromosomes 21 and 22 using ChIP-Chip experiments. ChIP-Chip experiments are a new approach to the genome-wide identification of transcription factor binding sites and consist of chromatin (Ch) immunoprecipitation (IP) of transcription factor-bound genomic DNA followed by high density oligonucleotide hybridization (Chip) of the IP-enriched DNA. We investigate the ChIP-Chip data structure and propose methods for inferring the location of transcription factor binding sites from these data. The proposed methods involve testing for each probe whether it is part of a bound sequence …


Multiple Testing. Part III. Procedures For Control Of The Generalized Family-Wise Error Rate And Proportion Of False Positives, Mark J. van der Laan, Sandrine Dudoit, Katherine S. Pollard, Jan 2004


U.C. Berkeley Division of Biostatistics Working Paper Series

The accompanying articles by Dudoit et al. (2003b) and van der Laan et al. (2003) provide single-step and step-down resampling-based multiple testing procedures that asymptotically control the family-wise error rate (FWER) for general null hypotheses and test statistics. The proposed procedures fundamentally differ from existing approaches in the choice of null distribution for deriving cut-offs for the test statistics and are shown to provide asymptotic control of the FWER under general data generating distributions, without the need for conditions such as subset pivotality. In this article, we show that any multiple testing procedure (asymptotically) controlling the FWER at level alpha …