Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 12 of 12

Full-Text Articles in Entire DC Network

Detection Of Recurrent Copy Number Alterations In The Genome: A Probabilistic Approach, Oscar M. Rueda, Ramon Diaz-Uriarte Nov 2008

Detection Of Recurrent Copy Number Alterations In The Genome: A Probabilistic Approach, Oscar M. Rueda, Ramon Diaz-Uriarte

COBRA Preprint Series

Copy number variation (CNV) in genomic DNA is linked to a variety of human diseases (including cancer, HIV acquisition, autoimmune and neurodegenerative diseases), and array-based CGH (aCGH) is currently the main technology to locate CNVs. Several methods can analyze aCGH data at the single sample level, but disease-critical genes are more likely to be found in regions that are common or recurrent among samples. Unfortunately, defining recurrent CNV regions remains a challenge. Moreover, the heterogeneous nature of many diseases requires that we search for CNVs that affect only some subsets of the samples (without prior knowledge of which regions and …


Finding Recurrent Regions Of Copy Number Variation: A Review, Oscar M. Rueda, Ramon Diaz-Uriarte Nov 2008

Finding Recurrent Regions Of Copy Number Variation: A Review, Oscar M. Rueda, Ramon Diaz-Uriarte

COBRA Preprint Series

Copy number variation (CNV) in genomic DNA is linked to a variety of human diseases, and array-based CGH (aCGH) is currently the main technology to locate CNVs. Although many methods have been developed to analyze aCGH from a single array/subject, disease-critical genes are more likely to be found in regions that are common or recurrent among subjects. Unfortunately, finding recurrent CNV regions remains a challenge. We review existing methods for the identification of recurrent CNV regions. The working definition of ``common'' or ``recurrent'' region differs between methods, leading to approaches that use different types of input (discretized output from a …


The Strength Of Statistical Evidence For Composite Hypotheses With An Application To Multiple Comparisons, David R. Bickel Nov 2008

The Strength Of Statistical Evidence For Composite Hypotheses With An Application To Multiple Comparisons, David R. Bickel

COBRA Preprint Series

The strength of the statistical evidence in a sample of data that favors one composite hypothesis over another may be quantified by the likelihood ratio using the parameter value consistent with each hypothesis that maximizes the likelihood function. Unlike the p-value and the Bayes factor, this measure of evidence is coherent in the sense that it cannot support a hypothesis over any hypothesis that it entails. Further, when comparing the hypothesis that the parameter lies outside a non-trivial interval to the hypotheses that it lies within the interval, the proposed measure of evidence almost always asymptotically favors the correct hypothesis …


A Network-Constrained Empirical Bayes Method For Analysis Of Genomic Data, Caiyan Li, Zhi Wei, Hongzhe Li Oct 2008

A Network-Constrained Empirical Bayes Method For Analysis Of Genomic Data, Caiyan Li, Zhi Wei, Hongzhe Li

UPenn Biostatistics Working Papers

Empirical Bayes methods are widely used in the analysis of microarray gene expression data in order to identify the differentially expressed genes or genes that are associated with other general phenotypes. Available methods often assume that genes are independent. However, genes are expected to function interactively and to form molecular modules to affect the phenotypes. In order to account for regulatory dependency among genes, we propose in this paper a network-constrained empirical Bayes method for analyzing genomic data in the framework of general linear models, where the dependency of genes is modeled by a discrete Markov random field model defined …


Estimation And Testing For The Effect Of A Genetic Pathway On A Disease Outcome Using Logistic Kernel Machine Regression Via Logistic Mixed Models, Dawei Liu, Debashis Ghosh, Xihong Lin Jun 2008

Estimation And Testing For The Effect Of A Genetic Pathway On A Disease Outcome Using Logistic Kernel Machine Regression Via Logistic Mixed Models, Dawei Liu, Debashis Ghosh, Xihong Lin

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Powerful And Flexible Multilocus Association Test For Quantitative Traits, Lydia Coulter Kwee, Dawei Liu, Xihong Lin, Debashis Ghosh, Michael P. Epstein Jun 2008

A Powerful And Flexible Multilocus Association Test For Quantitative Traits, Lydia Coulter Kwee, Dawei Liu, Xihong Lin, Debashis Ghosh, Michael P. Epstein

Harvard University Biostatistics Working Paper Series

No abstract provided.


Model-Based Clustering Of Methylation Array Data: A Recursive-Partitioning Algorithm For High-Dimensional Data Arising As A Mixture Of Beta Distributions, E. Andres Houseman, Brock C. Christensen, Ru-Fang Yeh, Carmen J. Marsit, Margaret R. Karagas, Margaret Wrensch, Heather H. Nelson, Joseph Wiemels, Shichun Zheng, John K. Wiencke, Karl T. Kelsey Jun 2008

Model-Based Clustering Of Methylation Array Data: A Recursive-Partitioning Algorithm For High-Dimensional Data Arising As A Mixture Of Beta Distributions, E. Andres Houseman, Brock C. Christensen, Ru-Fang Yeh, Carmen J. Marsit, Margaret R. Karagas, Margaret Wrensch, Heather H. Nelson, Joseph Wiemels, Shichun Zheng, John K. Wiencke, Karl T. Kelsey

Harvard University Biostatistics Working Paper Series

No abstract provided.


Incorporation Of Genetic Pathway Information Into Analysis Of Multivariate Gene Expression Data, Zhi Wei, Jane E. Minturn, Eric Rappaport, Garrett Brodeur, Hongzhe Li Apr 2008

Incorporation Of Genetic Pathway Information Into Analysis Of Multivariate Gene Expression Data, Zhi Wei, Jane E. Minturn, Eric Rappaport, Garrett Brodeur, Hongzhe Li

UPenn Biostatistics Working Papers

Abstract: Multivariate microarray gene expression data are commonly collected to study the genomic responses under ordered conditions such as over increasing/decreasing dose levels or over time during biological processes. One important question from such multivariate gene expression experiments is to identify genes that show different expression patterns over treatment dosages or over time and pathways that are perturbed during a given biological process. In this paper, we develop a hidden Markov random field model for multivariate expression data in order to identify genes and subnetworks that are related to biological processes, where the dependency of the differential expression patterns of …


Likelihood Estimation Of Conjugacy Relationships In Linear Models With Applications To High-Throughput Genomics, Brian S. Caffo, Liu Dongmei, Robert Scharpf, Giovanni Parmigiani Apr 2008

Likelihood Estimation Of Conjugacy Relationships In Linear Models With Applications To High-Throughput Genomics, Brian S. Caffo, Liu Dongmei, Robert Scharpf, Giovanni Parmigiani

Johns Hopkins University, Dept. of Biostatistics Working Papers

In the simultaneous estimation of a large number of related quantities, multilevel models provide a formal mechanism for efficiently making use of the ensemble of information for deriving individual estimates. In this article we investigate the ability of the likelihood to identify the relationship between signal and noise in multilevel linear mixed models. Specifically, we consider the ability of the likelihood to diagnose conjugacy or independence between the signals and noises. Our work was motivated by the analysis of data from high-throughput experiments in genomics. The proposed model leads to a more flexible family. However, we further demonstrate that adequately …


Empirical Null And False Discovery Rate Inference For Exponential Families, Armin Schwartzman Feb 2008

Empirical Null And False Discovery Rate Inference For Exponential Families, Armin Schwartzman

Harvard University Biostatistics Working Paper Series

No abstract provided.


Assessing The Role Of Multi-Protein Complexes In Determining Phenotype, Nolwenn Le Meur, Robert Gentleman Jan 2008

Assessing The Role Of Multi-Protein Complexes In Determining Phenotype, Nolwenn Le Meur, Robert Gentleman

Bioconductor Project Working Papers

Understanding regulatory mechanisms in complex biological systems is an important challenge, in particular to understand disease mechanisms, and to discover new therapies and drugs. In this paper, we consider the important question of cellular regulation of phenotype. Using single gene deletion data, we address the problem of linking a phenotype to underlying functional roles in the organism and provide a sound computational and statistical paradigm that can be extended to address more complex experimental settings such as multiple deletions. We apply the proposed approaches to publicly available data sets to demonstrate strong evidence for the involvement of multi-protein complexes in …


Design And Analysis Issues In Genome-Wide Somatic Mutation Studies Of Cancer, Giovanni Parmigiani, Simina Boca, Jimmy Lin, Kenneth W. Kinzler, Victor E. Velculescu, Bert Vogelstein Jan 2008

Design And Analysis Issues In Genome-Wide Somatic Mutation Studies Of Cancer, Giovanni Parmigiani, Simina Boca, Jimmy Lin, Kenneth W. Kinzler, Victor E. Velculescu, Bert Vogelstein

Johns Hopkins University, Dept. of Biostatistics Working Papers

The availability of the human genome sequence and progress in sequencing and bioinformatic technologies have enabled genome-wide investigation of somatic mu- tations in human cancers. This article briefly reviews challenges arising in the statistical analysis of mutational data of this kind. A first challenge is that of designing studies that efficiently allocate sequencing resources. We show that this can be addressed by two-stage designs, and demonstrate via simulations that even relatively small studies can produce lists of candidate cancer genes that are highly informative for future research efforts. A second challenge is to distinguish mutated genes that are selected for …