Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Johns Hopkins University, Dept. of Biostatistics Working Papers

Genetics

Articles 1 - 7 of 7

Trio Logic Regression - Detection Of SNP-SNP Interactions In Case-Parent Trios, Qing Li, Thomas A. Louis, M. Daniele Fallin, Ingo Ruczinski Jul 2009

Johns Hopkins University, Dept. of Biostatistics Working Papers

Statistical approaches to evaluate higher order SNP-SNP and SNP-environment interactions are critical in genetic association studies, as susceptibility to complex disease is likely to be related to the interaction of multiple SNPs and environmental factors. Logic regression (Kooperberg et al., 2001; Ruczinski et al., 2003) is one such approach, where interactions between SNPs and environmental variables are assessed in a regression framework, and interactions become part of the model search space. In this manuscript we extend the logic regression methodology, originally developed for cohort and case-control studies, to studies of trios with affected probands. Trio logic regression accounts for the …


Association Tests That Accommodate Genotyping Errors, Ingo Ruczinski, Qing Li, Benilton Carvalho, M. Daniele Fallin, Rafael A. Irizarry, Thomas A. Louis Jan 2009

Johns Hopkins University, Dept. of Biostatistics Working Papers

High-throughput SNP arrays provide estimates of genotypes for up to one million loci, often used in genome-wide association studies. While these estimates are typically very accurate, genotyping errors do occur, which can particularly affect the most extreme test statistics and p-values. Estimates for the genotype uncertainties are also available, although they are typically ignored. In this manuscript, we develop a framework to incorporate these genotype uncertainties in case-control studies for any genetic model. We verify that using the assumption of a “local alternative” in the score test is very reasonable for effect sizes typically seen in SNP association studies, and show …


Multiple Diseases In Carrier Probability Estimation: Accounting For Surviving All Cancers Other Than Breast And Ovary In BRCAPRO, Hormuzd A. Katki, Amanda Blackford, Sining Chen, Giovanni Parmigiani Feb 2007

Johns Hopkins University, Dept. of Biostatistics Working Papers

Mendelian models can predict who carries an inherited deleterious mutation of known disease genes based on family history. For example, the BRCAPRO model is commonly used to identify families who carry mutations of BRCA1 and BRCA2, based on familial breast and ovarian cancers. These models incorporate the age of diagnosis of diseases in relatives and current age or age of death. We develop a rigorous foundation for handling multiple diseases with censoring. We prove that any disease unrelated to mutations can be excluded from the model, unless it is sufficiently common and dependent on a mutation-related disease time. Furthermore, if …


Use Of Hidden Markov Models For QTL Mapping, Karl W. Broman Dec 2006

Johns Hopkins University, Dept. of Biostatistics Working Papers

An important aspect of the QTL mapping problem is the treatment of missing genotype data. If complete genotype data were available, QTL mapping would reduce to the problem of model selection in linear regression. However, in the consideration of loci in the intervals between the available genetic markers, genotype data is inherently missing. Even at the typed genetic markers, genotype data is seldom complete, as a result of failures in the genotyping assays or for the sake of economy (for example, in the case of selective genotyping, where only individuals with extreme phenotypes are genotyped). We discuss the use of …


Poor Performance Of Bootstrap Confidence Intervals For The Location Of A Quantitative Trait Locus, Ani Manichaikul, Josee Dupuis, Saunak Sen, Karl W. Broman Mar 2006

Johns Hopkins University, Dept. of Biostatistics Working Papers

The aim of many genetic studies is to locate the genomic regions (called quantitative trait loci, QTLs) that contribute to variation in a quantitative trait (such as body weight). Confidence intervals for the locations of QTLs are particularly important for the design of further experiments to identify the gene or genes responsible for the effect. Likelihood support intervals are the most widely used method to obtain confidence intervals for QTL location, but the non-parametric bootstrap has also been recommended. Through extensive computer simulation, we show that bootstrap confidence intervals are poorly behaved and so should not be used in this …


The Role Of An Explicit Causal Framework In Affected Sib Pair Designs With Covariates, Constantine E. Frangakis, Fan Li, Betty Q. Doan Dec 2005

Johns Hopkins University, Dept. of Biostatistics Working Papers

The affected sib/relative pair (ASP/ARP) design is often used with covariates to find genes that can cause a disease in pathways other than through those covariates. However, such "covariates" can themselves have genetic determinants, and the validity of existing methods has so far only been argued under implicit assumptions. We propose an explicit causal formulation of the problem using potential outcomes and principal stratification. The general role of this formulation is to identify and separate the meaning of the different assumptions that can provide valid causal inference in linkage analysis. This separation helps to (a) develop better methods under explicit …


Searching For Differentially Expressed Gene Combinations, Marcel Dettling, Edward Gabrielson, Giovanni Parmigiani Mar 2005

Johns Hopkins University, Dept. of Biostatistics Working Papers

Background: Comparison of mRNA expression levels across biological samples is a widely used approach in genomics. Available data-analytic tools for deriving comprehensive lists of differentially expressed genes rely on data summaries formed using each gene in isolation from others. These approaches ignore biological relationships among genes and may miss important biological insight provided by genomics data.

Methods: We propose a fast, easily interpretable and scalable approach for identifying pairs of genes that are differentially expressed across phenotypes or experimental conditions. These are defined as pairs for which there is detectable phenotype discrimination using the joint distribution, but not from either …