Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 30 of 37
Full-Text Articles in Statistical Models
Statistical Methods For Gene Selection And Genetic Association Studies, Xuewei Cao
Dissertations, Master's Theses and Master's Reports
This dissertation includes five chapters. A brief description of each chapter follows.
In Chapter One, we propose a signed bipartite genotype and phenotype network (GPN) by linking phenotypes and genotypes based on their statistical associations. It provides new insight into the genetic architecture among multiple correlated phenotypes and into where phenotypes might be related at a higher level of cellular and organismal organization. We show that association studies of multiple phenotypes are improved when the proposed network is used to incorporate genetic information into the phenotype clustering.
In Chapter Two, we first illustrate the proposed GPN …
Bayesian Methods For Graphical Models With Neighborhood Selection., Sagnik Bhadury
Electronic Theses and Dissertations
Graphical models determine associations between variables through the notion of conditional independence. Gaussian graphical models are a widely used class of such models, where the relationships are formalized by non-null entries of the precision matrix. However, in high-dimensional cases, covariance estimates are typically unstable. Moreover, it is natural to expect only a few significant associations to be present in many realistic applications. This necessitates the injection of sparsity techniques into the estimation method. Classical frequentist methods, like GLASSO, use penalization techniques for this purpose. Fully Bayesian methods, on the contrary, are slow because they require iteratively sampling over a quadratic …
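As a rough illustration of the abstract's point that a Gaussian graphical model encodes associations through non-null entries of the precision matrix, the Python sketch below uses a made-up three-variable chain. The matrix values and the helper names `invert` and `partial_correlation` are illustrative assumptions and are not taken from the dissertation. It shows that two variables can be marginally correlated (non-zero covariance) while being conditionally independent (zero precision entry).

```python
from fractions import Fraction
import math


def invert(matrix):
    """Invert a small square matrix exactly with Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlation(precision, i, j):
    """Partial correlation of variables i and j given all the others."""
    return -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])


# Hypothetical precision matrix for a chain X0 - X1 - X2:
# the (0, 2) entry is zero, so X0 and X2 are not joined by an edge.
precision = [
    [2, -1, 0],
    [-1, 2, -1],
    [0, -1, 2],
]

covariance = invert(precision)

print("cov(X0, X2) =", covariance[0][2])
print("partial corr(X0, X2 | X1) =", partial_correlation(precision, 0, 2))
print("partial corr(X0, X1 | X2) =", partial_correlation(precision, 0, 1))
```

In this toy setting the covariance matrix is dense even though the precision matrix is sparse, which is why sparsity is usually imposed on the precision matrix rather than on the covariance.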
Ecological Risk Assessment For The Temperate Demersal Elasmobranch Resource, Department Of Primary Industries And Regional Development, Western Australia
Fisheries research reports
No abstract provided.
Otoliths Of South-Western Australian Fish: A Photographic Catalogue, Chris Dowling, Kim Smith, Elain Lek, Joshua Brown
Fisheries research reports
No abstract provided.
Squid And Cuttlefish Resources Of Western Australia, Daniel Yeoh, Danielle J. Johnston PhD, David C. Harris
Fisheries research reports
No abstract provided.
Statistical Methods In Genetic Studies, Cheng Gao
Dissertations, Master's Theses and Master's Reports
This dissertation includes three chapters. A brief description of each chapter follows.
In Chapter 1, we propose a new method, called MF-TOWmuT, for genome-wide association studies with multiple genetic variants and multiple phenotypes using family samples. MF-TOWmuT uses a kinship matrix to account for sample relatedness. In simulations, we considered hidden polygenic effects and varied the proportion of variance they contribute when generating phenotypes. Simulation studies show that MF-TOWmuT preserves the type I error rate and is more powerful than several existing methods in different simulation scenarios. MF-TOWmuT is also quite …
Construction And Analysis Of Genetic Regulatory Networks With RNA-Seq Data From Arabidopsis Thaliana, Tessa Kriz
Dissertations, Master's Theses and Master's Reports
Reconstruction of gene regulatory networks (GRNs) is a fundamental aspect of genetic engineering and provides a deeper understanding of the biological processes of an organism. Two methods were implemented to reconstruct the gene regulatory networks of Arabidopsis thaliana under two treatments: methyl jasmonate (MeJa) and salicylic acid (SA). The Joint Reconstruction of multiple Gene Regulatory Networks (JRmGRN) method was utilized to construct a joint network for identifying hub genes common to both conditions in addition to networks specific to each condition. The Differential Network Analysis with False Discovery Rate Control method constructed a network of connections unique to only one …
Population Viability And Connectivity Of The Federally Threatened Eastern Indigo Snake In Central Peninsular Florida, Javan Bauder
Doctoral Dissertations
Understanding the factors influencing the likelihood of persistence of real-world populations requires not only an accurate understanding of the traits and behaviors of individuals within those populations (e.g., movement, habitat selection, survival, fecundity, dispersal) but also an understanding of how those traits and behaviors are influenced by landscape features. The federally threatened eastern indigo snake (EIS, Drymarchon couperi) has declined throughout its range primarily due to anthropogenically induced habitat loss and fragmentation, making spatially explicit assessments of population viability and connectivity essential for understanding its current status and directing future conservation efforts. The primary goal of my dissertation was to understand how …
Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan
COBRA Preprint Series
One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes which provide insight into the disease's process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins resulting in enormous data sets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards …
Resource Assessment Report Temperate Demersal Elasmobranch Resource Of Western Australia, Matias Braccini, Nick Blay, S. A. Hesp, Brett Molony
Fisheries research reports
This document provides a cumulative description and assessment of the TDER and all of the fishing activities (i.e. fisheries / fishing sectors) affecting this resource in WA. Future Resource Assessment Reports will assess the Statewide Sharks and Rays Resource. The report is focused on the temperate indicator species (whiskery, gummy, dusky and sandbar sharks) used to assess the suites of demersal sharks and rays that comprise this resource. These species are primarily captured by demersal gillnets used in the TDGDLF that operate in the West Coast and South Coast Bioregions. For the North Coast bioregion, no commercial fishing for sharks …
Australian Herring And West Australian Salmon Scientific Workshop Report, October 2017, Department Of Primary Industries And Regional Development, Western Australia
Fisheries research reports
No abstract provided.
A Gene-Based Association Method For Mapping Traits Using Reference Transcriptome Data, Eric R. Gamazon, Heather Wheeler, Kaanan P. Shah, Sahar V. Mozaffari, Keston Aquino-Michaels, Robert J. Carroll, Anne E. Eyler, Joshua C. Denny, Dan L. Nicolae, Nancy J. Cox, Hae Kyung Im
Heather Wheeler
Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual’s genetic profile and correlates ‘imputed’ gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. Genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome data sets. PrediXcan enjoys …
Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret
UW Biostatistics Working Paper Series
We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …
A Gene-Based Association Method For Mapping Traits Using Reference Transcriptome Data, Eric R. Gamazon, Heather Wheeler, Kaanan P. Shah, Sahar V. Mozaffari, Keston Aquino-Michaels, Robert J. Carroll, Anne E. Eyler, Joshua C. Denny, GTEx Consortium, Dan L. Nicolae, Nancy J. Cox, Hae Kyung Im
Bioinformatics Faculty Publications
Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual’s genetic profile and correlates ‘imputed’ gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. Genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome data sets. PrediXcan enjoys …
Ordinal Probit Wavelet-Based Functional Models For eQTL Analysis, Mark J. Meyer, Jeffrey S. Morris, Craig P. Hersh, Jarret D. Morrow, Christoph Lange, Brent A. Coull
Jeffrey S. Morris
Current methods for conducting expression Quantitative Trait Loci (eQTL) analysis are limited in scope to pairwise association testing between a single nucleotide polymorphism (SNP) and an expression probe set in a region around a gene of interest, thus ignoring the inherent between-SNP correlation. To determine association, p-values are then typically adjusted using Plug-in False Discovery Rate. As many SNPs are interrogated in the region and multiple probe sets are taken, the current approach requires the fitting of a large number of models. We propose to remedy this by introducing a flexible function-on-scalar regression that models the genome as a functional outcome. The …
A Bayesian Shared Component Model For Genetic Association Studies, Juan J. Abellan, Carlos Abellan, Juan R. Gonzalez
COBRA Preprint Series
We present a novel approach to address genome association studies between single nucleotide polymorphisms (SNPs) and disease. We propose a Bayesian shared component model to tease out the genotype information that is common to cases and controls from that which is specific to cases only. This allows detection of the SNPs that show the strongest association with the disease. The model can be applied to case-control studies with more than one disease. In fact, we illustrate the use of this model with a dataset of 23,418 SNPs from a case-control study by The Wellcome Trust Case Control Consortium (2007) …
Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel
COBRA Preprint Series
The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.
We …
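To make the bias described in the abstract above concrete, here is a minimal Python sketch of the local false discovery rate under a simple two-group model. Everything specific in it is an assumption chosen for illustration and is not taken from the preprint: the null density is standard normal, the alternative is normal with mean 3, and the function names `normal_pdf` and `local_fdr` are hypothetical. It only shows the direction of the effect: if the proportion of non-associated SNPs is overestimated, the LFDR of a given statistic is overestimated as well.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0, alt_mean=3.0):
    """Two-group local FDR: pi0 * f0(z) / (pi0 * f0(z) + (1 - pi0) * f1(z)).

    pi0 is the proportion of non-associated SNPs; f0 is N(0, 1) and
    f1 is N(alt_mean, 1). Both densities are assumptions for this sketch.
    """
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, mean=alt_mean)
    return null_part / (null_part + alt_part)


z_score = 2.5
true_lfdr = local_fdr(z_score, pi0=0.90)      # assumed true null proportion
inflated_lfdr = local_fdr(z_score, pi0=0.99)  # overestimated null proportion

print(f"LFDR with pi0 = 0.90: {true_lfdr:.3f}")
print(f"LFDR with pi0 = 0.99: {inflated_lfdr:.3f}")
```

The same statistic looks far less convincing under the inflated null proportion, which is the kind of bias the abstract attributes to semiparametric LFDR estimates.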
Multiple Imputation To Correct For Measurement Error In Admixture Estimates In Genetic Structured Association Testing, Miguel A. Padilla, Jamin Divers, Laura K. Vaughan, David B. Allison, Hemant K. Tiwari
Psychology Faculty Publications
Objectives: Structured association tests (SAT), like any statistical model, assume that all variables are measured without error. Measurement error can bias parameter estimates and confound residual variance in linear models. It has been shown that admixture estimates can be contaminated with measurement error, causing SAT models to suffer from the same afflictions. Multiple imputation (MI) is presented as a viable tool for correcting measurement error problems in SAT linear models, with emphasis on correcting measurement-error-contaminated admixture estimates. Methods: Several MI methods are presented and compared, via simulation, in terms of controlling Type I error rates for both …
Assessing Population Level Genetic Instability Via Moving Average, Samuel McDaniel, Rebecca Betensky, Tianxi Cai
Harvard University Biostatistics Working Paper Series
No abstract provided.
Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh
Harvard University Biostatistics Working Paper Series
No abstract provided.
Multiple Tests Of Association With Biological Annotation Metadata, Sandrine Dudoit, Sunduz Keles, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
We propose a general and formal statistical framework for the multiple tests of associations between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known fixed gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating genome-wide transcript levels or DNA copy numbers to possibly censored biological and …
A Pseudolikelihood Approach For Simultaneous Analysis Of Array Comparative Genomic Hybridizations (aCGH), David A. Engler, Gayatry Mohapatra, David N. Louis, Rebecca Betensky
Harvard University Biostatistics Working Paper Series
DNA sequence copy number has been shown to be associated with cancer development and progression. Array-based Comparative Genomic Hybridization (aCGH) is a recent development that seeks to identify the copy number ratio at large numbers of markers across the genome. Due to experimental and biological variations across chromosomes and across hybridizations, current methods are limited to analyses of single chromosomes. We propose a more powerful approach that borrows strength across chromosomes and across hybridizations. We assume a Gaussian mixture model, with a hidden Markov dependence structure, and with random effects to allow for intertumoral variation, as well as intratumoral clonal …
Application Of A Multiple Testing Procedure Controlling The Proportion Of False Positives To Protein And Bacterial Data, Merrill D. Birkner, Alan E. Hubbard, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Simultaneously testing multiple hypotheses is important in high-dimensional biological studies. In these situations, one is often interested in controlling the Type-I error rate, such as the proportion of false positives to total rejections (TPPFP) at a specific level, alpha. This article will present an application of the E-Bayes/Bootstrap TPPFP procedure, presented in van der Laan et al. (2005), which controls the tail probability of the proportion of false positives (TPPFP), on two biological datasets. The two data applications include firstly, the application to a mass-spectrometry dataset of two leukemia subtypes, AML and ALL. The protein data measurements include intensity and …
New Statistical Paradigms Leading To Web-Based Tools For Clinical/Translational Science, Knut M. Wittkowski
COBRA Preprint Series
As the field of functional genetics and genomics is beginning to mature, we become confronted with new challenges. The constant drop in price for sequencing and gene expression profiling as well as the increasing number of genetic and genomic variables that can be measured makes it feasible to address more complex questions. The success with rare diseases caused by single loci or genes has provided us with a proof-of-concept that new therapies can be developed based on functional genomics and genetics.
Common diseases, however, typically involve genetic epistasis, genomic pathways, and proteomic pattern. Moreover, to better understand the underlying biological …
A Bayesian Method For Finding Interactions In Genomic Studies, Wei Chen, Debashis Ghosh, Trivellore E. Raghunathan, Sharon Kardia
The University of Michigan Department of Biostatistics Working Paper Series
An important step in building a multiple regression model is the selection of predictors. In genomic and epidemiologic studies, datasets with a small sample size and a large number of predictors are common. In such settings, most standard methods for identifying a good subset of predictors are unstable. Furthermore, there is an increasing emphasis towards identification of interactions, which has not been studied much in the statistical literature. We propose a method, called BSI (Bayesian Selection of Interactions), for selecting predictors in a regression setting when the number of predictors is considerably larger than the sample size with a focus …
Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
One of the benefits of profiling of cancer samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Such subgroups have typically been found in microarray data using hierarchical clustering. A major problem in interpretation of the output is determining the number of clusters. We approach the problem of determining disease subtypes using mixture models. A novel estimation procedure of the parameters in the mixture model is developed based on a combination of random projections and the expectation-maximization algorithm. Because the approach is probabilistic, our approach provides a measure for the number of true clusters …
Semiparametric Quantitative-Trait-Locus Mapping: I. On Functional Growth Curves, Ying Qing Chen, Rongling Wu
U.C. Berkeley Division of Biostatistics Working Paper Series
The genetic study of certain quantitative traits in growth curves as a function of time has recently been of major scientific interest to explore the developmental evolution processes of biological subjects. Various parametric approaches in the statistical literature have been proposed to study the quantitative-trait-loci (QTL) mapping of the growth curves as multivariate outcomes. In this article, we view the growth curves as functional quantitative traits and propose some semiparametric models to relax the strong parametric assumptions which may not be always practical in reality. Appropriate inference procedures are developed to estimate the parameters of interest which characterise the possible …
Semiparametric Quantitative-Trait-Locus Mapping: II. On Censored Age-At-Onset, Ying Qing Chen, Chengcheng Hu, Rongling Wu
U.C. Berkeley Division of Biostatistics Working Paper Series
In genetic studies, the variation in genotypes may not only affect different inheritance patterns in qualitative traits, but may also affect the age-at-onset as a quantitative trait. In this article, we use standard cross designs, such as backcross or F2, to propose some hazard regression models, namely, the additive hazards model in quantitative trait loci mapping for age-at-onset, although the developed method can be extended to more complex designs. With additive invariance of the additive hazards models in mixture probabilities, we develop flexible semiparametric methodologies in interval regression mapping without heavy computing burden. A recently developed multiple comparison procedure is adapted …
Nonparametric Methods For Analyzing Replication Origins In Genomewide Data, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
Due to the advent of high-throughput genomic technology, it has become possible to globally monitor cellular activities on a genomewide basis. With these new methods, scientists can begin to address important biological questions. One such question involves the identification of replication origins, which are regions in chromosomes where DNA replication is initiated. In addition, one hypothesis regarding replication origins is that their locations are non-random throughout the genome. In this article, we develop methods for identification of and cluster inference regarding replication origins involving genomewide expression data. We compare several nonparametric regression methods for the identification of replication origin locations. …
Semiparametric Methods For Identification Of Tumor Progression Genes From Microarray Data, Debashis Ghosh, Arul Chinnaiyan
The University of Michigan Department of Biostatistics Working Paper Series
The use of microarray data has become quite commonplace in medical and scientific experiments. We focus here on microarray data generated from cancer studies. It is potentially important for the discovery of biomarkers to identify genes whose expression levels correlate with tumor progression. In this article, we develop statistical procedures for the identification of such genes, which we term tumor progression genes. Two methods are considered in this paper. The first is use of a proportional odds procedure, combined with false discovery rate estimation techniques to adjust for the multiple testing problem. The second method is based on order-restricted estimation …