Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 10 of 10

Full-Text Articles in Statistical Models

Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang Oct 2023

Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang

Statistical Science Theses and Dissertations

Spatially resolved transcriptomics (SRT) quantifies expression levels at different spatial locations, providing a new and powerful tool to investigate novel biological insights. As experimental technologies enhance both in capacity and efficiency, there arises a growing demand for the development of analytical methodologies.

One question in SRT data analysis is to identify genes whose expressions exhibit spatially correlated patterns, called spatially variable (SV) genes. Most current methods to identify SV genes are built upon the geostatistical model with Gaussian process, which could limit the models' ability to identify complex spatial patterns. In order to overcome this challenge and capture more types …


Utilizing Markov Chains To Estimate Allele Progression Through Generations, Ronit Gandhi Jan 2023

Utilizing Markov Chains To Estimate Allele Progression Through Generations, Ronit Gandhi

Honors Theses

All populations display patterns in allele frequencies over time. Some alleles cease to exist, while some grow to become the norm. These frequencies can shift or stay constant based on the conditions the population lives in. If in Hardy-Weinberg equilibrium, the allele frequencies stay constant. Most populations, however, have bias from environmental factors, sexual preferences, other organisms, etc. We propose a stochastic Markov chain model to study allele progression across generations. In such a model, the allele frequencies in the next generation depend only on the frequencies in the current one.

We use this model to track a recessive allele …


Hierarchical Modeling And Differential Expression Analysis For Rna-Seq Experiments With Inbred And Hybrid Genotypes, Andrew Lithio, Dan Nettleton Jul 2019

Hierarchical Modeling And Differential Expression Analysis For Rna-Seq Experiments With Inbred And Hybrid Genotypes, Andrew Lithio, Dan Nettleton

Dan Nettleton

The performance of inbred and hybrid genotypes is of interest in plant breeding and genetics. High-throughput sequencing of RNA (RNA-seq) has proven to be a useful tool in the study of the molecular genetic responses of inbreds and hybrids to environmental stresses. Commonly used experimental designs and sequencing methods lead to complex data structures that require careful attention in data analysis. We demonstrate an analysis of RNA-seq data from a split-plot design involving drought stress applied to two inbred genotypes and two hybrids formed by crosses between the inbreds. Our generalized linear modeling strategy incorporates random effects for whole-plot experimental …


The Battle Against Malaria: A Teachable Moment, Randy K. Schwartz Feb 2017

The Battle Against Malaria: A Teachable Moment, Randy K. Schwartz

Journal of Humanistic Mathematics

Malaria has been humanity’s worst public health problem throughout recorded history. Mathematical methods are needed to understand which factors are relevant to the disease and to develop counter-measures against it. This article and the accompanying exercises provide examples of those methods for use in lower- or upper-level courses dealing with probability, statistics, or population modeling. These can be used to illustrate such concepts as correlation, causation, conditional probability, and independence. The article explains how the apparent link between sickle cell trait and resistance to malaria was first verified in Uganda using the chi-squared probability distribution. It goes on to explain …


A Bayesian Shared Component Model For Genetic Association Studies, Juan J. Abellan, Carlos Abellan, Juan R. Gonzalez Nov 2010

A Bayesian Shared Component Model For Genetic Association Studies, Juan J. Abellan, Carlos Abellan, Juan R. Gonzalez

COBRA Preprint Series

We present a novel approach to address genome association studies between single nucleotide polymorphisms (SNPs) and disease. We propose a Bayesian shared component model to tease out the genotype information that is common to cases and controls from the one that is specific to cases only. This allows to detect the SNPs that show the strongest association with the disease. The model can be applied to case-control studies with more than one disease. In fact, we illustrate the use of this model with a dataset of 23,418 SNPs from a case-control study by The Welcome Trust Case Control Consortium (2007) …


Minimum Description Length And Empirical Bayes Methods Of Identifying Snps Associated With Disease, Ye Yang, David R. Bickel Nov 2010

Minimum Description Length And Empirical Bayes Methods Of Identifying Snps Associated With Disease, Ye Yang, David R. Bickel

COBRA Preprint Series

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.

We …


Assessing Population Level Genetic Instability Via Moving Average, Samuel Mcdaniel, Rebecca Betensky, Tianxi Cai Nov 2007

Assessing Population Level Genetic Instability Via Moving Average, Samuel Mcdaniel, Rebecca Betensky, Tianxi Cai

Harvard University Biostatistics Working Paper Series

No abstract provided.


Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh

Harvard University Biostatistics Working Paper Series

No abstract provided.


Multiple Tests Of Association With Biological Annotation Metadata, Sandrine Dudoit, Sunduz Keles, Mark J. Van Der Laan Mar 2006

Multiple Tests Of Association With Biological Annotation Metadata, Sandrine Dudoit, Sunduz Keles, Mark J. Van Der Laan

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a general and formal statistical framework for the multiple tests of associations between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known fixed gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating genome-wide transcript levels or DNA copy numbers to possibly censored biological and …


New Statistical Paradigms Leading To Web-Based Tools For Clinical/Translational Science, Knut M. Wittkowski May 2005

New Statistical Paradigms Leading To Web-Based Tools For Clinical/Translational Science, Knut M. Wittkowski

COBRA Preprint Series

As the field of functional genetics and genomics is beginning to mature, we become confronted with new challenges. The constant drop in price for sequencing and gene expression profiling as well as the increasing number of genetic and genomic variables that can be measured makes it feasible to address more complex questions. The success with rare diseases caused by single loci or genes has provided us with a proof-of-concept that new therapies can be developed based on functional genomics and genetics.

Common diseases, however, typically involve genetic epistasis, genomic pathways, and proteomic pattern. Moreover, to better understand the underlying biologi-cal …