Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Articles 1 - 30 of 35

Full-Text Articles in Life Sciences

Utilizing Markov Chains To Estimate Allele Progression Through Generations, Ronit Gandhi Jan 2023

Honors Theses

All populations display patterns in allele frequencies over time. Some alleles cease to exist, while some grow to become the norm. These frequencies can shift or stay constant based on the conditions the population lives in. If in Hardy-Weinberg equilibrium, the allele frequencies stay constant. Most populations, however, have bias from environmental factors, sexual preferences, other organisms, etc. We propose a stochastic Markov chain model to study allele progression across generations. In such a model, the allele frequencies in the next generation depend only on the frequencies in the current one.

We use this model to track a recessive allele …
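As a rough illustration of the Markov property described above (the next generation depends only on the current one), the following sketch builds a transition matrix for the textbook neutral Wright-Fisher model. This is an assumption chosen for illustration, not the model developed in the thesis; the names `wright_fisher_matrix`, `step`, and `distribution_after` are hypothetical.

```python
from math import comb


def wright_fisher_matrix(n_copies):
    """Return P where P[i][j] = Pr(j allele copies next generation | i now).

    Assumes neutral binomial resampling of n_copies gene copies
    (an illustrative assumption, not the thesis's model).
    """
    matrix = []
    for i in range(n_copies + 1):
        p = i / n_copies  # current allele frequency
        row = [
            comb(n_copies, j) * p**j * (1 - p) ** (n_copies - j)
            for j in range(n_copies + 1)
        ]
        matrix.append(row)
    return matrix


def step(dist, matrix):
    """Advance a probability distribution over states by one generation."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def distribution_after(start_copies, n_copies, generations):
    """Distribution of allele copy number after the given number of generations."""
    matrix = wright_fisher_matrix(n_copies)
    dist = [0.0] * (n_copies + 1)
    dist[start_copies] = 1.0  # start with a known copy number
    for _ in range(generations):
        dist = step(dist, matrix)
    return dist
```

States 0 and `n_copies` are absorbing in this sketch, which mirrors the observation that some alleles cease to exist while others become the norm.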


Genetics Of Pediatric Musculoskeletal Disorders, Lilian Antunes Jan 2021

Arts & Sciences Electronic Theses and Dissertations

Pediatric musculoskeletal disorders are an extremely broad category of diseases that are often inherited. While individually rare, collectively these disorders are common, affecting around 3% of live births in the US. Despite the mounting clinical and molecular evidence for a genetic etiology, the cause in many patients with pediatric musculoskeletal disorders remains largely unknown. Major challenges in rare pediatric diseases include recruiting large numbers of patients and determining the significance and functional impacts of variants associated with disease within individuals or families. Whole exome sequencing (WES) is a powerful tool to identify coding variants that are associated with rare pediatric …


9th Annual Postdoctoral Science Symposium, University Of Texas MD Anderson Cancer Center Postdoctoral Association Sep 2019

Annual Postdoctoral Science Symposium Abstracts

The mission of the Annual Postdoctoral Science Symposium (APSS) is to provide a platform for talented postdoctoral fellows throughout the Texas Medical Center to present their work to a wider audience. The MD Anderson Postdoctoral Association convened its inaugural Annual Postdoctoral Science Symposium (APSS) on August 4, 2011.

The APSS provides a professional venue for postdoctoral scientists to develop, clarify, and refine their research as a result of formal reviews and critiques of faculty and other postdoctoral scientists. Additionally, attendees discuss current research on a broad range of subjects while promoting academic interactions and enrichment and developing new collaborations.


Hierarchical Modeling And Differential Expression Analysis For RNA-Seq Experiments With Inbred And Hybrid Genotypes, Andrew Lithio, Dan Nettleton Jul 2019

Dan Nettleton

The performance of inbred and hybrid genotypes is of interest in plant breeding and genetics. High-throughput sequencing of RNA (RNA-seq) has proven to be a useful tool in the study of the molecular genetic responses of inbreds and hybrids to environmental stresses. Commonly used experimental designs and sequencing methods lead to complex data structures that require careful attention in data analysis. We demonstrate an analysis of RNA-seq data from a split-plot design involving drought stress applied to two inbred genotypes and two hybrids formed by crosses between the inbreds. Our generalized linear modeling strategy incorporates random effects for whole-plot experimental …


What Can We Do? Puzzling Over The Interpretation Of Heredity And Variation From Galton To Genetic Engineering, Peter J. Taylor May 2019

Working Papers on Science in a Changing World

First six chapters of a book motivated as follows: When I had mentioned to colleagues that I was exploring some significant issues overlooked by both sides in nature-nurture debates, the typical response was “we know, of course, that nature and nurture are intertwined”; they never asked “which nature-nurture science are you referring to?” It occurred to me that, in the long history of nature-nurture debates, opposing sides had always assumed or implied that these different scientific approaches were speaking to the same issues. If that were the case, then the challenge—something I was already puzzling over—was how best to draw …


Genome-Wide Systems Genetics Of Alcohol Consumption And Dependence, Kristin Mignogna Jan 2019

Theses and Dissertations

Widely effective treatment for alcohol use disorder is not yet available, because the exact biological mechanisms that underlie this disorder are not completely understood. One way to gain a better understanding of these mechanisms is to examine the genetic frameworks that contribute to the risk for developing this disorder. This dissertation examines genetic association data in combination with gene expression networks in the brain to identify functional groups of genes associated with alcohol consumption and dependence.

The first study took advantage of the behavioral complexity of human samples, and experimental capabilities provided by mouse models, by co-analyzing gene expression networks …


Impact Of Home Visit Capacity On Genetic Association Studies Of Late-Onset Alzheimer's Disease, David W. Fardo, Laura E. Gibbons, Shubhabrata Mukherjee, M. Maria Glymour, Wayne McCormick, Susan M. McCurry, James D. Bowen, Eric B. Larson, Paul K. Crane Aug 2017

Biostatistics Faculty Publications

INTRODUCTION—Findings for genetic correlates of late-onset Alzheimer's disease (LOAD) in studies that rely solely on clinic visits may differ from those with capacity to follow participants unable to attend clinic visits.

METHODS—We evaluated previously identified LOAD-risk single nucleotide variants in the prospective Adult Changes in Thought study, comparing hazard ratios (HRs) estimated using the full data set of both in-home and clinic visits (n = 1697) to HRs estimated using only data that were obtained from clinic visits (n = 1308). Models were adjusted for age, sex, principal components to account for ancestry, and additional health indicators.

RESULTS …


Spectral Gene Set Enrichment (SGSE), H Robert Frost, Zhigang Li, Jason H. Moore Mar 2015

Dartmouth Scholarship

Gene set testing is typically performed in a supervised context to quantify the association between groups of genes and a clinical phenotype. In many cases, however, a gene set-based interpretation of genomic data is desired in the absence of a phenotype variable. Although methods exist for unsupervised gene set testing, they predominantly compute enrichment relative to clusters of the genomic variables with performance strongly dependent on the clustering algorithm and number of clusters. We propose a novel method, spectral gene set enrichment (SGSE), for unsupervised competitive testing of the association between gene sets and empirical data sources. SGSE first computes …


DNA Methylation Arrays As Surrogate Measures Of Cell Mixture Distribution, Eugene Houseman, William P. Accomando, Devin C. Koestler, Brock C. Christensen, Carmen J. Marsit May 2012

Dartmouth Scholarship

There has been a long-standing need in biomedical research for a method that quantifies the normally mixed composition of leukocytes beyond what is possible by simple histological or flow cytometric assessments. The latter is restricted by the labile nature of protein epitopes, requirements for cell processing, and timely cell analysis. In a diverse array of diseases and following numerous immune-toxic exposures, leukocyte composition will critically inform the underlying immuno-biology to most chronic medical conditions. Emerging research demonstrates that DNA methylation is responsible for cellular differentiation, and when measured in whole peripheral blood, serves to distinguish cancer cases from controls.


Estimation Of A Non-Parametric Variable Importance Measure Of A Continuous Exposure, Antoine Chambaz, Pierre Neuvial, Mark J. van der Laan Oct 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

We define a new measure of variable importance of an exposure on a continuous outcome, accounting for potential confounders. The exposure features a reference level x0 with positive mass and a continuum of other levels. For the purpose of estimating it, we fully develop the semi-parametric estimation methodology called targeted minimum loss estimation methodology (TMLE) [van der Laan & Rubin, 2006; van der Laan & Rose, 2011]. We cover the whole spectrum of its theoretical study (convergence of the iterative procedure which is at the core of the TMLE methodology; consistency and asymptotic normality of the estimator), practical implementation, simulation …


A Generalized Approach For Testing The Association Of A Set Of Predictors With An Outcome: A Gene Based Test, Benjamin A. Goldstein, Alan E. Hubbard, Lisa F. Barcellos Jan 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

In many analyses, one has data on one level but desires to draw inference on another level. For example, in genetic association studies, one observes units of DNA referred to as SNPs, but wants to determine whether genes that are comprised of SNPs are associated with disease. While there are some available approaches for addressing this issue, they usually involve making parametric assumptions and are not easily generalizable. A statistical test is proposed for testing the association of a set of variables with an outcome of interest. No assumptions are made about the functional form relating the variables to the …


Clustering With Exclusion Zones: Genomic Applications, Mark Segal, Yuanyuan Xiao, Fred Huffer Dec 2010

Mark R Segal

Methods for formally evaluating the clustering of events in space or time, notably the scan statistic, have been richly developed and widely applied. In order to utilize the scan statistic and related approaches, it is necessary to know the extent of the spatial or temporal domains wherein the events arise. Implicit in their usage is that these domains have no “holes”—hereafter “exclusion zones”—regions in which events a priori cannot occur. However, in many contexts, this requirement is not met. When the exclusion zones are known, it is straightforward to correct the scan statistic for their occurrence by simply adjusting the …


Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel Dec 2010

COBRA Preprint Series

In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value as the measure of evidence using a statistical test such as the binomial test, Fisher's exact test or the hypergeometric test. However, the p-value is not interpretable as a measure of evidence apart from adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over the other, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …
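For readers unfamiliar with the conventional approach this abstract contrasts itself with, the sketch below computes an upper-tail hypergeometric p-value for overrepresentation. It illustrates only the standard test mentioned in the abstract, not the minimum description length method of the paper; the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, selected, overlap):
    """Pr(X >= overlap) for X ~ Hypergeometric(population, annotated, selected).

    population: total number of features (e.g. genes) considered
    annotated:  features associated with the biological process
    selected:   discovered features (e.g. differentially expressed genes)
    overlap:    discovered features that are also annotated
    """
    max_overlap = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Example: 10 genes, 5 annotated, 5 selected, all 5 selected are annotated.
p_value = hypergeom_enrichment_pvalue(10, 5, 5, 5)
```

In this small example only one of the 252 possible selections gives a full overlap, so the p-value is 1/252.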


A Bayesian Shared Component Model For Genetic Association Studies, Juan J. Abellan, Carlos Abellan, Juan R. Gonzalez Nov 2010

COBRA Preprint Series

We present a novel approach to address genome association studies between single nucleotide polymorphisms (SNPs) and disease. We propose a Bayesian shared component model to tease out the genotype information that is common to cases and controls from that which is specific to cases only. This allows detection of the SNPs that show the strongest association with the disease. The model can be applied to case-control studies with more than one disease. In fact, we illustrate the use of this model with a dataset of 23,418 SNPs from a case-control study by the Wellcome Trust Case Control Consortium (2007) …


Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel Nov 2010

COBRA Preprint Series

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.

We …


Powerful SNP Set Analysis For Case-Control Genome Wide Association Studies, Michael C. Wu, Peter Kraft, Michael P. Epstein, Deanne M. Taylor, Stephen J. Chanock, David J. Hunter, Xihong Lin May 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Sparse Linear Discriminant Analysis For Simultaneous Testing For The Significance Of A Gene Set/Pathway And Gene Selection, Michael C. Wu, Lingson Zhang, Zhaoxi Wang, David C. Christiani, Xihong Lin Jan 2009

Harvard University Biostatistics Working Paper Series

No abstract provided.


Estimation And Testing For The Effect Of A Genetic Pathway On A Disease Outcome Using Logistic Kernel Machine Regression Via Logistic Mixed Models, Dawei Liu, Debashis Ghosh, Xihong Lin Jun 2008

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Powerful And Flexible Multilocus Association Test For Quantitative Traits, Lydia Coulter Kwee, Dawei Liu, Xihong Lin, Debashis Ghosh, Michael P. Epstein Jun 2008

Harvard University Biostatistics Working Paper Series

No abstract provided.


Assessing Population Level Genetic Instability Via Moving Average, Samuel McDaniel, Rebecca Betensky, Tianxi Cai Nov 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Assessment Of A CGH-Based Genetic Instability, David A. Engler, Yiping Shen, J F. Gusella, Rebecca A. Betensky Jul 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Survival Analysis With Large Dimensional Covariates: An Application In Microarray Studies, David A. Engler, Yi Li Jul 2007

Harvard University Biostatistics Working Paper Series

Use of microarray technology often leads to high-dimensional, low-sample-size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time …


Statistical Evaluation Of Evidence For Clonal Allelic Alterations In Array-CGH Experiments, Colin B. Begg, Kevin Eng, Adam Olshen, E S. Venkatraman Mar 2007

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

In recent years numerous investigators have conducted genetic studies of pairs of tumor specimens from the same patient to determine whether the tumors share a clonal origin. These studies have the potential to be of considerable clinical significance, especially in clinical settings where the distinction of a new primary cancer and metastatic spread of a previous cancer would lead to radically different indications for treatment. Studies of clonality have typically involved comparison of the patterns of somatic mutations in the tumors at candidate genetic loci to see if the patterns are sufficiently similar to indicate a clonal origin. More recently, …


Power Boosting In Genome-Wide Studies Via Methods For Multivariate Outcomes, Mary J. Emond Feb 2007

UW Biostatistics Working Paper Series

Whole-genome studies are becoming a mainstay of biomedical research. Examples include expression array experiments, comparative genomic hybridization analyses and large case-control studies for detecting polymorphism/disease associations. The tactic of applying a regression model to every locus to obtain test statistics is useful in such studies. However, this approach ignores potential correlation structure in the data that could be used to gain power, particularly when a Bonferroni correction is applied to adjust for multiple testing. In this article, we propose using regression techniques for misspecified multivariate outcomes to increase statistical power over independence-based modeling at each locus. Even when the outcome …
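The Bonferroni correction mentioned in this abstract can be sketched in a few lines. This shows only the standard per-locus baseline that the article aims to improve on, not the multivariate method it proposes; the function names and the default `alpha` are illustrative assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag which per-locus p-values survive a Bonferroni correction."""
    if not p_values:
        return []
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Three loci tested at alpha = 0.05: each p-value is compared with 0.05 / 3.
flags = bonferroni_significant([0.001, 0.02, 0.2])
```

Because the threshold shrinks in proportion to the number of loci, power drops quickly in whole-genome settings, which is the loss the abstract says correlation structure could help recover.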


Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Structural Inference In Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Xihong Lin, Donglin Zeng Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Nonparametric Regression Using Local Kernel Estimating Equations For Correlated Failure Time Data, Zhangsheng Yu, Xihong Lin Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Causal Inference In Hybrid Intervention Trials Involving Treatment Choice, Qi Long, Rod Little, Xihong Lin Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Comparison Of Methods For Estimating The Causal Effect Of A Treatment In Randomized Clinical Trials Subject To Noncompliance, Rod Little, Qi Long, Xihong Lin Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Bounded Search For De Novo Identification Of Degenerate Cis-Regulatory Elements, Jonathan M. Carlson, Arijit Chakravarty, Radhika S. Khetani, Robert H. Gross May 2006

Dartmouth Scholarship

The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-counting strategy for their identification. While numerous methods exist for inferring base distributions using a position weight matrix, recent studies suggest that the independence assumptions inherent in the model, as well as the inability to reach a global optimum, limit this approach.