Open Access. Powered by Scholars. Published by Universities.®

Medical Genetics Commons

Statistics and Probability

Articles 1 - 30 of 48

Full-Text Articles in Medical Genetics

Epigenome-Wide Association Study Of Kidney Function Identifies Trans-Ethnic And Ethnic-Specific Loci, Charles E. Breeze, Anna Batorsky, Mi Kyeong Lee, Mindy D. Szeto, Xiaoguang Xu, Daniel L. McCartney, Rong Jiang, Amit Patki, Holly J. Kramer, James M. Eales, Laura Raffield, Leslie Lange, Ethan Lange, Peter Durda, Yongmei Liu, Russ P. Tracy, David Van Den Berg, NHLBI Trans-Omics For Precision Medicine (TOPMed) Consortium, TOPMed MESA Multi-Omics Working Group, Kathryn L. Evans, William E. Kraus, Donna K. Arnett Apr 2021

Epidemiology and Environmental Health Faculty Publications

BACKGROUND: DNA methylation (DNAm) is associated with gene regulation and estimated glomerular filtration rate (eGFR), a measure of kidney function. Decreased eGFR is more common among US Hispanics and African Americans. The causes for this are poorly understood. We aimed to identify trans-ethnic and ethnic-specific differentially methylated positions (DMPs) associated with eGFR using an agnostic, genome-wide approach.

METHODS: The study included up to 5428 participants from multi-ethnic studies for discovery and 8109 participants for replication. We tested the associations between whole blood DNAm and eGFR using beta values from Illumina 450K or EPIC arrays. Ethnicity-stratified analyses were performed using linear …


Understanding The Effect Of Adaptive Mutations On The Three-Dimensional Structure Of RNA, Justin Cook Apr 2021

Undergraduate Research and Scholarship Symposium

Single-nucleotide polymorphisms (SNPs) are variations in the genome where one base pair can differ between individuals.1 SNPs occur throughout the genome and can correlate with a disease state if they occur in a functional region of DNA.1 According to the central dogma of molecular biology, any variation in the DNA sequence will have a direct effect on the RNA sequence and will potentially alter the identity or conformation of a protein product. A single RNA molecule, due to intramolecular base pairing, can acquire a plethora of 3-D conformations that are described by its structural ensemble. One SNP, rs12477830, which …


Subject Level Clustering Using A Negative Binomial Model For Small Transcriptomic Studies., Qian Li, Janelle R. Noel-MacDonnell, Devin C. Koestler, Ellen L. Goode, Brooke L. Fridley Dec 2018

Manuscripts, Articles, Book Chapters and Other Papers

BACKGROUND: Unsupervised clustering represents one of the most widely applied methods in analysis of high-throughput 'omics data. A variety of unsupervised model-based or parametric clustering methods and non-parametric clustering methods have been proposed for RNA-seq count data, most of which perform well for large samples, e.g. N ≥ 500. A common issue when analyzing limited samples of RNA-seq count data is that the data follows an over-dispersed distribution, and thus a Negative Binomial likelihood model is often used. Thus, we have developed a Negative Binomial model-based (NBMB) clustering approach for application to RNA-seq studies.

RESULTS: We have developed a Negative …


Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry Jan 2018

Theses and Dissertations

The Brisbane Longitudinal Twin Study (BLTS) was conducted in Australia and was funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as a part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed for the purpose of analyzing a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects). The primary goal was to determine what combination of SNPs and …


The Battle Against Malaria: A Teachable Moment, Randy K. Schwartz Feb 2017

Journal of Humanistic Mathematics

Malaria has been humanity’s worst public health problem throughout recorded history. Mathematical methods are needed to understand which factors are relevant to the disease and to develop counter-measures against it. This article and the accompanying exercises provide examples of those methods for use in lower- or upper-level courses dealing with probability, statistics, or population modeling. These can be used to illustrate such concepts as correlation, causation, conditional probability, and independence. The article explains how the apparent link between sickle cell trait and resistance to malaria was first verified in Uganda using the chi-squared probability distribution. It goes on to explain …
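The chi-squared test of association mentioned in this abstract can be sketched for a 2x2 table as follows. This is a minimal illustration, not the article's own exercise: the function name `chi_squared_2x2` and the counts in the usage lines are hypothetical and are not the Uganda data. It computes the Pearson statistic without a continuity correction and uses the identity that, for one degree of freedom, the upper-tail probability equals `erfc(sqrt(x / 2))`.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test of independence for the table [[a, b], [c, d]].

    Returns (statistic, p_value). No continuity correction is applied.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, where P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts (invented for illustration only):
# rows = trait present / absent, columns = infected / not infected.
statistic, p_value = chi_squared_2x2(10, 30, 40, 20)
print(f"chi-squared = {statistic:.3f}, p = {p_value:.2e}")
```

A small p-value indicates that the row and column classifications are unlikely to be independent; it does not by itself establish causation, which is one of the distinctions the article sets out to teach.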


Functional Linear Models Extensions Uncover Pleiotropic Effects Of Chronic Pain Phenotypes, Dmitri V. Zaykin, L. Qing, G. D. Slade, R. Dubner, R. B. Fillingim, J. D. Greenspan, R. Ohrbach, W. Maixner, L. B. Diatchenko, Olga A. Vsevolozhskaya Oct 2015

Biostatistics Presentations

Growing scientific evidence suggests that intricate interactions of genetic risk factors with environmental exposures play a major role in the development of chronic pain conditions. In studies of relative contribution of an individual’s genetic composition to the perception of pain, the general characteristics of pain sensitivity are typically measured by a wide range of different, yet possibly etiologically related pain phenotypes. Testing each of these pain-perception traits individually is subject to problems of multiple testing and low statistical power. Furthermore, pain-related traits may share common etiology and comprise binary, categorical, and quantitative measurements. In the current study, we propose a …


Genetics Of Obesity In Starr County, Texas Mexican Americans, Heather M. Highland May 2015

Dissertations & Theses (Open Access)

Currently, over two-thirds of Americans are classified as overweight or obese. Obesity increases risk for many other diseases including type 2 diabetes, heart disease, stroke, and cancer, making obesity the largest public health problem in America and most other Westernized nations. Hispanics have a higher rate of both obesity and type 2 diabetes, making them a particularly interesting population in which to study obesity. For the last 33 years, the Starr County Health Studies has collected an array of phenotypes and biological samples from residents of Starr County, along the Texas-Mexico border. This study includes 825 subjects who were not known …


Piscine Myocarditis Virus (PMCV) In Wild Atlantic Salmon Salmo Salar, Torstein Tengs Dr. Dec 2012

Dr. Torstein Tengs

Cardiomyopathy syndrome (CMS) is a severe cardiac disease of sea-farmed Atlantic salmon Salmo salar L., but CMS-like lesions have also been found in wild Atlantic salmon. In 2010 a double-stranded RNA virus of the Totiviridae family, provisionally named piscine myocarditis virus (PMCV), was described as the causative agent of CMS. In the present paper we report the first detection of PMCV in wild Atlantic salmon. The study is based on screening of 797 wild Atlantic salmon by real-time RT-PCR. The samples were collected from 35 different rivers along the coast of Norway, and all individuals included in the study were …


Prevalence Of Tick Borne Encephalitis Virus In Tick Nymphs In Relation To Climatic Factors On The Southern Coast Of Norway, Torstein Tengs Dr. Aug 2012

Dr. Torstein Tengs

BACKGROUND

Tick-borne encephalitis (TBE) is among the most important vector-borne diseases of humans in Europe and is currently identified as a major health problem in many countries. Over the past two decades, TBE endemic zones have expanded and the number of reported cases within endemic areas has increased. Multiple factors, including climatic change, have been proposed to explain the increased incidence of TBE. The number of TBE cases has also increased in Norway over the past decade, and the human cases cluster along the southern coast of Norway. In Norway the distribution and prevalence of TBE virus (TBEV) in tick populations …


A Strain Of Piscine Myocarditis Virus (PMCV) Infecting Argentina Silus (Ascanius), Torstein Tengs Dr. Jul 2012

Dr. Torstein Tengs

No abstract.


Quantification Of Piscine Reovirus (PRV) At Different Stages Of Atlantic Salmon Salmo Salar Production, Torstein Tengs Dr. May 2012

Dr. Torstein Tengs

The newly described piscine reovirus (PRV) appears to be associated with the development of heart and skeletal muscle inflammation (HSMI) in farmed Atlantic salmon Salmo salar L. PRV seems to be ubiquitous among fish in Norwegian salmon farms, but high viral loads and tissue distribution support a causal relationship between virus and disease. In order to improve understanding of the distribution of PRV in the salmon production line, we quantified PRV by using real-time PCR on heart samples collected at different points in the life cycle from pre-smolts to fish ready for slaughter. PRV positive pre-smolts were found in about …


James-Stein Estimation And The Benjamini-Hochberg Procedure, Debashis Ghosh Jan 2012

Debashis Ghosh

For the problem of multiple testing, the Benjamini-Hochberg (B-H) procedure has become a very popular method in applications. Based on a spacings theory representation of the B-H procedure, we are able to motivate the use of shrinkage estimators for modifying the B-H procedure. Several generalizations in the paper are discussed, and the methodology is applied to real and simulated datasets.


Shrinkage In Adaptive Procedures For False Discovery Rate Estimation In Multiple Testing: Structure And Synthesis, Debashis Ghosh Jan 2012

Debashis Ghosh

There has been much interest in the study of adaptive estimation procedures for controlling the false discovery rate (FDR). In this article, we take the direct approach to estimation of FDR of Storey (2002) and show how it can be re-expressed as a particular type of shrinkage estimator. This representation leads to natural conditions on finite-sample FDR control for a general class of shrinkage estimators. In addition, many previous proposals from the literature can be unified under this framework for which finite-sample FDR results can be developed. Some asymptotic results are also provided.
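For orientation, the direct FDR estimate of Storey (2002) that this abstract builds on can be sketched as below. This is the standard textbook form of the estimator, not the shrinkage representation developed in the article; the function name `storey_fdr_estimate`, the tuning value `lam=0.5`, and the example p-values are assumptions made for illustration.

```python
def storey_fdr_estimate(p_values, threshold, lam=0.5):
    """Direct FDR estimate at a fixed p-value rejection threshold.

    pi0_hat = #{p_i > lam} / (m * (1 - lam))  estimates the fraction of true nulls,
    and FDR_hat(t) = pi0_hat * m * t / max(R(t), 1), where R(t) = #{p_i <= t}.
    """
    m = len(p_values)
    # p-values above lam are assumed to come mostly from true null hypotheses.
    pi0_hat = sum(p > lam for p in p_values) / (m * (1.0 - lam))
    # Number of rejections at the threshold, floored at 1 to avoid dividing by zero.
    rejections = max(sum(p <= threshold for p in p_values), 1)
    return pi0_hat * m * threshold / rejections


# Hypothetical p-values, invented for illustration only.
example_p_values = [0.001, 0.004, 0.01, 0.02, 0.3, 0.45, 0.6, 0.7, 0.8, 0.95]
fdr_hat = storey_fdr_estimate(example_p_values, threshold=0.05)
print(f"Estimated FDR at t = 0.05: {fdr_hat:.3f}")
```

The estimate is left uncapped here to keep the formula visible; in practice it is common to truncate both `pi0_hat` and the final estimate at 1.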


Gene By BMI Interactions Influencing C-Reactive Protein Levels In European-Americans, Sarah Tudor Aug 2011

Dissertations & Theses (Open Access)

C-Reactive Protein (CRP) is a biomarker indicating tissue damage, inflammation, and infection. High-sensitivity CRP (hsCRP) is an emerging biomarker often used to estimate an individual’s risk for future coronary heart disease (CHD). hsCRP levels falling below 1.00 mg/l indicate a low risk for developing CHD, levels ranging between 1.00 mg/l and 3.00 mg/l indicate an elevated risk, and levels exceeding 3.00 mg/l indicate high risk. Multiple Genome-Wide Association Studies (GWAS) have identified a number of genetic polymorphisms which influence CRP levels. SNPs implicated in such studies have been found in or near genes of interest including: CRP, APOE, APOC, IL-6, …


Prevalence Of Piscine Myocarditis Virus (PMCV) In Marine Fish Species, Torstein Tengs Dr. Jan 2011

Dr. Torstein Tengs

No abstract.


Generalized Benjamini-Hochberg Procedures Using Spacings, Debashis Ghosh Jan 2011

Debashis Ghosh

For the problem of multiple testing, the Benjamini-Hochberg (B-H) procedure has become a very popular method in applications. We show how the B-H procedure can be interpreted as a test based on the spacings corresponding to the p-value distributions. Using this equivalence, we develop a class of generalized B-H procedures that maintain control of the false discovery rate in finite samples. We also consider the effect of correlation on the procedure; simulation studies are used to illustrate the methodology.
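As background, the classical B-H step-up procedure that the article generalizes can be sketched as follows. This is the standard procedure only, not the spacings-based generalization proposed by the author; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical Benjamini-Hochberg step-up procedure.

    Returns a list of booleans aligned with p_values: True means "reject".
    """
    m = len(p_values)
    # Indices of the hypotheses ordered by p-value, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank

    # Reject every hypothesis whose p-value ranks at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, invented for illustration only.
example_p_values = [0.035, 0.5, 0.001, 0.028, 0.025]
print(benjamini_hochberg(example_p_values, alpha=0.05))
```

In this example the p-value 0.025 exceeds its own rank threshold (2 × 0.05 / 5 = 0.02) but is still rejected, because larger p-values at ranks 3 and 4 fall below their thresholds. That behaviour is what makes the procedure "step-up".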


Software For Assumption Weighting For Meta-Analysis Of Genomic Data, Debashis Ghosh, Yihan Li Jan 2011

Debashis Ghosh

This is the software that accompanies Li and Ghosh, "Assumption weighting for incorporating heterogeneity into meta-analysis of genomic data."


A Causal Framework For Surrogate Endpoints With Semi-Competing Risks Data, Debashis Ghosh Jan 2011

Debashis Ghosh

In this note, we address the problem of surrogacy using a causal modelling framework that differs substantially from the potential outcomes model that pervades the biostatistical literature. The framework comes from econometrics and conceptualizes direct effects of the surrogate endpoint on the true endpoint. While this framework can incorporate the so-called semi-competing risks data structure, we also derive a fundamental non-identifiability result. Relationships to existing causal modelling frameworks are also discussed.


Propensity Score Modelling In Observational Studies Using Dimension Reduction Methods, Debashis Ghosh Jan 2011

Debashis Ghosh

Conditional independence assumptions are very important in causal inference modelling as well as in dimension reduction methodologies. These are two strikingly different statistical literatures, and we study links between the two in this article. The concept of covariate sufficiency plays an important role, and we provide theoretical justification for when dimension reduction and partial least squares methods will allow for valid causal inference to be performed. The methods are illustrated with application to a medical study and to simulated data.


A Novel Totivirus And Piscine Reovirus (PRV) In Atlantic Salmon (Salmo Salar) With Cardiomyopathy Syndrome (CMS), Torstein Tengs Nov 2010

Dr. Torstein Tengs

BACKGROUND: Cardiomyopathy syndrome (CMS) is a severe disease affecting large farmed Atlantic salmon. Mortality often appears without prior clinical signs, typically shortly prior to slaughter. We recently reported the finding and the complete genomic sequence of a novel piscine reovirus (PRV), which is associated with another cardiac disease in Atlantic salmon: heart and skeletal muscle inflammation (HSMI). In the present work we have studied whether PRV or other infectious agents may be involved in the etiology of CMS.

RESULTS: Using high throughput sequencing on heart samples from natural outbreaks of CMS and from fish experimentally challenged with material from fish diagnosed with CMS …


Heart And Skeletal Muscle Inflammation Of Farmed Salmon Is Associated With Infection With A Novel Reovirus, Torstein Tengs Jul 2010

Dr. Torstein Tengs

Atlantic salmon (Salmo salar L.) mariculture has been associated with epidemics of infectious diseases that threaten not only local production, but also wild fish coming into close proximity to marine pens and fish escaping from them. Heart and skeletal muscle inflammation (HSMI) is a frequently fatal disease of farmed Atlantic salmon. First recognized in one farm in Norway in 1999, HSMI was subsequently implicated in outbreaks in other farms in Norway and the United Kingdom. Although pathology and disease transmission studies indicated an infectious basis, efforts to identify an agent were unsuccessful. Here we provide evidence that HSMI is associated …


Non-Prejudiced Detection And Characterization Of Genetic Modifications, Torstein Tengs Jun 2010

Dr. Torstein Tengs

The application of gene technology is becoming widespread, largely thanks to the rapid increase in technology, resource, and knowledge availability. Consequently, the diversity and number of genetically modified organisms (GMOs) that may find their way into the food chain or the environment, intended or unintended, are rapidly growing. From a safety point of view the ability to detect and characterize in detail any GMO, independent of publicly available information, is fundamental. Pre-release risk assessments of GMOs are required in most jurisdictions and are usually based on application of technologies with limited ability to detect unexpected rearrangements and insertions. We present …


Comparison Of Nine Different Real-Time PCR Chemistries For Qualitative And Quantitative Applications In GMO Detection, Torstein Tengs Mar 2010

Dr. Torstein Tengs

Several techniques have been developed for detection and quantification of genetically modified organisms, but quantitative real-time PCR is by far the most popular approach. Among the most commonly used real-time PCR chemistries are TaqMan probes and SYBR green, but many other detection chemistries have also been developed. Because their performance has never been compared systematically, here we present an extensive evaluation of some promising chemistries: sequence-unspecific DNA labeling dyes (SYBR green), primer-based technologies (AmpliFluor, Plexor, Lux primers), and techniques involving double-labeled probes, comprising hybridization (molecular beacon) and hydrolysis (TaqMan, CPT, LNA, and MGB) probes, based on recently published experimental data. …


Discrete Nonparametric Algorithms For Outlier Detection With Genomic Data, Debashis Ghosh Jan 2010

Debashis Ghosh

In high-throughput studies involving genetic data such as from gene expression microarrays, differential expression analysis between two or more experimental conditions has been a very common analytical task. Much of the resulting literature on multiple comparisons has paid relatively little attention to the choice of test statistic. In this article, we focus on the issue of choice of test statistic based on a special pattern of differential expression. The approach here is based on recasting multiple comparisons procedures for assessing outlying expression values. A major complication is that the resulting p-values are discrete; some theoretical properties of sequential testing …


Detecting Outlier Genes From High-Dimensional Data: A Fuzzy Approach, Debashis Ghosh Jan 2010

Debashis Ghosh

A recent finding in cancer research has been the characterization of previously undiscovered chromosomal abnormalities in several types of solid tumors. This was found based on analyses of high-throughput data from gene expression microarrays and motivated the development of so-called 'outlier' tests for differential expression. One statistical issue was the potential discreteness of the test statistics. Using ideas from fuzzy set theory, we develop fuzzy outlier detection algorithms that have links to ideas in multiple comparisons. Two- and K-sample extensions are considered. The methodology is illustrated by application to two microarray studies.


Links Between Analysis Of Surrogate Endpoints And Endogeneity, Debashis Ghosh, Jeremy M. Taylor, Michael R. Elliott Jan 2010

Debashis Ghosh

There has been substantive interest in the assessment of surrogate endpoints in medical research. These are measures which could potentially replace "true" endpoints in clinical trials and lead to studies that require less follow-up. Recent research in the area has focused on assessments using causal inference frameworks. Beginning with a simple model for associating the surrogate and true endpoints in the population, we approach the problem as one of endogenous covariates. An instrumental variables estimator and a general two-stage algorithm are proposed. Existing surrogacy frameworks are then evaluated in the context of the model. A numerical example is used to illustrate …


Meta-Analysis For Surrogacy: Accelerated Failure Time Models And Semicompeting Risks Modelling, Debashis Ghosh, Jeremy M. Taylor, Daniel J. Sargent Jan 2010

Debashis Ghosh

There has been great recent interest in the medical and statistical literature in the assessment and validation of surrogate endpoints as proxies for clinical endpoints in medical studies. More recently, authors have focused on using meta-analytical methods for quantification of surrogacy. In this article, we extend existing procedures for analysis based on the accelerated failure time model to this setting. An advantage of this approach relative to the proportional hazards model is that it allows for analysis in the semi-competing risks setting, where we constrain the surrogate endpoint to occur before the true endpoint. A novel principal components procedure is …


Spline-Based Models For Predictiveness Curves, Debashis Ghosh, Michael Sabel Jan 2010

Debashis Ghosh

A biomarker is defined to be a biological characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention. The use of biomarkers in cancer has been advocated for a variety of purposes, which include use as surrogate endpoints, early detection of disease, proxies for environmental exposure and risk prediction. We deal with the latter issue in this paper. Several authors have proposed use of the predictiveness curve for assessing the capacity of a biomarker for risk prediction. For most situations, it is reasonable to assume monotonicity of …


Combining Multiple Models With Survival Data: The Phase Algorithm, Debashis Ghosh, Zheng Yuan Jan 2010

Debashis Ghosh

In many scientific studies, one common goal is to develop good prediction rules based on a set of available measurements. This paper proposes a model averaging methodology using proportional hazards regression models to construct new estimators of predicted survival probabilities. A screening step based on an adaptive searching algorithm is used to handle large numbers of covariates. The finite-sample properties of the proposed methodology are assessed using simulation studies. Application of the method to a cancer biomarker study is also given.


Semiparametric Analysis Of Recurrent Events: Artificial Censoring, Truncation, Pairwise Estimation And Inference, Debashis Ghosh Dec 2009

Debashis Ghosh

The analysis of recurrent failure time data from longitudinal studies can be complicated by the presence of dependent censoring. A substantial literature has developed based on an artificial censoring device. We explore in this article the connection between this class of methods and truncated data structures. In addition, a new procedure is developed for estimation and inference in a joint model for recurrent events and dependent censoring. Estimation proceeds using a mixed U-statistic based estimating function approach. New resampling-based methods for variance estimation and model checking are also described. The methods are illustrated by application to …