Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons

Open Access. Powered by Scholars. Published by Universities.®

COBRA

Medicine and Health Sciences

Keyword
Publication Year
Publication

Articles 1 - 18 of 18

Full-Text Articles in Genetics and Genomics

Supervised Dimension Reduction For Large-Scale "Omics" Data With Censored Survival Outcomes Under Possible Non-Proportional Hazards, Lauren Spirko-Burns, Karthik Devarajan Mar 2019

Supervised Dimension Reduction For Large-Scale "Omics" Data With Censored Survival Outcomes Under Possible Non-Proportional Hazards, Lauren Spirko-Burns, Karthik Devarajan

COBRA Preprint Series

The past two decades have witnessed significant advances in high-throughput ``omics" technologies such as genomics, proteomics, metabolomics, transcriptomics and radiomics. These technologies have enabled simultaneous measurement of the expression levels of tens of thousands of features from individual patient samples and have generated enormous amounts of data that require analysis and interpretation. One specific area of interest has been in studying the relationship between these features and patient outcomes, such as overall and recurrence-free survival, with the goal of developing a predictive ``omics" profile. Large-scale studies often suffer from the presence of a large fraction of censored observations and potential …


Estimating The Probability Of Clonal Relatedness Of Pairs Of Tumors In Cancer Patients, Audrey Mauguen, Venkatraman E. Seshan, Irina Ostrovnaya, Colin B. Begg Feb 2017

Estimating The Probability Of Clonal Relatedness Of Pairs Of Tumors In Cancer Patients, Audrey Mauguen, Venkatraman E. Seshan, Irina Ostrovnaya, Colin B. Begg

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

Next generation sequencing panels are being used increasingly in cancer research to study tumor evolution. A specific statistical challenge is to compare the mutational profiles in different tumors from a patient to determine the strength of evidence that the tumors are clonally related, i.e. derived from a single, founder clonal cell. The presence of identical mutations in each tumor provides evidence of clonal relatedness, although the strength of evidence from a match is related to how commonly the mutation is seen in the tumor type under investigation. This evidence must be weighed against the evidence in favor of independent tumors …


Conditional Screening For Ultra-High Dimensional Covariates With Survival Outcomes, Hyokyoung Grace Hong, Jian Kang, Yi Li Mar 2016

Conditional Screening For Ultra-High Dimensional Covariates With Survival Outcomes, Hyokyoung Grace Hong, Jian Kang, Yi Li

The University of Michigan Department of Biostatistics Working Paper Series

Identifying important biomarkers that are predictive for cancer patients' prognosis is key in gaining better insights into the biological influences on the disease and has become a critical component of precision medicine. The emergence of large-scale biomedical survival studies, which typically involve excessive number of biomarkers, has brought high demand in designing efficient screening tools for selecting predictive biomarkers. The vast amount of biomarkers defies any existing variable selection methods via regularization. The recently developed variable screening methods, though powerful in many practical setting, fail to incorporate prior information on the importance of each biomarker and are less powerful in …


Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang Feb 2016

Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for …


Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Computational Model For Survey And Trend Analysis Of Patients With Endometriosis : A Decision Aid Tool For Ebm, Salvo Reina, Vito Reina, Franco Ameglio, Mauro Costa, Alessandro Fasciani Feb 2014

Computational Model For Survey And Trend Analysis Of Patients With Endometriosis : A Decision Aid Tool For Ebm, Salvo Reina, Vito Reina, Franco Ameglio, Mauro Costa, Alessandro Fasciani

COBRA Preprint Series

Endometriosis is increasingly collecting worldwide attention due to its medical complexity and social impact. The European community has identified this as a “social disease”. A large amount of information comes from scientists, yet several aspects of this pathology and staging criteria need to be clearly defined on a suitable number of individuals. In fact, available studies on endometriosis are not easily comparable due to a lack of standardized criteria to collect patients’ informations and scarce definitions of symptoms. Currently, only retrospective surgical stadiation is used to measure pathology intensity, while the Evidence Based Medicine (EBM) requires shareable methods and correct …


Minimum Description Length And Empirical Bayes Methods Of Identifying Snps Associated With Disease, Ye Yang, David R. Bickel Nov 2010

Minimum Description Length And Empirical Bayes Methods Of Identifying Snps Associated With Disease, Ye Yang, David R. Bickel

COBRA Preprint Series

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.

We …


Power Boosting In Genome-Wide Studies Via Methods For Multivariate Outcomes, Mary J. Emond Feb 2007

Power Boosting In Genome-Wide Studies Via Methods For Multivariate Outcomes, Mary J. Emond

UW Biostatistics Working Paper Series

Whole-genome studies are becoming a mainstay of biomedical research. Examples include expression array experiments, comparative genomic hybridization analyses and large case-control studies for detecting polymorphism/disease associations. The tactic of applying a regression model to every locus to obtain test statistics is useful in such studies. However, this approach ignores potential correlation structure in the data that could be used to gain power, particularly when a Bonferroni correction is applied to adjust for multiple testing. In this article, we propose using regression techniques for misspecified multivariate outcomes to increase statistical power over independence-based modeling at each locus. Even when the outcome …


Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh

Harvard University Biostatistics Working Paper Series

No abstract provided.


Structural Inference In Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Xihong Lin, Donglin Zeng Aug 2006

Structural Inference In Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Xihong Lin, Donglin Zeng

Harvard University Biostatistics Working Paper Series

No abstract provided.


Estimation In Semiparametric Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Donglin Zeng, Xihong Lin Aug 2006

Estimation In Semiparametric Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Donglin Zeng, Xihong Lin

Harvard University Biostatistics Working Paper Series

No abstract provided.


Nonparametric Regression Using Local Kernel Estimating Equations For Correlated Failure Time Data, Zhangsheng Yu, Xihong Lin Aug 2006

Nonparametric Regression Using Local Kernel Estimating Equations For Correlated Failure Time Data, Zhangsheng Yu, Xihong Lin

Harvard University Biostatistics Working Paper Series

No abstract provided.


Causal Inference In Hybrid Intervention Trials Involving Treatment Choice, Qi Long, Rod Little, Xihong Lin Aug 2006

Causal Inference In Hybrid Intervention Trials Involving Treatment Choice, Qi Long, Rod Little, Xihong Lin

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Comparison Of Methods For Estimating The Causal Effect Of A Treatment In Randomized Clinical Trials Subject To Noncompliance, Rod Little, Qi Long, Xihong Lin Aug 2006

A Comparison Of Methods For Estimating The Causal Effect Of A Treatment In Randomized Clinical Trials Subject To Noncompliance, Rod Little, Qi Long, Xihong Lin

Harvard University Biostatistics Working Paper Series

No abstract provided.


Effect Of Misreported Family History On Mendelian Mutation Prediction Models, Hormuzd A. Katki Sep 2004

Effect Of Misreported Family History On Mendelian Mutation Prediction Models, Hormuzd A. Katki

Johns Hopkins University, Dept. of Biostatistics Working Papers

People with familial history of disease often consult with genetic counselors about their chance of carrying mutations that increase disease risk. To aid them, genetic counselors use Mendelian models that predict whether the person carries deleterious mutations based on their reported family history. Such models rely on accurate reporting of each member's diagnosis and age of diagnosis, but this information may be inaccurate. Commonly encountered errors in family history can significantly distort predictions, and thus can alter the clinical management of people undergoing counseling, screening, or genetic testing. We derive general results about the distortion in the carrier probability estimate …


Accuracy Of Msi Testing In Predicting Germline Mutations Of Msh2 And Mlh1: A Case Study In Bayesian Meta-Analysis Of Diagnostic Tests Without A Gold Standard, Sining Chen, Patrice Watson, Giovanni Parmigiani Jun 2004

Accuracy Of Msi Testing In Predicting Germline Mutations Of Msh2 And Mlh1: A Case Study In Bayesian Meta-Analysis Of Diagnostic Tests Without A Gold Standard, Sining Chen, Patrice Watson, Giovanni Parmigiani

Johns Hopkins University, Dept. of Biostatistics Working Papers

Microsatellite instability (MSI) testing is a common screening procedure used to identify families that may harbor mutations of a mismatch repair gene and therefore may be at high risk for hereditary colorectal cancer. A reliable estimate of sensitivity and specificity of MSI for detecting germline mutations of mismatch repair genes is critical in genetic counseling and colorectal cancer prevention. Several studies published results of both MSI and mutation analysis on the same subjects. In this article we perform a meta-analysis of these studies and obtain estimates that can be directly used in counseling and screening. In particular we estimate the …


Selecting Differentially Expressed Genes From Microarray Experiments, Margaret S. Pepe, Gary M. Longton, Garnet L. Anderson, Michel Schummer Jan 2003

Selecting Differentially Expressed Genes From Microarray Experiments, Margaret S. Pepe, Gary M. Longton, Garnet L. Anderson, Michel Schummer

UW Biostatistics Working Paper Series

High throughput technologies, such as gene expression arrays and protein mass spectrometry, allow one to simultaneously evaluate thousands of potential biomarkers that distinguish different tissue types. Of particular interest here is cancer versus normal organ tissues. We consider statistical methods to rank genes (or proteins) in regards to differential expression between tissues. Various statistical measures are considered and we argue that two measures related to the Receiver Operating Characteristic Curve are particularly suitable for this purpose. We also propose that sampling variability in the gene rankings be quantified and suggest using the “selection probability function”, the probability distribution of rankings …


Comparative Genomic Hybridization Array Analysis, Annette M. Molinaro, Mark J. Van Der Laan, Dan H. Moore Apr 2002

Comparative Genomic Hybridization Array Analysis, Annette M. Molinaro, Mark J. Van Der Laan, Dan H. Moore

U.C. Berkeley Division of Biostatistics Working Paper Series

At the present time, there is increasing evidence that cancer may be regulated by the number of copies of genes in tumor cells. Through microarray technology it is now possible to measure the number of copies of thousands of genes and gene segments in samples of chromosomal DNA. Microarray comparative genomic hybridization (array CGH) provides the opportunity to both measure DNA sequence copy number gains and losses and map these aberrations to the genomic sequence. Gains can signify the over-expression of oncogenes, genes which stimulate cell growth and have become hyperactive, while losses can signify under-expression of tumor suppressor genes, …