Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

A Statistical Framework For The Analysis Of Chip-Seq Data, Pei Fen Kuan, Dongjun Chung, Guangjin Pan, James A. Thomson, Ron Stewart, Sunduz Keles Nov 2009

Sunduz Keles

Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has revolutionized experiments for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosome occupancy. As the cost of sequencing decreases, many researchers are switching from microarray-based technologies (ChIP-chip) to ChIP-Seq for genome-wide study of transcriptional regulation. Despite its increasing and well-deserved popularity, there is little work that investigates and accounts for sources of bias in the ChIP-Seq technology. These biases typically arise from both the standard pre-processing protocol and the underlying DNA sequence of the generated data.

We study data from a naked DNA sequencing experiment, which sequences non-cross-linked DNA after deproteinizing and …


Integrative Analysis Of Cancer Genomic Data, Shuangge Ma Sep 2009

Shuangge Ma

In the past decade, we have witnessed a period of unparalleled development in the field of cancer genomics. To address the same or similar biomedical questions, multiple cancer genomic studies have been independently designed and conducted. Cancer gene signatures identified from analysis of individual datasets often have low reproducibility. A cost-effective way of improving reproducibility is to conduct integrative analysis of datasets from multiple studies with comparable designs. To properly integrate multiple studies and conduct integrative analysis, we need to access various public data warehouses, retrieve experiment protocols and raw data, evaluate individual studies and select those with comparable designs, …


Identification Of Cancer-Associated Gene Pathways From Analysis Of Expression Data, Shuangge Ma Aug 2009

Shuangge Ma

No abstract provided.


Lecture 5, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Final Project, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Lecture 4, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Lecture 4, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 13, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Final Project (Description), Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Final Project (Data), Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Lecture 3, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Lecture 2, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Reference: Multiple Imputation, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Reference: Weighted Bootstrap, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 9, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 8, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Reference: Counter Examples [Bootstrap], Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 7 (Lab 2), Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 6, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Reference: Block Jackknife, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 5, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Reading: Simulate Multivariate Distribution, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 4, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 3 (Lab 1), Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Computer Intensive Methods Lecture 2, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


A Tale Of Two Streets: Incorporating Grouping Structure In High Dimensional Data Mining, Shuangge Ma Jun 2009

Shuangge Ma

No abstract provided.


Local Spectral Analysis Via A Bayesian Mixture Of Smoothing Splines, Journal Of The American Statistical Association, Sally Wood, Ori Rosen, David Stoffer Dec 2008

Sally Wood

No abstract provided.


Identification Of Yeast Transcriptional Regulation Networks Using Multivariate Random Forests, Yuanyuan Xiao, Mark Segal Dec 2008

Mark R Segal

The recent availability of whole-genome scale data sets that investigate complementary and diverse aspects of transcriptional regulation has spawned an increased need for new and effective computational approaches to analyze and integrate these large scale assays. Here, we propose a novel algorithm, based on random forest methodology, to relate gene expression (as derived from expression microarrays) to sequence features residing in gene promoters (as derived from DNA motif data) and transcription factor binding to gene promoters (as derived from tiling microarrays). We extend the random forest approach to model a multivariate response as represented, for example, by time-course gene expression …


Trans-Dimensional Metropolis-Hastings Using Parallel Chains, Sally Wood, James Pullen, Robert Kohn, David Leslie Dec 2008

Sally Wood

A general Bayesian sampling method is developed that uses parallel chains to select between models and to average the predictive density over such models. The method applies to both non-nested models and to nested models, and is particularly useful for mixtures of complex component models, where a novel approach to overcome the label-switching problem is used. The method is illustrated with real and simulated data in model-averaging over alternative financial time series models, mixtures of normal distributions, and mixtures of smoothing spline models.


Priors For A Bayesian Analysis Of Extreme Values, Sally Wood, Julian Wang Dec 2008

Sally Wood

This article proposes a new prior specification for a Bayesian analysis of the k largest order statistics model. We show that using Jeffreys priors for the end-point and shape parameters of the k largest order statistics model leads to biased estimates of the shape parameter for small to medium sample sizes and to the posterior mode of the end-point being equal to the most extreme observed value. We propose a conjugate prior for the shape parameter and a prior for the end-point which removes the posterior mode at the most extreme observed value while remaining uninformative for values of the …