Open Access. Powered by Scholars. Published by Universities.®
- Institution
- Keyword
-
- Bayesian Model Averaging and Semiparametric Regression (7)
- Copula Modeling (6)
- Functional Data Analysis (6)
- Statistical Models (5)
- Forecasting and Time Series (4)
-
- Genomics (4)
- Proteomics (4)
- Image Analysis (3)
- Multivariate Models in Marketing (3)
- Statistical Theory and Methods (3)
- Biomarkers (2)
- Copulas (2)
- Functional Mixed Models (2)
- Spot detection (2)
- validation. (1)
- 2-D gel electrophoresis (1)
- 2D Gel Electrophoresis (1)
- 2D gel electrophoresis (1)
- Air pollution; Functional data analysis; Markov chain Monte Carlo; Mixture prior; Panel study; Particulate matter; Wavelets. (1)
- Archimedian Copula (1)
- Asymmetric Dependence (1)
- Bayesian Cross-Validation (1)
- Bayesian Estimation; Discrete Copula; Markov chain Monte Carlo; Gaussian Copula; Media Modeling; Probability Models; Website Page Views (1)
- Bayesian Model Selection (1)
- Bayesian Modeling (1)
- Bayesian Pair-Copula Selection (1)
- Bayesian Variable Selection; Nonparametric Regression; g-prior with point mass; Efficient Gibbs Sampler (1)
- Bayesian inference; Function-on-function regression; Functional data analysis; Functional mixed models; Wavelet regression (1)
- Bayesian methods (1)
- Bayesian methods; Comparative Genomic Hybridization; Copy number; Functional data analysis; Mixed Models; Mixture Models (1)
- Publication Year
Articles 1 - 27 of 27
Full-Text Articles in Statistical Models
Implicit Copulas From Bayesian Regularized Regression Smoothers, Nadja Klein, Michael S. Smith
Implicit Copulas From Bayesian Regularized Regression Smoothers, Nadja Klein, Michael S. Smith
Michael Stanley Smith
Variational Bayes Estimation Of Discrete-Margined Copula Models With Application To Ime Series, Ruben Loaiza-Maya, Michael S. Smith
Variational Bayes Estimation Of Discrete-Margined Copula Models With Application To Ime Series, Ruben Loaiza-Maya, Michael S. Smith
Michael Stanley Smith
Online Variational Bayes Inference For High-Dimensional Correlated Data, Sylvie T. Kabisa, Jeffrey S. Morris, David Dunson
Online Variational Bayes Inference For High-Dimensional Correlated Data, Sylvie T. Kabisa, Jeffrey S. Morris, David Dunson
Jeffrey S. Morris
High-dimensional data with hundreds of thousands of observations are becoming commonplace in many disciplines. The analysis of such data poses many computational challenges, especially when the observations are correlated over time and/or across space. In this paper we propose exible hierarchical regression models for analyzing such data that accommodate serial and/or spatial correlation. We address the computational challenges involved in fitting these models by adopting an approximate inference framework. We develop an online variational Bayes algorithm that works by incrementally reading the data into memory one portion at a time. The performance of the method is assessed through simulation studies. …
Bayesian Function-On-Function Regression For Multi-Level Functional Data, Mark J. Meyer, Brent A. Coull, Francesco Versace, Paul Cinciripini, Jeffrey S. Morris
Bayesian Function-On-Function Regression For Multi-Level Functional Data, Mark J. Meyer, Brent A. Coull, Francesco Versace, Paul Cinciripini, Jeffrey S. Morris
Jeffrey S. Morris
Medical and public health research increasingly involves the collection of more and more complex and high dimensional data. In particular, functional data|where the unit of observation is a curve or set of curves that are finely sampled over a grid -- is frequently obtained. Moreover, researchers often sample multiple curves per person resulting in repeated functional measures. A common question is how to analyze the relationship between two functional variables. We propose a general function-on-function regression model for repeatedly sampled functional data, presenting a simple model as well as a more extensive mixed model framework, along with multiple functional posterior …
Functional Regression, Jeffrey S. Morris
Functional Regression, Jeffrey S. Morris
Jeffrey S. Morris
Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and …
Depicting Estimates Using The Intercept In Meta-Regression Models: The Moving Constant Technique, Blair T. Johnson Dr., Tania B. Huedo-Medina Dr.
Depicting Estimates Using The Intercept In Meta-Regression Models: The Moving Constant Technique, Blair T. Johnson Dr., Tania B. Huedo-Medina Dr.
Blair T. Johnson
In any scientific discipline, the ability to portray research patterns graphically often aids greatly in interpreting a phenomenon. In part to depict phenomena, the statistics and capabilities of meta-analytic models have grown increasingly sophisticated. Accordingly, this article details how to move the constant in weighted meta-analysis regression models (viz. “meta-regression”) to illuminate the patterns in such models across a range of complexities. Although it is commonly ignored in practice, the constant (or intercept) in such models can be indispensible when it is not relegated to its usual static role. The moving constant technique makes possible estimates and confidence intervals at …
From Amazon To Apple: Modeling Online Retail Sales, Purchase Incidence And Visit Behavior, Anastasios Panagiotelis, Michael S. Smith, Peter Danaher
From Amazon To Apple: Modeling Online Retail Sales, Purchase Incidence And Visit Behavior, Anastasios Panagiotelis, Michael S. Smith, Peter Danaher
Michael Stanley Smith
In this study we propose a multivariate stochastic model for website visit duration, page views, purchase incidence and the sale amount for online retailers. The model is constructed by composition from carefully selected distributions, and involves copula components. It allows for the strong nonlinear relationships between the sales and visit variables to be explored in detail, and can be used to construct sales predictions. The model is readily estimated using maximum likelihood, making it an attractive choice in practice given the large sample sizes that are commonplace in online retail studies. We examine a number of top-ranked U.S. online retailers, …
Spectral Density Shrinkage For High-Dimensional Time Series, Mark Fiecas, Rainer Von Sachs
Spectral Density Shrinkage For High-Dimensional Time Series, Mark Fiecas, Rainer Von Sachs
Mark Fiecas
Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi
Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi
Jeffrey S. Morris
Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.
Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.
Design: A single-arm, phase II trial.
Patients: Twenty-seven patients with FAP.
Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.
Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were …
Bayesian Approaches To Copula Modelling, Michael S. Smith
Bayesian Approaches To Copula Modelling, Michael S. Smith
Michael Stanley Smith
Copula models have become one of the most widely used tools in the applied modelling of multivariate data. Similarly, Bayesian methods are increasingly used to obtain efficient likelihood-based inference. However, to date, there has been only limited use of Bayesian approaches in the formulation and estimation of copula models. This article aims to address this shortcoming in two ways. First, to introduce copula models and aspects of copula theory that are especially relevant for a Bayesian analysis. Second, to outline Bayesian approaches to formulating and estimating copula models, and their advantages over alternative methods. Copulas covered include Archimedean, copulas constructed …
Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris
Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris
Jeffrey S. Morris
In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational …
Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do
Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do
Jeffrey S. Morris
Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current integration approaches that treat the data are limited in that they do not consider the fundamental biological relationships that exist among the data from platforms.
Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …
Modeling Dependence Using Skew T Copulas: Bayesian Inference And Applications, Michael S. Smith, Quan Gan, Robert Kohn
Modeling Dependence Using Skew T Copulas: Bayesian Inference And Applications, Michael S. Smith, Quan Gan, Robert Kohn
Michael Stanley Smith
[THIS IS AN AUGUST 2010 REVISION THAT REPLACES ALL PREVIOUS VERSIONS.]
We construct a copula from the skew t distribution of Sahu, Dey & Branco (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose …
Estimation Of Copula Models With Discrete Margins Via Bayesian Data Augmentation, Michael S. Smith, Mohamad A. Khaled
Estimation Of Copula Models With Discrete Margins Via Bayesian Data Augmentation, Michael S. Smith, Mohamad A. Khaled
Michael Stanley Smith
Estimation of copula models with discrete margins is known to be difficult beyond the bivariate case. We show how this can be achieved by augmenting the likelihood with latent variables, and computing inference using the resulting augmented posterior. To evaluate this we propose two efficient Markov chain Monte Carlo sampling schemes. One generates the latent variables as a block using a Metropolis-Hasting step with a proposal that is close to its target distribution, the other generates them one at a time. Our method applies to all parametric copulas where the conditional copula functions can be evaluated, not just elliptical copulas …
Modeling Multivariate Distributions Using Copulas: Applications In Marketing, Peter J. Danaher, Michael S. Smith
Modeling Multivariate Distributions Using Copulas: Applications In Marketing, Peter J. Danaher, Michael S. Smith
Michael Stanley Smith
In this research we introduce a new class of multivariate probability models to the marketing literature. Known as “copula models”, they have a number of attractive features. First, they permit the combination of any univariate marginal distributions that need not come from the same distributional family. Second, a particular class of copula models, called “elliptical copula”, have the property that they increase in complexity at a much slower rate than existing multivariate probability models as the number of dimensions increase. Third, they are very general, encompassing a number of existing multivariate models, and provide a framework for generating many more. …
Bicycle Commuting In Melbourne During The 2000s Energy Crisis: A Semiparametric Analysis Of Intraday Volumes, Michael S. Smith, Goeran Kauermann
Bicycle Commuting In Melbourne During The 2000s Energy Crisis: A Semiparametric Analysis Of Intraday Volumes, Michael S. Smith, Goeran Kauermann
Michael Stanley Smith
Cycling is attracting renewed attention as a mode of transport in western urban environments, yet the determinants of usage are poorly understood. In this paper we investigate some of these using intraday bicycle volumes collected via induction loops located at ten bike paths in the city of Melbourne, Australia, between December 2005 and June 2008. The data are hourly counts at each location, with temporal and spatial disaggregation allowing for the impact of meteorology to be measured accurately for the first time. Moreover, during this period petrol prices varied dramatically and the data also provide a unique opportunity to assess …
Modeling Longitudinal Data Using A Pair-Copula Decomposition Of Serial Dependence, Michael S. Smith, Aleksey Min, Carlos Almeida, Claudia Czado
Modeling Longitudinal Data Using A Pair-Copula Decomposition Of Serial Dependence, Michael S. Smith, Aleksey Min, Carlos Almeida, Claudia Czado
Michael Stanley Smith
Copulas have proven to be very successful tools for the flexible modelling of cross-sectional dependence. In this paper we express the dependence structure of continuous-valued time series data using a sequence of bivariate copulas. This corresponds to a type of decomposition recently called a ‘vine’ in the graphical models literature, where each copula is entitled a ‘pair-copula’. We propose a Bayesian approach for the estimation of this dependence structure for longitudinal data. Bayesian selection ideas are used to identify any independence pair-copulas, with the end result being a parsimonious representation of a time-inhomogeneous Markov process of varying order. Estimates are …
Wavelet-Based Functional Linear Mixed Models: An Application To Measurement Error–Corrected Distributed Lag Models, Elizabeth J. Malloy, Jeffrey S. Morris, Sara D. Adar, Helen Suh, Diane R. Gold, Brent A. Coull
Wavelet-Based Functional Linear Mixed Models: An Application To Measurement Error–Corrected Distributed Lag Models, Elizabeth J. Malloy, Jeffrey S. Morris, Sara D. Adar, Helen Suh, Diane R. Gold, Brent A. Coull
Jeffrey S. Morris
Frequently, exposure data are measured over time on a grid of discrete values that collectively define a functional observation. In many applications, researchers are interested in using these measurements as covariates to predict a scalar response in a regression setting, with interest focusing on the most biologically relevant time window of exposure. One example is in panel studies of the health effects of particulate matter (PM), where particle levels are measured over time. In such studies, there are many more values of the functional data than observations in the data set so that regularization of the corresponding functional regression coefficient …
Members’ Discoveries: Fatal Flaws In Cancer Research, Jeffrey S. Morris
Members’ Discoveries: Fatal Flaws In Cancer Research, Jeffrey S. Morris
Jeffrey S. Morris
A recent article published in The Annals of Applied Statistics (AOAS) by two MD Anderson researchers—Keith Baggerly and Kevin Coombes—dissects results from a highly-influential series of medical papers involving genomics-driven personalized cancer therapy, and outlines a series of simple yet fatal flaws that raises serious questions about the veracity of the original results. Having immediate and strong impact, this paper, along with related work, is providing the impetus for new standards of reproducibility in scientific research.
Statistical Contributions To Proteomic Research, Jeffrey S. Morris, Keith A. Baggerly, Howard B. Gutstein, Kevin R. Coombes
Statistical Contributions To Proteomic Research, Jeffrey S. Morris, Keith A. Baggerly, Howard B. Gutstein, Kevin R. Coombes
Jeffrey S. Morris
Proteomic profiling has the potential to impact the diagnosis, prognosis, and treatment of various diseases. A number of different proteomic technologies are available that allow us to look at many proteins at once, and all of them yield complex data that raise significant quantitative challenges. Inadequate attention to these quantitative issues can prevent these studies from achieving their desired goals, and can even lead to invalid results. In this chapter, we describe various ways the involvement of statisticians or other quantitative scientists in the study team can contribute to the success of proteomic research, and we outline some of the …
Informatics And Statistics For Analyzing 2-D Gel Electrophoresis Images, Andrew W. Dowsey, Jeffrey S. Morris, Howard G. Gutstein, Guang Z. Yang
Informatics And Statistics For Analyzing 2-D Gel Electrophoresis Images, Andrew W. Dowsey, Jeffrey S. Morris, Howard G. Gutstein, Guang Z. Yang
Jeffrey S. Morris
Whilst recent progress in ‘shotgun’ peptide separation by integrated liquid chromatography and mass spectrometry (LC/MS) has enabled its use as a sensitive analytical technique, proteome coverage and reproducibility is still limited and obtaining enough replicate runs for biomarker discovery is a challenge. For these reasons, recent research demonstrates the continuing need for protein separation by two-dimensional gel electrophoresis (2-DE). However, with traditional 2-DE informatics, the digitized images are reduced to symbolic data though spot detection and quantification before proteins are compared for differential expression by spot matching. Recently, a more robust and automated paradigm has emerged where gels are directly …
Bayesian Random Segmentationmodels To Identify Shared Copy Number Aberrations For Array Cgh Data, Veerabhadran Baladandayuthapani, Yuan Ji, Rajesh Talluri, Luis E. Nieto-Barajas, Jeffrey S. Morris
Bayesian Random Segmentationmodels To Identify Shared Copy Number Aberrations For Array Cgh Data, Veerabhadran Baladandayuthapani, Yuan Ji, Rajesh Talluri, Luis E. Nieto-Barajas, Jeffrey S. Morris
Jeffrey S. Morris
Array-based comparative genomic hybridization (aCGH) is a high-resolution high-throughput technique for studying the genetic basis of cancer. The resulting data consists of log fluorescence ratios as a function of the genomic DNA location and provides a cytogenetic representation of the relative DNA copy number variation. Analysis of such data typically involves estimation of the underlying copy number state at each location and segmenting regions of DNA with similar copy number states. Most current methods proceed by modeling a single sample/array at a time, and thus fail to borrow strength across multiple samples to infer shared regions of copy number aberrations. …
Bayesian Inference For A Periodic Stochastic Volatility Model Of Intraday Electricity Prices, Michael S. Smith
Bayesian Inference For A Periodic Stochastic Volatility Model Of Intraday Electricity Prices, Michael S. Smith
Michael Stanley Smith
The Gaussian stochastic volatility model is extended to allow for periodic autoregressions (PAR) in both the level and log-volatility process. Each PAR is represented as a first order vector autoregression for a longitudinal vector of length equal to the period. The periodic stochastic volatility model is therefore expressed as a multivariate stochastic volatility model. Bayesian posterior inference is computed using a Markov chain Monte Carlo scheme for the multivariate representation. A circular prior that exploits the periodicity is suggested for the log-variance of the log-volatilities. The approach is applied to estimate a periodic stochastic volatility model for half-hourly electricity prices …
Bayesian Skew Selection For Multivariate Models, Michael S. Smith, Anastasios Panagiotelis
Bayesian Skew Selection For Multivariate Models, Michael S. Smith, Anastasios Panagiotelis
Michael Stanley Smith
We develop a Bayesian approach for the selection of skew in multivariate skew t distributions constructed through hidden conditioning in the manners suggested by either Azzalini and Capitanio (2003) or Sahu, Dey and Branco~(2003). We show that the skew coefficients for each margin are the same for the standardized versions of both distributions. We introduce binary indicators to denote whether there is symmetry, or skew, in each dimension. We adopt a proper beta prior on each non-zero skew coefficient, and derive the corresponding prior on the skew parameters. In both distributions we show that as the degrees of freedom increases, …
A Statistical Framework For The Analysis Of Chip-Seq Data, Pei Fen Kuan, Dongjun Chung, Guangjin Pan, James A. Thomson, Ron Stewart, Sunduz Keles
A Statistical Framework For The Analysis Of Chip-Seq Data, Pei Fen Kuan, Dongjun Chung, Guangjin Pan, James A. Thomson, Ron Stewart, Sunduz Keles
Sunduz Keles
Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has revolutionalized experiments for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosome occupancy. As the cost of sequencing is decreasing, many researchers are switching from microarray-based technologies (ChIP-chip) to ChIP-Seq for genome-wide study of transcriptional regulation. Despite its increasing and well-deserved popularity, there is little work that investigates and accounts for sources of biases in the ChIP-Seq technology. These biases typically arise from both the standard pre-processing protocol and the underlying DNA sequence of the generated data.
We study data from a naked DNA sequencing experiment, which sequences non-cross-linked DNA after deproteinizing and …
Additive Nonparametric Regression With Autocorrelated Errors, Michael S. Smith, C Wong, Robert Kohn
Additive Nonparametric Regression With Autocorrelated Errors, Michael S. Smith, C Wong, Robert Kohn
Michael Stanley Smith
A Bayesian approach is presented for nonparametric estimation of an additive regression model with autocorrelated errors. Each of the potentially nonlinear components is modelled as a regression spline using many knots, while the errors are modelled by a high order stationary autoregressive process parameterised in terms of its autocorrelations. The distribution of significant knots and partial autocorrelations is accounted for using subset selection. Our approach also allows the selection of a suitable transformation of the dependent variable. All aspects of the model are estimated simultaneously using Markov chain Monte Carlo. It is shown empirically that the proposed approach works well …
A Bayesian Approach To Additive Nonparametric Regression, Michael S. Smith, Robert Kohn
A Bayesian Approach To Additive Nonparametric Regression, Michael S. Smith, Robert Kohn
Michael Stanley Smith
This proceedings paper was the first to suggest using a Gaussian g-prior combined with a point mass to undertake Bayesian variable selection in a Gaussian linear regression model. It also was the first to suggest integrating out the regression parameters and variance in closed form, resulting in an efficient Gibbs sampling scheme. The idea was applied to estimate regression functions in an additive model by using a linear basis expansion for each component function in an additive model. The conference proceeding was eventually published in a slightly tighter form in Journal of Econometrics (1996).