Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Open Access. Powered by Scholars. Published by Universities.®

Longitudinal Data Analysis and Time Series

Selected Works

Articles 1 - 30 of 40

Full-Text Articles in Statistical Models

Inversion Copulas From Nonlinear State Space Models With An Application To Inflation Forecasting, Michael S. Smith, Worapree Ole Maneesoonthorn May 2018

Inversion Copulas From Nonlinear State Space Models With An Application To Inflation Forecasting, Michael S. Smith, Worapree Ole Maneesoonthorn

Michael Stanley Smith

We propose the construction of copulas through the inversion of nonlinear state space models. These copulas allow for new time series models that have the same serial dependence structure as a state space model, but with an arbitrary marginal distribution, and flexible density forecasts. We examine the time series properties of the copulas, outline serial dependence measures, and estimate the models using likelihood-based methods. Copulas constructed from three example state space models are considered: a stochastic volatility model with an unobserved component, a Markov switching autoregression, and a Gaussian linear unobserved component model. We show that all three inversion copulas …


Variational Bayes Estimation Of Discrete-Margined Copula Models With Application To Ime Series, Ruben Loaiza-Maya, Michael S. Smith Nov 2017

Variational Bayes Estimation Of Discrete-Margined Copula Models With Application To Ime Series, Ruben Loaiza-Maya, Michael S. Smith

Michael Stanley Smith

We propose a new variational Bayes estimator for high-dimensional copulas with discrete, or a combination of discrete and continuous, margins. The method is based on a variational approximation to a tractable augmented posterior, and is faster than previous likelihood-based approaches. We use it to estimate drawable vine copulas for univariate and multivariate Markov ordinal and mixed time series. These have dimension $rT$, where $T$ is the number of observations and $r$ is the number of series, and are difficult to estimate using previous methods. 
The vine pair-copulas are carefully selected to allow for heteroskedasticity, which is a feature of most ordinal …


Online Variational Bayes Inference For High-Dimensional Correlated Data, Sylvie T. Kabisa, Jeffrey S. Morris, David Dunson Jan 2016

Online Variational Bayes Inference For High-Dimensional Correlated Data, Sylvie T. Kabisa, Jeffrey S. Morris, David Dunson

Jeffrey S. Morris

High-dimensional data with hundreds of thousands of observations are becoming commonplace in many disciplines. The analysis of such data poses many computational challenges, especially when the observations are correlated over time and/or across space. In this paper we propose exible hierarchical regression models for analyzing such data that accommodate serial and/or spatial correlation. We address the computational challenges involved in fitting these models by adopting an approximate inference framework. We develop an online variational Bayes algorithm that works by incrementally reading the data into memory one portion at a time. The performance of the method is assessed through simulation studies. …


Functional Car Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris Jan 2016

Functional Car Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris

Jeffrey S. Morris

We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on …


Bayesian Function-On-Function Regression For Multi-Level Functional Data, Mark J. Meyer, Brent A. Coull, Francesco Versace, Paul Cinciripini, Jeffrey S. Morris Jan 2015

Bayesian Function-On-Function Regression For Multi-Level Functional Data, Mark J. Meyer, Brent A. Coull, Francesco Versace, Paul Cinciripini, Jeffrey S. Morris

Jeffrey S. Morris

Medical and public health research increasingly involves the collection of more and more complex and high dimensional data. In particular, functional data|where the unit of observation is a curve or set of curves that are finely sampled over a grid -- is frequently obtained. Moreover, researchers often sample multiple curves per person resulting in repeated functional measures. A common question is how to analyze the relationship between two functional variables. We propose a general function-on-function regression model for repeatedly sampled functional data, presenting a simple model as well as a more extensive mixed model framework, along with multiple functional posterior …


Functional Regression, Jeffrey S. Morris Jan 2015

Functional Regression, Jeffrey S. Morris

Jeffrey S. Morris

Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and …


On Likelihood Ratio Tests When Nuisance Parameters Are Present Only Under The Alternative, Cz Di, K-Y Liang Jan 2014

On Likelihood Ratio Tests When Nuisance Parameters Are Present Only Under The Alternative, Cz Di, K-Y Liang

Chongzhi Di

In parametric models, when one or more parameters disappear under the null hypothesis, the likelihood ratio test statistic does not converge to chi-square distributions. Rather, its limiting distribution is shown to be equivalent to that of the supremum of a squared Gaussian process. However, the limiting distribution is analytically intractable for most of examples, and approximation or simulation based methods must be used to calculate the p values. In this article, we investigate conditions under which the asymptotic distributions have analytically tractable forms, based on the principal component decomposition of Gaussian processes. When these conditions are not satisfied, the principal …


Garma Toolbox For Matlab, Mehdi Jalalpour Jan 2014

Garma Toolbox For Matlab, Mehdi Jalalpour

Mehdi Jalalpour

No abstract provided.


Spectral Density Shrinkage For High-Dimensional Time Series, Mark Fiecas, Rainer Von Sachs Dec 2013

Spectral Density Shrinkage For High-Dimensional Time Series, Mark Fiecas, Rainer Von Sachs

Mark Fiecas

Time series data obtained from neurophysiological signals is often high-dimensional and the length of the time series is often short relative to the number of dimensions. Thus, it is difficult or sometimes impossible to compute statistics that are based on the spectral density matrix because these matrices are numerically unstable. In this work, we discuss the importance of regularization for spectral analysis of high-dimensional time series and propose shrinkage estimation for estimating high-dimensional spectral density matrices. The shrinkage estimator is derived from a penalized log-likelihood, and the optimal penalty parameter has a closed-form solution, which can be estimated using the …


Hierarchical Vector Auto-Regressive Models And Their Applications To Multi-Subject Effective Connectivity, Cristina Gorrostieta, Mark Fiecas, Hernando Ombao, Erin Burke, Steven Cramer Oct 2013

Hierarchical Vector Auto-Regressive Models And Their Applications To Multi-Subject Effective Connectivity, Cristina Gorrostieta, Mark Fiecas, Hernando Ombao, Erin Burke, Steven Cramer

Mark Fiecas

Vector auto-regressive (VAR) models typically form the basis for constructing directed graphical models for investigating connectivity in a brain network with brain regions of interest (ROIs) as nodes. There are limitations in the standard VAR models. The number of parameters in the VAR model increases quadratically with the number of ROIs and linearly with the order of the model and thus due to the large number of parameters, the model could pose serious estimation problems. Moreover, when applied to imaging data, the standard VAR model does not account for variability in the connectivity structure across all subjects. In this paper, …


Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi Jan 2013

Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi

Jeffrey S. Morris

Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.

Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.

Design: A single-arm, phase II trial.

Patients: Twenty-seven patients with FAP.

Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.

Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were …


A Comparison Of Periodic Autoregressive And Dynamic Factor Models In Intraday Energy Demand Forecasting, Thomas Mestekemper, Goeran Kauermann, Michael Smith Dec 2012

A Comparison Of Periodic Autoregressive And Dynamic Factor Models In Intraday Energy Demand Forecasting, Thomas Mestekemper, Goeran Kauermann, Michael Smith

Michael Stanley Smith

We suggest a new approach for forecasting energy demand at an intraday resolution. Demand in each intraday period is modeled using semiparametric regression smoothing to account for calendar and weather components. Residual serial dependence is captured by one of two multivariate stationary time series models, with dimension equal to the number of intraday periods. These are a periodic autoregression and a dynamic factor model. We show the benefits of our approach in the forecasting of district heating demand in a steam network in Germany and aggregate electricity demand in the state of Victoria, Australia. In both studies, accounting for weather …


Bayesian Approaches To Copula Modelling, Michael S. Smith Dec 2012

Bayesian Approaches To Copula Modelling, Michael S. Smith

Michael Stanley Smith

Copula models have become one of the most widely used tools in the applied modelling of multivariate data. Similarly, Bayesian methods are increasingly used to obtain efficient likelihood-based inference. However, to date, there has been only limited use of Bayesian approaches in the formulation and estimation of copula models. This article aims to address this shortcoming in two ways. First, to introduce copula models and aspects of copula theory that are especially relevant for a Bayesian analysis. Second, to outline Bayesian approaches to formulating and estimating copula models, and their advantages over alternative methods. Copulas covered include Archimedean, copulas constructed …


Time Series, Unit Roots, And Cointegration: An Introduction, Lonnie K. Stevans Dec 2012

Time Series, Unit Roots, And Cointegration: An Introduction, Lonnie K. Stevans

Lonnie K. Stevans

The econometric literature on unit roots took off after the publication of the paper by Nelson and Plosser (1982) that argued that most macroeconomic series have unit roots and that this is important for the analysis of macroeconomic policy. Yule (1926) suggested that regressions based on trending time series data can be spurious. This problem of spurious correlation was further pursued by Granger and Newbold (1974) and this also led to the development of the concept of cointegration (lack of cointegration implies spurious regression). The pathbreaking paper by Granger (1981), first presented at a conference at the University of Florida …


Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris Jan 2012

Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris

Jeffrey S. Morris

In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational …


Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do Jan 2012

Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do

Jeffrey S. Morris

Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current integration approaches that treat the data are limited in that they do not consider the fundamental biological relationships that exist among the data from platforms.

Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …


Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di Jan 2012

Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di

Chongzhi Di

To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcome, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to the so-called “induced informative censoring” in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …


Modeling Dependence Using Skew T Copulas: Bayesian Inference And Applications, Michael S. Smith, Quan Gan, Robert Kohn Dec 2011

Modeling Dependence Using Skew T Copulas: Bayesian Inference And Applications, Michael S. Smith, Quan Gan, Robert Kohn

Michael Stanley Smith

[THIS IS AN AUGUST 2010 REVISION THAT REPLACES ALL PREVIOUS VERSIONS.]

We construct a copula from the skew t distribution of Sahu, Dey & Branco (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose …


Estimation Of Copula Models With Discrete Margins Via Bayesian Data Augmentation, Michael S. Smith, Mohamad A. Khaled Dec 2011

Estimation Of Copula Models With Discrete Margins Via Bayesian Data Augmentation, Michael S. Smith, Mohamad A. Khaled

Michael Stanley Smith

Estimation of copula models with discrete margins is known to be difficult beyond the bivariate case. We show how this can be achieved by augmenting the likelihood with latent variables, and computing inference using the resulting augmented posterior. To evaluate this we propose two efficient Markov chain Monte Carlo sampling schemes. One generates the latent variables as a block using a Metropolis-Hasting step with a proposal that is close to its target distribution, the other generates them one at a time. Our method applies to all parametric copulas where the conditional copula functions can be evaluated, not just elliptical copulas …


Multilevel Latent Class Models With Dirichlet Mixing Distribution, Chong-Zhi Di, Karen Bandeen-Roche Jan 2011

Multilevel Latent Class Models With Dirichlet Mixing Distribution, Chong-Zhi Di, Karen Bandeen-Roche

Chongzhi Di

Latent class analysis (LCA) and latent class regression (LCR) are widely used for modeling multivariate categorical outcomes in social sciences and biomedical studies. Standard analyses assume data of different respondents to be mutually independent, excluding application of the methods to familial and other designs in which participants are clustered. In this paper, we consider multilevel latent class models, in which sub-population mixing probabilities are treated as random effects that vary among clusters according to a common Dirichlet distribution. We apply the Expectation-Maximization (EM) algorithm for model fitting by maximum likelihood (ML). This approach works well, but is computationally intensive when …


Likelihood Ratio Testing For Admixture Models With Application To Genetic Linkage Analysis, Chong-Zhi Di, Kung-Yee Liang Jan 2011

Likelihood Ratio Testing For Admixture Models With Application To Genetic Linkage Analysis, Chong-Zhi Di, Kung-Yee Liang

Chongzhi Di

We consider likelihood ratio tests (LRT) and their modifications for homogeneity in admixture models. The admixture model is a special case of two component mixture model, where one component is indexed by an unknown parameter while the parameter value for the other component is known. It has been widely used in genetic linkage analysis under heterogeneity, in which the kernel distribution is binomial. For such models, it is long recognized that testing for homogeneity is nonstandard and the LRT statistic does not converge to a conventional 2 distribution. In this paper, we investigate the asymptotic behavior of the LRT for …


Rejoinder: Estimation Issues For Copulas Applied To Marketing Data, Peter Danaher, Michael Smith Dec 2010

Rejoinder: Estimation Issues For Copulas Applied To Marketing Data, Peter Danaher, Michael Smith

Michael Stanley Smith

Estimating copula models using Bayesian methods presents some subtle challenges, ranging from specification of the prior to computational tractability. There is also some debate about what is the most appropriate copula to employ from those available. We address these issues here and conclude by discussing further applications of copula models in marketing.


Forecasting Television Ratings, Peter Danaher, Tracey Dagger, Michael Smith Dec 2010

Forecasting Television Ratings, Peter Danaher, Tracey Dagger, Michael Smith

Michael Stanley Smith

Despite the state of flux in media today, television remains the dominant player globally for advertising spend. Since television advertising time is purchased on the basis of projected future ratings, and ad costs have skyrocketed, there is increasing pressure to forecast television ratings accurately. Previous forecasting methods are not generally very reliable and many have not been validated, but more distressingly, none have been tested in today’s multichannel environment. In this study we compare 8 different forecasting models, ranging from a naïve empirical method to a state-of-the-art Bayesian model-averaging method. Our data come from a recent time period, 2004-2008 in …


Windows Executable For Gaussian Copula With Nbd Margins, Michael S. Smith Dec 2010

Windows Executable For Gaussian Copula With Nbd Margins, Michael S. Smith

Michael Stanley Smith

This is an example Windows 32bit program to estimate a Gaussian copula model with NBD margins. The margins are estimated first using MLE, and the copula second using Bayesian MCMC. The model was discussed in Danaher & Smith (2011; Marketing Science) as example 4 (section 4.2).


Modeling Multivariate Distributions Using Copulas: Applications In Marketing, Peter J. Danaher, Michael S. Smith Dec 2010

Modeling Multivariate Distributions Using Copulas: Applications In Marketing, Peter J. Danaher, Michael S. Smith

Michael Stanley Smith

In this research we introduce a new class of multivariate probability models to the marketing literature. Known as “copula models”, they have a number of attractive features. First, they permit the combination of any univariate marginal distributions that need not come from the same distributional family. Second, a particular class of copula models, called “elliptical copula”, have the property that they increase in complexity at a much slower rate than existing multivariate probability models as the number of dimensions increase. Third, they are very general, encompassing a number of existing multivariate models, and provide a framework for generating many more. …


Bicycle Commuting In Melbourne During The 2000s Energy Crisis: A Semiparametric Analysis Of Intraday Volumes, Michael S. Smith, Goeran Kauermann Dec 2010

Bicycle Commuting In Melbourne During The 2000s Energy Crisis: A Semiparametric Analysis Of Intraday Volumes, Michael S. Smith, Goeran Kauermann

Michael Stanley Smith

Cycling is attracting renewed attention as a mode of transport in western urban environments, yet the determinants of usage are poorly understood. In this paper we investigate some of these using intraday bicycle volumes collected via induction loops located at ten bike paths in the city of Melbourne, Australia, between December 2005 and June 2008. The data are hourly counts at each location, with temporal and spatial disaggregation allowing for the impact of meteorology to be measured accurately for the first time. Moreover, during this period petrol prices varied dramatically and the data also provide a unique opportunity to assess …


The Generalized Shrinkage Estimator For The Analysis Of Functional Connectivity Of Brain Signals, Mark Fiecas, Hernando Ombao Dec 2010

The Generalized Shrinkage Estimator For The Analysis Of Functional Connectivity Of Brain Signals, Mark Fiecas, Hernando Ombao

Mark Fiecas

We develop a new statistical method for estimating functional connectivity between neurophysiological signals represented by a multivariate time series. We use partial coherence as the measure of functional connectivity. Partial coherence identifies the frequency bands that drive the direct linear association between any pair of channels. To estimate partial coherence, one would first need an estimate of the spectral density matrix of the multivariate time series. Parametric estimators of the spectral density matrix provide good frequency resolution but could be sensitive when the parametric model is misspecified. Smoothing-based nonparametric estimators are robust to model misspecification and are consistent but may …


Modeling Longitudinal Data Using A Pair-Copula Decomposition Of Serial Dependence, Michael S. Smith, Aleksey Min, Carlos Almeida, Claudia Czado Nov 2010

Modeling Longitudinal Data Using A Pair-Copula Decomposition Of Serial Dependence, Michael S. Smith, Aleksey Min, Carlos Almeida, Claudia Czado

Michael Stanley Smith

Copulas have proven to be very successful tools for the flexible modelling of cross-sectional dependence. In this paper we express the dependence structure of continuous-valued time series data using a sequence of bivariate copulas. This corresponds to a type of decomposition recently called a ‘vine’ in the graphical models literature, where each copula is entitled a ‘pair-copula’. We propose a Bayesian approach for the estimation of this dependence structure for longitudinal data. Bayesian selection ideas are used to identify any independence pair-copulas, with the end result being a parsimonious representation of a time-inhomogeneous Markov process of varying order. Estimates are …


Wavelet-Based Functional Linear Mixed Models: An Application To Measurement Error–Corrected Distributed Lag Models, Elizabeth J. Malloy, Jeffrey S. Morris, Sara D. Adar, Helen Suh, Diane R. Gold, Brent A. Coull Jan 2010

Wavelet-Based Functional Linear Mixed Models: An Application To Measurement Error–Corrected Distributed Lag Models, Elizabeth J. Malloy, Jeffrey S. Morris, Sara D. Adar, Helen Suh, Diane R. Gold, Brent A. Coull

Jeffrey S. Morris

Frequently, exposure data are measured over time on a grid of discrete values that collectively define a functional observation. In many applications, researchers are interested in using these measurements as covariates to predict a scalar response in a regression setting, with interest focusing on the most biologically relevant time window of exposure. One example is in panel studies of the health effects of particulate matter (PM), where particle levels are measured over time. In such studies, there are many more values of the functional data than observations in the data set so that regularization of the corresponding functional regression coefficient …


Members’ Discoveries: Fatal Flaws In Cancer Research, Jeffrey S. Morris Jan 2010

Members’ Discoveries: Fatal Flaws In Cancer Research, Jeffrey S. Morris

Jeffrey S. Morris

A recent article published in The Annals of Applied Statistics (AOAS) by two MD Anderson researchers—Keith Baggerly and Kevin Coombes—dissects results from a highly-influential series of medical papers involving genomics-driven personalized cancer therapy, and outlines a series of simple yet fatal flaws that raises serious questions about the veracity of the original results. Having immediate and strong impact, this paper, along with related work, is providing the impetus for new standards of reproducibility in scientific research.