Open Access. Powered by Scholars. Published by Universities.®

Statistical Methodology Commons

1,131 Full-Text Articles 1,274 Authors 207,691 Downloads 56 Institutions

All Articles in Statistical Methodology

1,131 full-text articles. Page 1 of 29.

Mechanistic Mathematical Models: An Underused Platform For Hpv Research, Marc Ryser, Patti Gravitt, Evan R. Myers 2017 George Washington University

Global Health Faculty Publications

Health economic modeling has become an invaluable methodology for the design and evaluation of clinical and public health interventions against the human papillomavirus (HPV) and associated diseases. At the same time, relatively little attention has been paid to a different yet complementary class of models, namely that of mechanistic mathematical models. The primary focus of mechanistic mathematical models is to better understand the intricate biologic mechanisms and dynamics of disease. Inspired by a long and successful history of mechanistic modeling in other biomedical fields, we highlight several areas of HPV research where mechanistic models have the potential to advance the ...


Interweaving Markov Chain Monte Carlo Strategies For Efficient Estimation Of Dynamic Linear Models, Matthew Simpson, Jarad Niemi, Vivekananda Roy 2017 University of Missouri

Jarad Niemi

In dynamic linear models (DLMs) with unknown fixed parameters, a standard Markov chain Monte Carlo (MCMC) sampling strategy is to alternate sampling of latent states conditional on fixed parameters and sampling of fixed parameters conditional on latent states. In some regions of the parameter space, this standard data augmentation (DA) algorithm can be inefficient. To improve efficiency, we apply the interweaving strategies of Yu and Meng to DLMs. For this, we introduce three novel alternative DAs for DLMs: the scaled errors, wrongly scaled errors, and wrongly scaled disturbances. With the latent states and the less well known scaled disturbances, this ...
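
As a concrete picture of this standard DA scheme, here is a minimal sketch (not the authors' implementation) for the simplest DLM, a local level model with observation variance V and system variance W. It assumes theta_0 = 0 and inverse-gamma priors on both variances, and uses slow single-site state updates rather than forward-filtering backward-sampling; the sticky mixing of such samplers in parts of the parameter space is exactly the inefficiency the interweaving strategies target.

```python
import numpy as np

rng = np.random.default_rng(0)

def da_gibbs(y, n_iter=2000, a=2.0, b=1.0):
    # local level DLM: y_t = theta_t + v_t, v_t ~ N(0, V);
    #                  theta_t = theta_{t-1} + w_t, w_t ~ N(0, W), theta_0 = 0
    T = len(y)
    theta = y.astype(float).copy()                 # initialize states at the data
    V, W = 1.0, 1.0
    draws = []
    for _ in range(n_iter):
        # step 1: latent states | (V, W, y), one state at a time
        for t in range(T):
            prec = 1.0 / V + 1.0 / W + (1.0 / W if t < T - 1 else 0.0)
            m = y[t] / V + (theta[t - 1] if t > 0 else 0.0) / W
            if t < T - 1:
                m += theta[t + 1] / W
            theta[t] = rng.normal(m / prec, np.sqrt(1.0 / prec))
        # step 2: (V, W) | states, with inverse-gamma(a, b) priors
        resid = y - theta
        V = 1.0 / rng.gamma(a + T / 2, 1.0 / (b + 0.5 * resid @ resid))
        diff = np.diff(np.concatenate(([0.0], theta)))
        W = 1.0 / rng.gamma(a + T / 2, 1.0 / (b + 0.5 * diff @ diff))
        draws.append((V, W))
    return np.array(draws)

print(da_gibbs(rng.normal(size=100).cumsum(), n_iter=500).mean(axis=0))
```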


Estimation And Prediction In Spatial Models With Block Composite Likelihoods, Jo Eidsvik, Benjamin A. Shaby, Brian J. Reich, Matthew Wheeler, Jarad Niemi 2017 University of Trondheim

Jarad Niemi

This article develops a block composite likelihood for estimation and prediction in large spatial datasets. The composite likelihood (CL) is constructed from the joint densities of pairs of adjacent spatial blocks. This allows large datasets to be split into many smaller datasets, each of which can be evaluated separately, and combined through a simple summation. Estimates for unknown parameters are obtained by maximizing the block CL function. In addition, a new method for optimal spatial prediction under the block CL is presented. Asymptotic variances for both parameter estimates and predictions are computed using Godambe sandwich matrices. The approach considerably improves ...
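
A minimal one-dimensional sketch of the block CL construction, assuming an exponential covariance with parameters (sigma2, range) and treating consecutive blocks as the adjacent pairs; the paper handles general spatial block neighborhoods and adds optimal prediction, omitted here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
n = 200
s = np.sort(rng.uniform(0, 10, n))                 # 1-D spatial locations

def expcov(d, p):                                   # exponential covariance
    return p[0] * np.exp(-d / p[1])

D = np.abs(s[:, None] - s[None, :])                 # distance matrix
y = rng.multivariate_normal(np.zeros(n), expcov(D, [2.0, 1.5]))

blocks = np.array_split(np.arange(n), 10)           # 10 consecutive blocks

def neg_block_cl(log_p):
    p = np.exp(log_p)                               # enforce positivity
    nll = 0.0
    for b1, b2 in zip(blocks[:-1], blocks[1:]):     # adjacent pairs only
        idx = np.concatenate([b1, b2])
        K = expcov(D[np.ix_(idx, idx)], p) + 1e-8 * np.eye(len(idx))
        nll -= multivariate_normal(np.zeros(len(idx)), K).logpdf(y[idx])
    return nll                                      # sum over block pairs

fit = minimize(neg_block_cl, np.log([1.0, 1.0]), method="Nelder-Mead")
print(np.exp(fit.x))                                # estimates of (sigma2, range)
```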


Evaluating Modularity In Morphometric Data: Challenges With The Rv Coefficient And A New Test Measure, Dean C. Adams 2017 Iowa State University

Dean C. Adams

1: Modularity describes the case where patterns of trait covariation are unevenly dispersed across traits. Specifically, trait correlations are high and concentrated within subsets of variables (modules), but the correlations between traits across modules are relatively weaker. For morphometric datasets, hypotheses of modularity are commonly evaluated using the RV coefficient, an association statistic used in a wide variety of fields. 2: In this article I explore the properties of the RV coefficient using simulated data sets. Using data drawn from a normal distribution where the data were neither modular nor integrated in structure, I show that the RV coefficient is ...
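
For reference, a minimal sketch of the classical RV coefficient between two trait blocks, together with a simple permutation test; this illustrates the statistic under scrutiny, not the paper's proposed alternative measure.

```python
import numpy as np

def rv_coef(X, Y):
    # RV = tr(Sxy Syx) / sqrt(tr(Sxx^2) tr(Syy^2)) on centered data
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    Sxy = Xc.T @ Yc
    Sxx, Syy = Xc.T @ Xc, Yc.T @ Yc
    return np.trace(Sxy @ Sxy.T) / np.sqrt(np.trace(Sxx @ Sxx) * np.trace(Syy @ Syy))

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 6))      # traits in hypothesized module 1 (no structure)
Y = rng.normal(size=(50, 4))      # traits in hypothesized module 2
obs = rv_coef(X, Y)
perm = np.array([rv_coef(X, Y[rng.permutation(50)]) for _ in range(999)])
print(obs, np.mean(perm >= obs))  # RV coefficient and permutation p-value
```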


Session D-5: Informal Comparative Inference: What Is It?, Karen Togliatti 2017 Illinois Mathematics and Science Academy

Professional Learning Day

Come and experience a hands-on task that has middle-school students grapple with informal inferential reasoning. Three key principles of informal inference (data as evidence, probabilistic language, and generalizing ‘beyond the data’) will be discussed as students build and analyze distributions to answer the question, “Does hand dominance play a role in throwing accuracy?” Connections to the CCSSM statistics standards for middle school will be highlighted.


Evaluation Of Progress Towards The Unaids 90-90-90 Hiv Care Cascade: A Description Of Statistical Methods Used In An Interim Analysis Of The Intervention Communities In The Search Study, Laura Balzer, Joshua Schwab, Mark J. van der Laan, Maya L. Petersen 2017 Department of Biostatistics, Harvard T.H. Chan School of Public Health

U.C. Berkeley Division of Biostatistics Working Paper Series

WHO guidelines call for universal antiretroviral treatment, and UNAIDS has set a global target to virally suppress most HIV-positive individuals. Accurate estimates of population-level coverage at each step of the HIV care cascade (testing, treatment, and viral suppression) are needed to assess the effectiveness of "test and treat" strategies implemented to achieve this goal. The data available to inform such estimates, however, are susceptible to informative missingness: the number of HIV-positive individuals in a population is unknown; individuals tested for HIV may not be representative of those whom a testing intervention fails to reach, and HIV-positive individuals with a viral ...


The Logic And Limits Of Event Studies In Securities Fraud Litigation, Jill E. Fisch, Jonah B. Gelbach, Jonathan Klick 2017 University of Pennsylvania Law School

Faculty Scholarship

Event studies have become increasingly important in securities fraud litigation after the Supreme Court’s decision in Halliburton II. Litigants have used event study methodology, which empirically analyzes the relationship between the disclosure of corporate information and the issuer’s stock price, to provide evidence in the evaluation of key elements of federal securities fraud, including materiality, reliance, causation, and damages. As the use of event studies grows and they increasingly serve a gatekeeping function in determining whether litigation will proceed beyond a preliminary stage, it will be critical for courts to use them correctly.

This Article explores an array ...
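
A minimal, purely illustrative sketch of the market-model event study methodology the Article discusses: fit a stock's normal relation to the market over a pre-event estimation window, then ask whether the event-date abnormal return stands out from estimation noise. All numbers here are simulated.

```python
import numpy as np

rng = np.random.default_rng(3)
mkt = rng.normal(0.0003, 0.01, 251)                # daily market returns
stock = 0.0001 + 1.2 * mkt + rng.normal(0, 0.015, 251)
stock[-1] += -0.04                                 # drop on corrective disclosure

est_m, est_s = mkt[:250], stock[:250]              # estimation window
beta, alpha = np.polyfit(est_m, est_s, 1)          # market model fit
resid_sd = np.std(est_s - (alpha + beta * est_m), ddof=2)

ar = stock[-1] - (alpha + beta * mkt[-1])          # event-day abnormal return
print(ar / resid_sd)                               # t-statistic, roughly N(0, 1)
```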


Calculating Power By Bootstrap, With An Application To Cluster-Randomized Trials, Ken Kleinman, Susan S. Huang 2017 University of Massachusetts Amherst, School of Public Health and Health Sciences

eGEMs (Generating Evidence & Methods to improve patient outcomes)

Background: A key requirement for a useful power calculation is that the calculation mimic the data analysis that will be performed on the actual data, once it is observed. Close approximations may be difficult to achieve using analytic solutions, however, and thus Monte Carlo approaches, including both simulation and bootstrap resampling, are often attractive. One setting in which this is particularly true is cluster-randomized trial designs. However, Monte Carlo approaches are useful in many additional settings as well. Calculating power for cluster-randomized trials using analytic or simulation-based methods is frequently unsatisfactory due to the complexity of the data analysis methods ...
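
A minimal sketch of the Monte Carlo idea for a cluster-randomized design, assuming m clusters of equal size per arm, a random-intercept data-generating model, and analysis by a t-test on cluster means; the paper's bootstrap variant would resample real pilot data rather than simulate from a parametric model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def one_trial(m=10, n=20, effect=0.3, icc=0.05):
    # m clusters of size n per arm; random intercepts induce the ICC
    tau, sig = np.sqrt(icc), np.sqrt(1 - icc)
    means0 = rng.normal(0, tau, m) + rng.normal(0, sig / np.sqrt(n), m)
    means1 = effect + rng.normal(0, tau, m) + rng.normal(0, sig / np.sqrt(n), m)
    return stats.ttest_ind(means1, means0).pvalue < 0.05  # analyze cluster means

power = np.mean([one_trial() for _ in range(2000)])  # share of rejections
print(power)
```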


It's All About Balance: Propensity Score Matching In The Context Of Complex Survey Data, David Lenis, Trang Q. Nguyen, Nian Dong, Elizabeth A. Stuart 2017 Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health

Johns Hopkins University, Dept. of Biostatistics Working Papers

Many research studies aim to draw causal inferences using data from large, nationally representative survey samples, and many of these studies use propensity score matching to make those causal inferences as rigorous as possible given the non-experimental nature of the data. However, very few applied studies are careful about incorporating the survey design with the propensity score analysis, which may mean that the results don’t generate population inferences. This may be because few methodological studies examine how to best combine these methods. Furthermore, even fewer of the methodological studies incorporate different non-response mechanisms in their analysis. This study examines ...
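
A minimal sketch of one way these methods can be combined: 1:1 nearest-neighbor propensity score matching (with replacement) followed by a survey-weighted comparison among matched units. This is illustrative only; the paper evaluates several ways of incorporating the survey design, and the data-generating setup and weighting scheme here are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 1000
x = rng.normal(size=(n, 2))                        # covariates
w = rng.uniform(0.5, 3.0, n)                       # survey weights (assumed given)
t = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))    # non-randomized treatment
y = 0.5 * t + x @ [0.8, 0.3] + rng.normal(size=n)

ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
treated = np.where(t == 1)[0]
controls = np.where(t == 0)[0]
# nearest control on the propensity score for each treated unit
matches = controls[np.abs(ps[controls][None, :] - ps[treated][:, None]).argmin(1)]

# survey-weighted difference in means over the matched sample
mean_t = np.sum(w[treated] * y[treated]) / np.sum(w[treated])
mean_c = np.sum(w[matches] * y[matches]) / np.sum(w[matches])
print(mean_t - mean_c)                             # weighted matched effect estimate
```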


Time Series Copulas For Heteroskedastic Data, Michael S. Smith, Worapree Maneesoonthorn, Ruben Loaiza-Maya 2017 Melbourne Business School

Michael Stanley Smith

We propose parametric copulas that capture serial dependence in stationary heteroskedastic time series. We develop our copula for first order Markov series, and extend it to higher orders and multivariate series. We derive the copula of a volatility proxy, based on which we propose new measures of volatility dependence, including co-movement and spillover in multivariate series. In general, these depend upon the marginal distributions of the series. Using exchange rate returns, we show that the resulting copula models can capture their marginal distributions more accurately than univariate and multivariate GARCH models, and produce more accurate value at risk forecasts.
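
A minimal sketch of the first-order Markov copula idea: transform the series to pseudo-uniforms by ranks, then model dependence between consecutive observations through a copula. A Gaussian copula is fit here purely for illustration; it cannot capture the volatility dependence that motivates the paper's more flexible copulas, which the final line makes visible.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
y = np.zeros(1000)                                  # toy ARCH(1)-like series
for t in range(1, 1000):
    y[t] = rng.normal() * np.sqrt(0.2 + 0.7 * y[t - 1] ** 2)

u = stats.rankdata(y) / (len(y) + 1)                # pseudo-uniform margins
z = stats.norm.ppf(u)                               # normal scores
rho = np.corrcoef(z[:-1], z[1:])[0, 1]              # Gaussian copula parameter
vol_dep = np.corrcoef(np.abs(z[:-1]), np.abs(z[1:]))[0, 1]
# rho is near zero, yet the volatility proxy |z| is serially dependent
print(rho, vol_dep)
```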


Pointwise Influence Matrices For Functional-Response Regression, Philip T. Reiss, Lei Huang, Pei-Shien Wu, Huaihou Chen, Stan Colcombe 2016 New York University School of Medicine

Philip T. Reiss

We extend the notion of an influence or hat matrix to regression with functional responses and scalar predictors. For responses depending linearly on a set of predictors, our definition is shown to reduce to the conventional influence matrix for linear models. The pointwise degrees of freedom, the trace of the pointwise hat matrix, are shown to have an adaptivity property that motivates a two-step bivariate smoother for modeling nonlinear dependence on a single predictor. This procedure adapts to varying complexity of the nonlinear model at different locations along the function, and thereby achieves better performance than competing tensor product smoothers ...
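
A minimal sketch of the pointwise hat matrix idea: at each location s along the functional response, fitted values take the form y_hat(s) = H(s) y(s), and the pointwise degrees of freedom are trace(H(s)). A ridge-type smoother with a location-varying penalty stands in here for the paper's smoothers.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 60
x = rng.normal(size=(n, 3))                        # scalar predictors
X = np.column_stack([np.ones(n), x])               # design matrix

def pointwise_df(X, lam):
    # H(s) = X (X'X + lam(s) I)^{-1} X'; its trace is the pointwise df
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T)
    return np.trace(H)

lams = np.linspace(0.0, 5.0, 40)                   # location-varying penalty lam(s)
print([round(pointwise_df(X, l), 2) for l in lams[:5]])
# at lam = 0 this reduces to the usual linear-model hat matrix, whose
# trace equals the number of columns of X (here 4)
```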


Penalized Nonparametric Scalar-On-Function Regression Via Principal Coordinates, Philip T. Reiss, David L. Miller, Pei-Shien Wu, Wen-Yu Hua 2016 New York University School of Medicine

Philip T. Reiss

A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. The core idea is to regress the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, the proposed ...
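
A minimal sketch of the core recipe under simple assumptions (L2 distance between curves, a fixed ridge penalty, five principal coordinates): compute distances among the functional predictors, extract leading principal coordinates by classical multidimensional scaling, and ridge-regress the response on them. The paper's implementation instead builds on generalized additive modeling software with automatic tuning.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 80
grid = np.linspace(0, 1, 101)
curves = np.sin(2 * np.pi * np.outer(rng.uniform(0.5, 2, n), grid))
curves += rng.normal(0, 0.1, curves.shape)          # functional predictors
y = curves[:, 50] + rng.normal(0, 0.2, n)           # toy scalar response

# squared L2 distances, then principal coordinates via double centering
D2 = ((curves[:, None, :] - curves[None, :, :]) ** 2).mean(-1)
J = np.eye(n) - np.ones((n, n)) / n
vals, vecs = np.linalg.eigh(-0.5 * J @ D2 @ J)
Z = vecs[:, ::-1][:, :5] * np.sqrt(np.abs(vals[::-1][:5]))  # 5 leading coords

lam = 1.0                                           # ridge penalty, fixed here
beta = np.linalg.solve(Z.T @ Z + lam * np.eye(5), Z.T @ (y - y.mean()))
print(beta)                                         # coefficients on coordinates
```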


Improving Power In Group Sequential, Randomized Trials By Adjusting For Prognostic Baseline Variables And Short-Term Outcomes, Tianchen Qian, Michael Rosenblum, Huitong Qiu 2016 Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health

Johns Hopkins University, Dept. of Biostatistics Working Papers

In group sequential designs, adjusting for baseline variables and short-term outcomes can lead to increased power and reduced sample size. We derive formulas for the precision gain from such variable adjustment using semiparametric estimators for the average treatment effect, and give new results on what conditions lead to substantial power gains and sample size reductions. The formulas reveal how the impact of prognostic variables on the precision gain is modified by the number of pipeline participants, analysis timing, enrollment rate, and treatment effect heterogeneity, when the semiparametric estimator uses correctly specified models. Given set prognostic value of baseline variables and ...


Stochastic Optimization Of Adaptive Enrichment Designs For Two Subpopulations, Aaron Fisher, Michael Rosenblum 2016 Harvard T.H. Chan School of Public Health

Johns Hopkins University, Dept. of Biostatistics Working Papers

An adaptive enrichment design is a randomized trial that allows enrollment criteria to be modified at interim analyses, based on a preset decision rule. When there is prior uncertainty regarding treatment effect heterogeneity, these trial designs can provide improved power for detecting treatment effects in subpopulations. We present a simulated annealing approach to search over the space of decision rules and other parameters for an adaptive enrichment design. The goal is to minimize the expected number enrolled or expected duration, while preserving the appropriate power and Type I error rate. We also explore the benefits of parallel computation in the ...
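
A minimal sketch of simulated annealing over a single design parameter (an interim efficacy threshold), minimizing a penalized objective of expected enrollment plus a penalty when power falls below target. The objective here is an analytic stand-in; in the paper it would be estimated by simulating the trial, and the search space of decision rules is far richer.

```python
import numpy as np

rng = np.random.default_rng(9)

def objective(c):
    # analytic stand-in: a stricter threshold c lowers expected
    # enrollment but also lowers power; penalize power below 80%
    power = 1 / (1 + np.exp(4 * (c - 1.0)))
    expected_n = 500 - 150 / (1 + np.exp(-3 * (c - 1.5)))
    return expected_n + 1e4 * max(0.0, 0.8 - power)

c, best = 1.0, 1.0
for i in range(2000):
    temp = 0.995 ** i                               # geometric cooling schedule
    cand = c + rng.normal(0, 0.1)                   # local random move
    delta = objective(cand) - objective(c)
    if delta < 0 or rng.uniform() < np.exp(-delta / temp):
        c = cand                                    # accept downhill always, uphill sometimes
    if objective(c) < objective(best):
        best = c
print(best, objective(best))
```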


Rao-Lovric And The Triwizard Point Null Hypothesis Tournament, Shlomo Sawilowsky 2016 Wayne State University

Journal of Modern Applied Statistical Methods

The debate over whether the point null hypothesis is ever literally true cannot be resolved, because three competing statistical systems claim ownership of the construct. The local resolution depends on personal acclimatization to a Fisherian, Frequentist, or Bayesian orientation (or an unexpected fourth champion if decision theory is allowed to compete). Implications of Rao and Lovric’s proposed Hodges-Lehmann paradigm are discussed in the Appendix.


Censoring Unbiased Regression Trees And Ensembles, Jon Arni Steingrimsson, Liqun Diao, Robert L. Strawderman 2016 Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health

Johns Hopkins University, Dept. of Biostatistics Working Papers

This paper proposes a novel approach to building regression trees and ensemble learning in survival analysis. By first extending the theory of censoring unbiased transformations, we construct observed data estimators of full data loss functions in cases where responses can be right censored. This theory is used to construct two specific classes of methods for building regression trees and regression ensembles that respectively make use of Buckley-James and doubly robust estimating equations for a given full data risk function. For the particular case of squared error loss, we further show how to implement these algorithms using existing software (e.g ...


Matching The Efficiency Gains Of The Logistic Regression Estimator While Avoiding Its Interpretability Problems, In Randomized Trials, Michael Rosenblum, Jon Arni Steingrimsson 2016 Johns Hopkins Bloomberg School of Public Health, Department of Biostatistics

Johns Hopkins University, Dept. of Biostatistics Working Papers

Adjusting for prognostic baseline variables can lead to improved power in randomized trials. For binary outcomes, a logistic regression estimator is commonly used for such adjustment. This has resulted in substantial efficiency gains in practice, e.g., gains equivalent to reducing the required sample size by 20-28% were observed in a recent survey of traumatic brain injury trials. Robinson and Jewell (1991) proved that the logistic regression estimator is guaranteed to have equal or better asymptotic efficiency compared to the unadjusted estimator (which ignores baseline variables). Unfortunately, the logistic regression estimator has the following dangerous vulnerabilities: it is only interpretable ...
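
One standard way to use a logistic working model while retaining a marginal, interpretable estimand is standardization (g-computation): fit the model, then average predicted risks with treatment set to 1 and to 0 for everyone, and report the risk difference. A minimal sketch on simulated data; this illustrates the general technique, not necessarily the estimator the paper proposes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
n = 2000
x = rng.normal(size=(n, 2))                        # prognostic baseline variables
t = rng.binomial(1, 0.5, n)                        # randomized treatment
p = 1 / (1 + np.exp(-(-0.5 + 0.8 * t + x @ [1.0, 0.5])))
y = rng.binomial(1, p)                             # binary outcome

Xd = np.column_stack([t, x])
fit = LogisticRegression(C=1e6).fit(Xd, y)         # near-unpenalized fit
r1 = fit.predict_proba(np.column_stack([np.ones(n), x]))[:, 1].mean()
r0 = fit.predict_proba(np.column_stack([np.zeros(n), x]))[:, 1].mean()
# marginal risk difference vs. the unadjusted difference in proportions
print(r1 - r0, y[t == 1].mean() - y[t == 0].mean())
```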


A Synthesis Of Current Surveillance Planning Methods For The Sequential Monitoring Of Drug And Vaccine Adverse Effects Using Electronic Health Care Data, Jennifer C. Nelson, Robert Wellman, Onchee Yu, Andrea J. Cook, Judith C. Maro, Rita Ouellet-Hellstrom, Denise Boudreau, James S. Floyd, Susan R. Heckbert, Simone Pinheiro, Marsha Reichman, Azadeh Shoaibi 2016 Group Health Research Institute; University of Washington

eGEMs (Generating Evidence & Methods to improve patient outcomes)

Introduction: The large-scale assembly of electronic health care data combined with the use of sequential monitoring has made proactive postmarket drug- and vaccine-safety surveillance possible. Although sequential designs have been used extensively in randomized trials, less attention has been given to methods for applying them in observational electronic health care database settings.

Existing Methods: We review current sequential-surveillance planning methods from randomized trials, and the Vaccine Safety Datalink (VSD) and Mini-Sentinel Pilot projects—two national observational electronic health care database safety monitoring programs.

Future Surveillance Planning: Based on this examination, we suggest three steps for future surveillance planning in health ...


Advances In Portmanteau Diagnostic Tests, Jinkun Xiao 2016 The University of Western Ontario

Electronic Thesis and Dissertation Repository

The portmanteau test plays an important role in model diagnostics for Box-Jenkins modelling procedures. A large number of portmanteau tests based on the autocorrelation function have been proposed as general-purpose goodness-of-fit tests. Since the asymptotic distributions of these statistics have a complicated form that makes it hard to obtain p-values directly, a gamma approximation is used to obtain the p-value. But the approximation inevitably introduces approximation error and requires a large number of observations to yield a good approximation. To avoid some pitfalls in the approximation, the Lin-McLeod test is further proposed to obtain a numeric solution to ...
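
A minimal sketch of an autocorrelation-based portmanteau statistic (the Ljung-Box form) applied to residuals of a least-squares AR(1) fit, with the usual chi-squared reference; the gamma approximation and Lin-McLeod numeric approach discussed in the thesis address settings where this asymptotic reference is poor.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
y = np.zeros(500)                                   # simulate an AR(1) series
for t in range(1, 500):
    y[t] = 0.6 * y[t - 1] + rng.normal()
phi = (y[1:] @ y[:-1]) / (y[:-1] @ y[:-1])          # least-squares AR(1) fit
resid = y[1:] - phi * y[:-1]

n, m = len(resid), 10                               # test the first m lags
r = np.array([np.corrcoef(resid[k:], resid[:-k])[0, 1] for k in range(1, m + 1)])
Q = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1)))
print(Q, stats.chi2.sf(Q, df=m - 1))                # df lose 1 for the AR parameter
```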


Digital Commons powered by bepress