Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Statistical Methodology

2013

Articles 1 - 30 of 35

Full-Text Articles in Entire DC Network

Simulating Bipartite Networks To Reflect Uncertainty In Local Network Properties, Ravi Goyal, Joseph Blitzstein, Victor De Gruttola Dec 2013

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Guide To Testing A Proportion When There May Be Misclassifications, David L. Farnsworth, Jonathan R. Bradley Dec 2013

Articles

Ignoring possible misclassifications when testing a proportion can lead to erroneous decisions. A statistical test is described that incorporates misclassification rates into the analysis. Easily checked safeguards that ensure the test is appropriate are given. Additionally, the test provides a procedure for the case in which the hypothesis stipulates that the proportion is zero. Applications of the test are illustrated with examples that show it is practical. Comprehensive guidance is supplied for the practitioner.


Polynomially Adjusted Saddlepoint Density Approximations, Susan Zhe Sheng Nov 2013

Electronic Thesis and Dissertation Repository

This thesis aims at obtaining improved bona fide density estimates and approximants by means of adjustments applied to the widely used saddlepoint approximation. Said adjustments are determined by solving systems of equations resulting from a moment-matching argument. A hybrid density approximant that relies on the accuracy of the saddlepoint approximation in the distributional tails is introduced as well. A certain representation of noncentral indefinite quadratic forms leads to an initial approximation whose parameters are evaluated by simultaneously solving four equations involving the cumulants of the target distribution. A saddlepoint approximation to the distribution of quadratic forms is also discussed. By …


Joint Estimation Of Multiple Graphical Models From High Dimensional Time Series, Huitong Qiu, Fang Han, Han Liu, Brian Caffo Nov 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this manuscript the problem of jointly estimating multiple graphical models in high dimensions is considered. It is assumed that the data are collected from n subjects, each of which consists of m non-independent observations. The graphical models of subjects vary, but are assumed to change smoothly corresponding to a measure of the closeness between subjects. A kernel based method for jointly estimating all graphical models is proposed. Theoretically, under a double asymptotic framework, where both (m,n) and the dimension d can increase, the explicit rate of convergence in parameter estimation is provided, thus characterizing the strength one can borrow …


Hierarchical Vector Auto-Regressive Models And Their Applications To Multi-Subject Effective Connectivity, Cristina Gorrostieta, Mark Fiecas, Hernando Ombao, Erin Burke, Steven Cramer Oct 2013

Mark Fiecas

Vector auto-regressive (VAR) models typically form the basis for constructing directed graphical models for investigating connectivity in a brain network with brain regions of interest (ROIs) as nodes. The standard VAR model has limitations, however. The number of parameters increases quadratically with the number of ROIs and linearly with the order of the model; this large number of parameters can pose serious estimation problems. Moreover, when applied to imaging data, the standard VAR model does not account for variability in the connectivity structure across subjects. In this paper, …


Adapting Data Adaptive Methods For Small, But High Dimensional Omic Data: Applications To Gwas/Ewas And More, Sara Kherad Pajouh, Alan E. Hubbard, Martyn T. Smith Oct 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Exploratory analysis of high-dimensional "omics" data has received much attention since the explosion of high-throughput technology allows simultaneous screening of tens of thousands of characteristics (genomics, metabolomics, proteomics, adducts, etc.). Part of this trend has been an increase in the dimension of exposure data in studies of environmental exposure and associated biomarkers. Though some of the general approaches, such as GWAS, are transferable, what has received less focus is 1) how to estimate independent associations in the context of many competing causes without resorting to a misspecified model, and 2) how to derive accurate small-sample inference …


An L-Moment Based Characterization Of The Family Of Dagum Distributions, Mohan D. Pant, Todd C. Headrick Sep 2013

Mohan Dev Pant

This paper introduces a method for simulating univariate and multivariate Dagum distributions through the method of L-moments and L-correlation. A method is developed for characterizing non-normal Dagum distributions with controlled degrees of L-skew, L-kurtosis, and L-correlations. The procedure can be applied in a variety of contexts such as statistical modeling (e.g., income distributions, personal wealth distributions, etc.) and Monte Carlo or simulation studies. Numerical examples are provided to demonstrate that L-moment-based Dagum distributions are superior to their conventional moment-based analogs in terms of estimation and distribution fitting. Evaluation of the proposed method also demonstrates that the estimates of L-skew, L-kurtosis, …
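For readers unfamiliar with the L-moment machinery this abstract relies on, the sample L-moments themselves can be computed from order statistics with the standard unbiased estimators. The sketch below is a generic plain-Python illustration of those textbook estimators, not the authors' fitting procedure.

```python
from math import comb

def sample_l_moments(data):
    """Estimate the first two sample L-moments plus L-skew and
    L-kurtosis via probability-weighted moments b_r.

    b_r = (1/n) * sum_j [C(j-1, r) / C(n-1, r)] * x_(j), with
    x_(1) <= ... <= x_(n) the order statistics.
    """
    x = sorted(data)
    n = len(x)
    # With 0-based index i, x[i] is the (i+1)-th order statistic,
    # so the weight C(j-1, r) becomes comb(i, r).
    b = [sum(comb(i, r) * x[i] for i in range(n)) / (n * comb(n - 1, r))
         for r in range(4)]
    l1 = b[0]                                  # L-location (mean)
    l2 = 2 * b[1] - b[0]                       # L-scale
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3 / l2, l4 / l2            # mean, L-scale, L-skew, L-kurt
```

For a symmetric sample the returned L-skew is zero, which is one quick sanity check on the weights.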


Using The Inverse Transform To Specify Contrasts In Regression And Latent Curve Structural Equation Models, Thomas N. Templin Aug 2013

Nursing Faculty Research Publications

A simple yet general method for specifying contrasts to test hypotheses in regression and latent curve structural equation models is presented. The traditional qualitative-variable coding schemes used in multiple regression (e.g., dummy coding) have a more general formulation. Five matrices are involved: the coding scheme A; the matrix W, which gives the distribution and ordering of cases; the design matrix X, where WA = X; the contrast coefficient matrix C; and its inverse, with C^-1 = A. In practice, only C, C^-1, and A are necessary because the statistical software generates the design matrix. This method has great generality …
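The matrix relations stated in the abstract (A = C^-1, X = WA, and regression coefficients equal to the chosen contrasts of group means) can be checked numerically. The example below is a hypothetical three-group illustration of those identities, not the paper's own code; the contrast rows and group means are made up.

```python
import numpy as np

# Contrast coefficient matrix C: each row is a contrast over the 3 group means.
C = np.array([
    [1 / 3, 1 / 3, 1 / 3],   # grand mean
    [1.0,  -1.0,   0.0],     # group 1 vs group 2
    [0.0,   1.0,  -1.0],     # group 2 vs group 3
])
A = np.linalg.inv(C)         # coding scheme: A = C^-1

# W: one row per case, a 1 in the column of that case's group.
groups = [0, 0, 1, 1, 2, 2]            # two cases per group (hypothetical)
W = np.eye(3)[groups]
X = W @ A                               # design matrix: X = WA

# Regressing y on X yields beta = C @ mu, i.e. the chosen contrasts
# of the group means.
mu = np.array([10.0, 12.0, 15.0])       # hypothetical group means
y = W @ mu                              # noise-free responses for clarity
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because y lies exactly in the column space of X here, `beta` recovers the contrast values C @ mu exactly; with noisy data it would estimate them.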


The Latest Data-Editing Practices Abroad: An Examination of a Multivariate Outlier Detection Method Based on a Mixture Normal Distribution Model (Masayoshi Takahashi, Selective Editing), Masayoshi Takahashi Aug 2013

Masayoshi Takahashi

No abstract provided.


Testing For A Zero Proportion, Jonathan R. Bradley, David L. Farnsworth Aug 2013

Articles

Tests for a proportion that may be zero are described. The setting is an environment in which there can be misclassifications or misdiagnoses, giving the possibility of nonzero counts from false positives even though no real examples may exist. Both frequentist and Bayesian tests and analyses are presented, and examples are given.
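The frequentist idea the abstract describes can be sketched generically: if the true proportion is zero, every observed positive is a false positive, so under H0 the positive count is Binomial(n, fp_rate) and a one-sided exact test applies. The function below is an illustrative sketch under that assumption, not the authors' exact procedure, and `fp_rate` is assumed known.

```python
from math import comb

def zero_proportion_pvalue(k, n, fp_rate):
    """One-sided p-value for H0: true proportion is zero.

    Under H0, each of the n classifications is a false positive with
    probability fp_rate, so the positive count K ~ Binomial(n, fp_rate)
    and the p-value is P(K >= k).
    """
    return sum(comb(n, i) * fp_rate**i * (1 - fp_rate)**(n - i)
               for i in range(k, n + 1))
```

A small p-value means the observed positives exceed what false positives alone would plausibly produce, i.e. evidence that real cases exist.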


Bayesian Hierarchical Modeling With 3pno Item Response Models, Yanyan Sheng, Todd Christopher Headrick Jul 2013

Todd Christopher Headrick

Fully Bayesian estimation has been developed for unidimensional IRT models. In this context, prior distributions can be specified in a hierarchical manner so that item hyperparameters are unknown and yet still have their own priors. This type of hierarchical modeling is useful for the three-parameter IRT model, as it reduces the difficulty of specifying model hyperparameters that lead to adequate prior distributions. Further, hierarchical modeling ameliorates the nonconvergence problem associated with nonhierarchical models when appropriate prior information is not available. As such, a Fortran subroutine is provided to implement a hierarchical modeling procedure associated with the three-parameter normal …


Fast Covariance Estimation For High-Dimensional Functional Data, Luo Xiao, David Ruppert, Vadim Zipunnikov, Ciprian Crainiceanu Jun 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

For smoothing covariance functions, we propose two fast algorithms that scale linearly with the number of observations per function. Most available methods and software cannot smooth covariance matrices of dimension J x J with J > 500; the recently introduced sandwich smoother is an exception, but it is not adapted to smoothing covariance matrices of large dimensions such as J ≥ 10,000. Covariance matrices of order J = 10,000, and even J = 100,000, are becoming increasingly common, e.g., in 2- and 3-dimensional medical imaging and high-density wearable sensor data. We introduce two new algorithms that can handle very large covariance matrices: 1) FACE: a …


Trial Designs That Simultaneously Optimize The Population Enrolled And The Treatment Allocation Probabilities, Brandon S. Luber, Michael Rosenblum, Antoine Chambaz Jun 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

Standard randomized trials may have lower than desired power when the treatment effect is only strong in certain subpopulations. This may occur, for example, in populations with varying disease severities or when subpopulations carry distinct biomarkers and only those who are biomarker positive respond to treatment. To address such situations, we develop a new trial design that combines two types of preplanned rules for updating how the trial is conducted based on data accrued during the trial. The aim is a design with greater overall power and that can better determine subpopulation specific treatment effects, while maintaining strong control of …


Death Certificate Completion Skills Of Hospital Physicians In A Developing Country, Ahmed Suleman Haque, Kanza Shamim, Najm Hasan Siddiqui, Muhammad Irfan, Javaid Ahmed Khan Jun 2013

Section of Pulmonary & Critical Care

Background

Death certificates (DC) can provide valuable health-status data regarding disease incidence, prevalence, and mortality in a community. They can guide local health policy and help in setting priorities. Incomplete and inaccurate DC data, on the other hand, can significantly impair the precision of a national health information database. In this study we evaluated the accuracy of death certificates at a tertiary care teaching hospital in Karachi, Pakistan.

Methods

A retrospective study was conducted at Aga Khan University Hospital, Karachi, Pakistan, over a period of six months. Medical records and death certificates of all patients who died under adult …


Ensemble-Based Methods For Forecasting Census In Hospital Units, Devin C. Koestler, Hernando Ombao, Jesse Bender May 2013

Dartmouth Scholarship

The ability to accurately forecast census counts in hospital departments has considerable implications for hospital resource allocation. In recent years several different methods have been proposed for forecasting census counts; however, many of these approaches do not use available patient-specific information. In this paper we present an ensemble-based methodology for forecasting the census under a framework that simultaneously incorporates both (i) arrival trends over time and (ii) patient-specific baseline and time-varying information. The proposed model for predicting census has three components, namely: current census count, number of daily arrivals and number of daily departures. To model the number of daily arrivals, …
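The three-component structure named in the abstract reduces to a simple accounting identity: tomorrow's census is today's census plus arrivals minus departures. The sketch below rolls that identity forward from supplied arrival and departure forecasts; the forecast values are hypothetical placeholders for the model-based forecasts the paper builds.

```python
def forecast_census(current_census, arrival_forecasts, departure_forecasts):
    """Roll the census identity forward over the forecast horizon:

        census[t+1] = census[t] + arrivals[t+1] - departures[t+1]

    The arrival/departure streams are assumed to come from separate
    forecasting models (as in the paper); here they are plain lists.
    """
    census, path = current_census, []
    for a, d in zip(arrival_forecasts, departure_forecasts):
        census = census + a - d
        path.append(census)
    return path
```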


Targeted Maximum Likelihood Estimation For Dynamic And Static Longitudinal Marginal Structural Working Models, Maya L. Petersen, Joshua Schwab, Susan Gruber, Nello Blaser, Michael Schomaker, Mark J. Van Der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

This paper describes a targeted maximum likelihood estimator (TMLE) for the parameters of longitudinal static and dynamic marginal structural models. We consider a longitudinal data structure consisting of baseline covariates, time-dependent intervention nodes, intermediate time-dependent covariates, and a possibly time dependent outcome. The intervention nodes at each time point can include a binary treatment as well as a right-censoring indicator. Given a class of dynamic or static interventions, a marginal structural model is used to model the mean of the intervention specific counterfactual outcome as a function of the intervention, time point, and possibly a subset of baseline covariates. Because …


Balancing Score Adjusted Targeted Minimum Loss-Based Estimation, Samuel D. Lendle, Bruce Fireman, Mark J. Van Der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Adjusting for a balancing score is sufficient for bias reduction when estimating causal effects including the average treatment effect and effect among the treated. Estimators that adjust for the propensity score in a nonparametric way, such as matching on an estimate of the propensity score, can be consistent when the estimated propensity score is not consistent for the true propensity score but converges to some other balancing score. We call this property the balancing score property, and discuss a class of estimators that have this property. We introduce a targeted minimum loss-based estimator (TMLE) for a treatment specific mean with …


Optimal Tests Of Treatment Effects For The Overall Population And Two Subpopulations In Randomized Trials, Using Sparse Linear Programming, Michael Rosenblum, Han Liu, En-Hsu Yen May 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

We propose new, optimal methods for analyzing randomized trials, when it is suspected that treatment effects may differ in two predefined subpopulations. Such sub-populations could be defined by a biomarker or risk factor measured at baseline. The goal is to simultaneously learn which subpopulations benefit from an experimental treatment, while providing strong control of the familywise Type I error rate. We formalize this as a multiple testing problem and show it is computationally infeasible to solve using existing techniques. Our solution involves a novel approach, in which we first transform the original multiple testing problem into a large, sparse linear …


Seasonal Decomposition For Geographical Time Series Using Nonparametric Regression, Hyukjun Gweon Apr 2013

Electronic Thesis and Dissertation Repository

A time series often contains various systematic effects such as trends and seasonality. These different components can be determined and separated by decomposition methods. In this thesis, we discuss a time series decomposition process using nonparametric regression. A method based on both loess and harmonic regression is suggested, and an optimal model selection method is discussed. We then compare the process with seasonal-trend decomposition by loess (STL; Cleveland, 1979). While STL works well when proper parameters are used, the method we introduce is also competitive: it makes parameter choice more automatic and less complex. The decomposition process often requires that …
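The harmonic-regression ingredient mentioned in the abstract amounts to fitting sine and cosine terms at the seasonal frequency by least squares. The sketch below illustrates just that piece on synthetic monthly-style data; the loess trend step and the automatic parameter selection from the thesis are omitted, and all series values are made up.

```python
import numpy as np

period = 12                      # assumed seasonal period (e.g., monthly data)
t = np.arange(120, dtype=float)
rng = np.random.default_rng(0)
# Synthetic series: linear trend + one seasonal harmonic + noise.
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / period) + rng.normal(0, 0.1, t.size)

# Design matrix: intercept, linear trend, first harmonic pair.
X = np.column_stack([
    np.ones_like(t),
    t,
    np.sin(2 * np.pi * t / period),
    np.cos(2 * np.pi * t / period),
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
seasonal = X[:, 2:] @ coef[2:]   # fitted seasonal component
trend = X[:, :2] @ coef[:2]      # fitted trend component
```

Adding more harmonic pairs (frequencies 2/period, 3/period, …) sharpens the seasonal shape; choosing how many to include is exactly the kind of parameter selection the thesis automates.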


A New Diagnostic Test For Regression, Yun Shi Apr 2013

Electronic Thesis and Dissertation Repository

A new diagnostic test for regression and generalized linear models is discussed. The test is based on checking whether residuals that are close together in the linear space of one of the covariates are correlated. This is a generalization of the famous problem of spurious correlation in time series regression. A full model-building approach for the case of regression was developed in Mahdi (2011, Ph.D. Thesis, Western University, "Diagnostic Checking, Time Series and Regression") using an iterative generalized least squares algorithm. Simulation experiments were reported that demonstrate the validity and utility of this approach, but no actual applications were …


A Method For Simulating Burr Type Iii And Type Xii Distributions Through L-Moments And L-Correlations, Mohan D. Pant, Todd C. Headrick Mar 2013

Mohan Dev Pant

This paper derives the Burr Type III and Type XII family of distributions in the contexts of univariate L-moments and the L-correlations. Included is the development of a procedure for specifying nonnormal distributions with controlled degrees of L-skew, L-kurtosis, and L-correlations. The procedure can be applied in a variety of settings such as statistical modeling (e.g., forestry, fracture roughness, life testing, operational risk, etc.) and Monte Carlo or simulation studies. Numerical examples are provided to demonstrate that L-moment-based Burr distributions are superior to their conventional moment-based analogs in terms of estimation and distribution fitting. Evaluation of the proposed procedure also …


A Bayesian Regression Tree Approach To Identify The Effect Of Nanoparticles Properties On Toxicity Profiles, Cecile Low-Kam, Haiyuan Zhang, Zhaoxia Ji, Tian Xia, Jeffrey I. Zinc, Andre Nel, Donatello Telesca Mar 2013

COBRA Preprint Series

We introduce a Bayesian multiple regression tree model to characterize relationships between physico-chemical properties of nanoparticles and their in-vitro toxicity over multiple doses and times of exposure. Unlike conventional models that rely on data summaries, our model solves the low sample size issue and avoids arbitrary loss of information by combining all measurements from a general exposure experiment across doses, times of exposure, and replicates. The proposed technique integrates Bayesian trees for modeling threshold effects and interactions, and penalized B-splines for smoothing dose- and time-response surfaces. The resulting posterior distribution is sampled via a Markov Chain Monte Carlo algorithm. This …


Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi Jan 2013

Jeffrey S. Morris

Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.

Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.

Design: A single-arm, phase II trial.

Patients: Twenty-seven patients with FAP.

Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.

Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were …


An L-Moment Based Characterization Of The Family Of Dagum Distributions, Mohan D. Pant, Todd C. Headrick Jan 2013

Todd Christopher Headrick

This paper introduces a method for simulating univariate and multivariate Dagum distributions through the method of L-moments and L-correlations. A method is developed for characterizing non-normal Dagum distributions with controlled degrees of L-skew, L-kurtosis, and L-correlations. The procedure can be applied in a variety of contexts such as statistical modeling (e.g., income distributions, personal wealth distributions, etc.) and Monte Carlo or simulation studies. Numerical examples are provided to demonstrate that L-moment-based Dagum distributions are superior to their conventional moment-based analogs in terms of estimation and distribution fitting. Evaluation of the proposed method also demonstrates that the estimates of L-skew, L-kurtosis, …


Analysis Of Spatial Data, Xiang Zhang Jan 2013

Theses and Dissertations--Statistics

In many areas of the agricultural, biological, physical, and social sciences, spatial lattice data are becoming increasingly common. In addition, a large amount of lattice data shows not only visible spatial patterns but also temporal patterns (see Zhu et al. 2005). An interesting problem is to develop a model that systematically relates the response variable to possible explanatory variables while accounting for space and time effects simultaneously.

Spatial-temporal linear model and the corresponding likelihood-based statistical inference are important tools for the analysis of spatial-temporal lattice data. We propose a general asymptotic framework for spatial-temporal linear models and …


Income Inequality Measures And Statistical Properties Of Weighted Burr-Type And Related Distributions, Meznah R. Al Buqami Jan 2013

Electronic Theses and Dissertations

In this thesis, the tail conditional expectation (TCE) in risk analysis, an important measure of right-tail risk, is presented. This value is generally based on the quantile of the loss distribution. Explicit formulas for several tail conditional expectations and inequality measures for Dagum-type models are derived. In addition, a new class of weighted Burr-III (WBIII) distributions is presented. The statistical properties of this distribution, including the hazard and reverse hazard functions, moments, coefficient of variation, skewness, kurtosis, inequality measures, and entropy, are derived. Also, Fisher information and maximum likelihood estimates of the model parameters are obtained.


On The Exact Size Of Multiple Comparison Tests, Chris Lloyd Dec 2012

Chris J. Lloyd

No abstract provided.


Instrumental Variable Analyses: Exploiting Natural Randomness To Understand Causal Mechanisms, Theodore Iwashyna, Edward Kennedy Dec 2012

Edward H. Kennedy

Instrumental variable analysis is a technique commonly used in the social sciences to provide evidence that a treatment causes an outcome, as contrasted with evidence that a treatment is merely associated with differences in an outcome. To extract such strong evidence from observational data, instrumental variable analysis exploits situations where some degree of randomness affects how patients are selected for a treatment. An instrumental variable is a characteristic of the world that leads some people to be more likely to get the specific treatment we want to study but does not otherwise change those patients' outcomes. This seminar explains, in nonmathematical …


Theory And Methods For Gini Coefficients Partitioned By Quantile Range, Chaitra Nagaraja Dec 2012

Chaitra H Nagaraja

The Gini coefficient is frequently used to measure inequality in populations. However, it is possible that inequality levels may change over time differently for disparate subgroups which cannot be detected with population-level estimates only. Therefore, it may be informative to examine inequality separately for these segments. The case where the population is split into two segments based on non-overlapping quantile ranges is examined. Asymptotic theory is derived and practical methods to estimate standard errors and construct confidence intervals using resampling methods are developed. An application to per capita income across census tracts using American Community Survey data is considered.
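As background for the abstract above, the standard population-level Gini coefficient it builds on can be computed directly from sorted data. The sketch below implements that common order-statistic formula only; the paper's quantile-range partitioning and resampling-based inference are not reproduced.

```python
def gini(values):
    """Sample Gini coefficient via the standard order-statistic formula

        G = 2 * sum_i(i * x_(i)) / (n * sum(x)) - (n + 1) / n,

    with x sorted ascending and i = 1..n. Assumes nonnegative values
    with a positive total (e.g., incomes).
    """
    x = sorted(values)
    n = len(x)
    total = sum(x)
    weighted = sum(i * xi for i, xi in enumerate(x, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

A perfectly equal sample gives G = 0, and concentrating all income in one of two people gives G = 0.5, the maximum for n = 2 under this estimator.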


A Comparison Of Periodic Autoregressive And Dynamic Factor Models In Intraday Energy Demand Forecasting, Thomas Mestekemper, Goeran Kauermann, Michael Smith Dec 2012

Michael Stanley Smith

We suggest a new approach for forecasting energy demand at an intraday resolution. Demand in each intraday period is modeled using semiparametric regression smoothing to account for calendar and weather components. Residual serial dependence is captured by one of two multivariate stationary time series models, with dimension equal to the number of intraday periods. These are a periodic autoregression and a dynamic factor model. We show the benefits of our approach in the forecasting of district heating demand in a steam network in Germany and aggregate electricity demand in the state of Victoria, Australia. In both studies, accounting for weather …