Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

SelectedWorks

Articles 1 - 30 of 495

Full-Text Articles in Physical Sciences and Mathematics

Inventing Around Edison’s Lamp Patent: The Role Of Patents In Stimulating Downstream Development And Competition, Ron D. Katznelson, John Howells Feb 2018

Ron D. Katznelson

We provide the first detailed empirical study of inventing around patent claims. The enforcement of Edison’s incandescent lamp patent in 1891-1894 stimulated a surge of patenting. Most of these later patents disclosed inventions around the Edison patent. Some of these patents introduced important new technology in their own right and became prior art for new fields, indicating that invention around patents contributes to dynamic efficiency. Contrary to widespread contemporary understanding, the Edison lamp patent did not suppress technological advance in electric lighting. The market position of General Electric (“GE”), the Edison patent-owner, weakened through the period of this patent’s enforcement.


Mdc-R-Code 2016 Update, Joseph M. Hilbe Sep 2016

Joseph M Hilbe

Modeling Count Data: R code for download and use. Most recent update


Addition To Pglr Chap 6, Joseph M. Hilbe Aug 2016

Joseph M Hilbe

Addition to Chapter 6 in Practical Guide to Logistic Regression. Added section on Bayesian logistic regression using Stata.


Testing Homogeneity In Semiparametric Mixture Case-Control Models, C Z. Di, G Kc Chan, C Zheng, Ky Liang Jun 2016

Chongzhi Di

Recently, Qin and Liang (Biometrics, 2011) considered a semiparametric mixture case-control model and proposed a score test for homogeneity. The mixture model is semiparametric in the sense that the density ratio of two distributions is assumed to be of exponential form, while the baseline density is unspecified. In a family of parametric admixture models, Di and Liang (Biometrics, 2011) showed that the likelihood ratio test statistic, which is equivalent to a supremum statistic, could improve power over score tests. We generalize the likelihood ratio or supremum statistic to the semiparametric mixture model and demonstrate the power gain over the score …


Hilbe-Pglr-Errata-And-Comments, Joseph M. Hilbe Mar 2016

Joseph M Hilbe

Errata and Comments for Practical Guide to Logistic Regression


Online Variational Bayes Inference For High-Dimensional Correlated Data, Sylvie T. Kabisa, Jeffrey S. Morris, David Dunson Jan 2016

Jeffrey S. Morris

High-dimensional data with hundreds of thousands of observations are becoming commonplace in many disciplines. The analysis of such data poses many computational challenges, especially when the observations are correlated over time and/or across space. In this paper we propose flexible hierarchical regression models for analyzing such data that accommodate serial and/or spatial correlation. We address the computational challenges involved in fitting these models by adopting an approximate inference framework. We develop an online variational Bayes algorithm that works by incrementally reading the data into memory one portion at a time. The performance of the method is assessed through simulation studies. …


Functional CAR Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris Jan 2016

Jeffrey S. Morris

We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on …


Hypothesis Testing For Functional Linear Models, Y-R Su, Cz Di, L Hsu Jan 2016

Chongzhi Di

Functional data arise frequently in many biomedical studies, where it is often of interest to investigate the dynamic association between functional predictors and a scalar response variable. While functional linear models (FLM) are widely used to address these questions, hypothesis testing for the functional association in the FLM framework remains challenging. A popular approach to testing the functional effects is through dimension reduction by functional principal component (PC) analysis. However, its power performance depends on the choice of the number of PCs, and has not been systematically studied. In this paper, we first investigate the power performance of the Wald-type test …


Multiple Ratio Imputation By A Novel Application Of The EMB Algorithm (Masayoshi Takahashi), Masayoshi Takahashi Sep 2015

Masayoshi Takahashi

No abstract provided.


Pglr-SAS Data, Joseph M. Hilbe Jul 2015

Joseph M Hilbe

SAS data files for Practical Guide to Logistic Regression


R Code For Practical Guide To Logistic Regression, Joseph M. Hilbe Jul 2015

Joseph M Hilbe

R code for Practical Guide to Logistic Regression


Pglr-Stata Data Files, Joseph M. Hilbe Jul 2015

Joseph M Hilbe

Stata data files for Practical Guide to Logistic Regression


SAS Code Only For Practical Guide To Logistic Regression, Joseph M. Hilbe Jul 2015

Joseph M Hilbe

SAS code-only for Practical Guide to Logistic Regression


SAS Code & Output For Practical Guide To Logistic Regression, Joseph M. Hilbe Jul 2015

Joseph M Hilbe

SAS code for Practical Guide to Logistic Regression


Negative Binomial Regression, 2nd Ed, 2nd Print, Errata And Comments, Joseph Hilbe Jan 2015

Joseph M Hilbe

Errata and Comments for 2nd printing of NBR2, 2nd edition. Previous errata from first printing all corrected. Some added and new text as well.


Bayesian Function-On-Function Regression For Multi-Level Functional Data, Mark J. Meyer, Brent A. Coull, Francesco Versace, Paul Cinciripini, Jeffrey S. Morris Jan 2015

Jeffrey S. Morris

Medical and public health research increasingly involves the collection of more and more complex and high dimensional data. In particular, functional data, where the unit of observation is a curve or set of curves that are finely sampled over a grid, are frequently obtained. Moreover, researchers often sample multiple curves per person resulting in repeated functional measures. A common question is how to analyze the relationship between two functional variables. We propose a general function-on-function regression model for repeatedly sampled functional data, presenting a simple model as well as a more extensive mixed model framework, along with multiple functional posterior …


Functional Regression, Jeffrey S. Morris Jan 2015

Jeffrey S. Morris

Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and …


Ordinal Probit Wavelet-Based Functional Models For eQTL Analysis, Mark J. Meyer, Jeffrey S. Morris, Craig P. Hersh, Jarret D. Morrow, Christoph Lange, Brent A. Coull Jan 2015

Jeffrey S. Morris

Current methods for conducting expression Quantitative Trait Loci (eQTL) analysis are limited in scope to pairwise association testing between a single nucleotide polymorphism (SNP) and expression probe set in a region around a gene of interest, thus ignoring the inherent between-SNP correlation. To determine association, p-values are then typically adjusted using Plug-in False Discovery Rate. As many SNPs are interrogated in the region and multiple probe-sets taken, the current approach requires the fitting of a large number of models. We propose to remedy this by introducing a flexible function-on-scalar regression that models the genome as a functional outcome. The …


Estimating Controlled Direct Effects Of Restrictive Feeding Practices In The ‘Early Dieting In Girls’ Study, Yeying Zhu, Debashis Ghosh, Donna L. Coffman, Jennifer S. Williams Jan 2015

Debashis Ghosh

In this article, we examine the causal effect of parental restrictive feeding practices on children’s weight status. An important mediator we are interested in is children’s self-regulation status. Traditional mediation analysis (Baron and Kenny, 1986) applies a structural equation modelling (SEM) approach and decomposes the intent-to-treat (ITT) effect into direct and indirect effects. More recent approaches interpret the mediation effects based on the potential outcomes framework. In practice, there often exist confounders that jointly influence the mediator and the outcome. Inverse probability weighting based on propensity scores is used to adjust for confounding and reduce the dimensionality of confounders simultaneously. …


Equivalence Of Kernel Machine Regression And Kernel Distance Covariance For Multidimensional Trait Association Studies, Wen-Yu Hua, Debashis Ghosh Jan 2015

Debashis Ghosh

Associating genetic markers with a multidimensional phenotype is an important yet challenging problem. In this work, we establish the equivalence between two popular methods: kernel-machine regression (KMR), and kernel distance covariance (KDC). KMR is a semiparametric regression framework that models covariate effects parametrically and genetic markers non-parametrically, while KDC represents a class of methods that include distance covariance (DC) and Hilbert-Schmidt independence criterion (HSIC), which are nonparametric tests of independence. We show that the equivalence between the score test of KMR and the KDC statistic under certain conditions can lead to a novel generalization of the KDC test that incorporates …


A General Approach To Goodness Of Fit For U Processes, Debashis Ghosh, Youngjoo Cho Jan 2015

Debashis Ghosh

Goodness of fit procedures are essential tools for assessing model adequacy in statistics. In this work, we present a general theory and approach to goodness of fit techniques based on U-processes for the accelerated failure time (AFT) model. Many of the examples will focus on U-statistics of order 2. While many authors have proposed goodness of fit tests for U-statistics of order one, less has been developed for higher order U-statistics. In this paper, we propose goodness of fit tests for U-statistics of order 2 by using theoretical results from Nolan and Pollard (1987) and Nolan and Pollard (1988). We …


A Boosting Algorithm For Estimating Generalized Propensity Scores With Continuous Treatments, Yeying Zhu, Donna L. Coffman, Debashis Ghosh Jan 2015

Debashis Ghosh

In this article, we study the causal inference problem with a continuous treatment variable using propensity score-based methods. For a continuous treatment, the generalized propensity score is defined as the conditional density of the treatment level given covariates (confounders). The dose–response function is then estimated by inverse probability weighting, where the weights are calculated from the estimated propensity scores. When the dimension of the covariates is large, the traditional nonparametric density estimation suffers from the curse of dimensionality. Some researchers have suggested a two-step estimation procedure by first modeling the mean function. In this study, we suggest a boosting algorithm to …


Testing Hypotheses About Medical Test Accuracy: Considerations For Design And Inference, Adam J. Branscum, Dunlei Cheng, J Jack Lee Jan 2015

Dunlei Cheng

Developing new medical tests and identifying single biomarkers or panels of biomarkers with superior accuracy over existing classifiers promotes lifelong health of individuals and populations. Before a medical test can be routinely used in clinical practice, its accuracy within diseased and non-diseased populations must be rigorously evaluated. We introduce a method for sample size determination for studies designed to test hypotheses about medical test or biomarker sensitivity and specificity. We show how a sample size can be determined to guard against making type I and/or type II errors by calculating Bayes factors from multiple data sets simulated under null and/or …


The Number Of Subjects Per Variable Required In Linear Regression Analyses, Peter Austin, Ewout Steyerberg Jan 2015

Peter Austin

Objectives: To determine the number of independent variables that can be included in a linear regression model.

Study Design and Setting: We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R² of the fitted model.

Results: A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, …


Statistical Power In Parallel Group Point Exposure Studies With Time-To-Event Outcomes: An Empirical Comparison Of The Performance Of Randomized Controlled Trials And The Inverse Probability Of Treatment Weighting (IPTW) Approach, Peter Austin, Tibor Schuster, Robert W. Platt Jan 2015

Peter Austin

Background: Estimating statistical power is an important component of the design of both randomized controlled trials (RCTs) and observational studies. Methods for estimating statistical power in RCTs have been well described and can be implemented simply. In observational studies, statistical methods must be used to remove the effects of confounding that can occur due to non-random treatment assignment. Inverse probability of treatment weighting (IPTW) using the propensity score is an attractive method for estimating the effects of treatment using observational data. However, sample size and power calculations have not been adequately described for these methods.

Methods: We used an extensive …


Moving Towards Best Practice When Using Inverse Probability Of Treatment Weighting (IPTW) Using The Propensity Score To Estimate Causal Treatment Effects In Observational Studies, Peter Austin, Elizabeth Stuart Jan 2015

Peter Austin

The propensity score is defined as a subject’s probability of treatment selection, conditional on observed baseline covariates. Weighting subjects by the inverse probability of treatment received creates a synthetic sample in which treatment assignment is independent of measured baseline covariates. Inverse probability of treatment weighting (IPTW) using the propensity score allows one to obtain unbiased estimates of average treatment effects. However, these estimates are only valid if there are no residual systematic differences in observed baseline characteristics between treated and control subjects in the sample weighted by the estimated inverse probability of treatment. We report on a systematic literature review, in …
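
Illustrative sketch (not taken from the article): the weighting and the balance check described in this abstract can be tried on simulated data. Everything below is an assumption made for illustration: the single confounder `x`, the logistic treatment model, the true effect of 2, and the helper names. The propensity score is treated as known, whereas in practice it would be estimated, for example by logistic regression.

```python
import numpy as np


def iptw_weights(treated, ps):
    """Inverse probability of the treatment each subject actually received."""
    return treated / ps + (1 - treated) / (1 - ps)


def weighted_mean_diff(values, treated, w):
    """Weighted mean among treated minus weighted mean among controls."""
    m1 = np.sum(w * treated * values) / np.sum(w * treated)
    m0 = np.sum(w * (1 - treated) * values) / np.sum(w * (1 - treated))
    return m1 - m0


def standardized_diff(x, treated, w):
    """Weighted mean difference in x, scaled by the pooled unweighted SD."""
    var1 = x[treated == 1].var(ddof=1)
    var0 = x[treated == 0].var(ddof=1)
    return weighted_mean_diff(x, treated, w) / np.sqrt((var1 + var0) / 2)


# Hypothetical cohort: one confounder x drives both treatment and outcome.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
ps = 1 / (1 + np.exp(-x))                    # true propensity score (assumed known)
treated = rng.binomial(1, ps)
y = 2.0 * treated + x + rng.normal(size=n)   # true average treatment effect is 2

w = iptw_weights(treated, ps)
unweighted = np.ones(n)

naive_ate = weighted_mean_diff(y, treated, unweighted)
iptw_ate = weighted_mean_diff(y, treated, w)
smd_before = standardized_diff(x, treated, unweighted)
smd_after = standardized_diff(x, treated, w)

print(f"naive difference in means: {naive_ate:.2f}")
print(f"IPTW estimate:             {iptw_ate:.2f}")
print(f"standardized difference in x before weighting: {smd_before:.2f}")
print(f"standardized difference in x after weighting:  {smd_after:.2f}")
```

In this simulation the unweighted comparison is confounded by `x`, while the weighted sample should show a standardized difference in `x` close to zero, which is the kind of balance diagnostic the abstract refers to.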


Mdc-R-Code, Joseph M. Hilbe Nov 2014

Joseph M Hilbe

Modeling Count Data: R code in book provided for use


Simulating Burr Type VII Distributions Through The Method Of L-Moments And L-Correlations, Mohan D. Pant, Todd C. Headrick Aug 2014

Mohan Dev Pant

Burr Type VII, a one-parameter non-normal distribution, is among the less studied distributions, especially in the contexts of statistical modeling and simulation studies. The main purpose of this study is to introduce a methodology for simulating univariate and multivariate Burr Type VII distributions through the method of L-moments and L-correlations. The methodology can be applied in statistical modeling of events in a variety of applied mathematical contexts and Monte Carlo simulation studies. Numerical examples are provided to demonstrate that L-moment-based Burr Type VII distributions are superior to their conventional moment-based analogs in terms of distribution fitting and estimation. Simulation results …


Asymmetry Of The Systematic Risk Of American Real Estate Securities: New Econometric Evidence, Paola De Santis, Carlo Drago Jul 2014

Carlo Drago

In this paper we find an increase in the systematic risk of securities in the American real estate market in 2007, followed by a return to the initial values in 2009, and we highlight the possible presence of structural breaks. To assess this systematic risk, we chose the Fama and French three-factor model and studied the relationship between the excess return of the REIT index, used as a proxy for the performance of American real estate securities, and the excess return of the S&P500 index, which represents the return of the market portfolio. The results confirm the presence of an “Asymmetric REIT Beta Puzzle”, consistent with some previous studies …


Mcd - Stata Commands, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Stata commands and affiliated files for examples in book. Text file explanation of command names is included. 103 files in total