Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 30 of 32

Full-Text Articles in Statistics and Probability

Mdc-R-Code 2016 Update, Joseph M. Hilbe Sep 2016

Joseph M Hilbe

Modeling Count Data: R code for download and use. Most recent update


Addition To Pglr Chap 6, Joseph M. Hilbe Aug 2016

Joseph M Hilbe

Addition to Chapter 6 in Practical Guide to Logistic Regression. Added section on Bayesian logistic regression using Stata.


Testing Homogeneity In Semiparametric Mixture Case-Control Models, C Z. Di, G Kc Chan, C Zheng, Ky Liang Jun 2016

Chongzhi Di

Recently, Qin and Liang (Biometrics, 2011) considered a semiparametric mixture case-control model and proposed a score test for homogeneity. The mixture model is semiparametric in the sense that the density ratio of two distributions is assumed to be of exponential form, while the baseline density is unspecified. In a family of parametric admixture models, Di and Liang (Biometrics, 2011) showed that the likelihood ratio test statistic, which is equivalent to a supremum statistic, could improve power over score tests. We generalize the likelihood ratio or supremum statistic to the semiparametric mixture model and demonstrate the power gain over the score …


Hilbe-Pglr-Errata-And-Comments, Joseph M. Hilbe Mar 2016

Joseph M Hilbe

Errata and Comments for Practical Guide to Logistic Regression


Mdc-R-Code, Joseph M. Hilbe Nov 2014

Joseph M Hilbe

Modeling Count Data: R code in book provided for use


Mcd - Stata Commands, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Stata commands and affiliated files for examples in book. Text file explanation of command names is included. 103 files in total


Mcd-Description, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data - description of Data Files with examples using R, Stata and SAS


Mcd-Information-, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data - Information about book and resources


Mcd - 11 R Data Files From Book, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data: ZIP file with 11 R data files from book


Mcd - 11 Stata Data Files, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data: 11 Stata files from book


Hilbe-Mcd-Cvs-Data, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data, data files from book in CSV format


Mcd Information, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Information on Modeling Count Data


Mcd Description Data Files: Stata-R-Sas-Excel, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data: Description of Data Files R, Stata, SAS examples


Mcd-Figures-Code, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data, code for Figures in book - R and Stata


Mdc-Sas-Code, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data, SAS files for download and use


Mcd-Data-Sas, Joseph M. Hilbe Jul 2014

Joseph M Hilbe

Modeling Count Data, 11 SAS data files. SAS users


Interpretation And Prediction Of A Logistic Model, Joseph M. Hilbe Mar 2014

Joseph M Hilbe

A basic overview of how to model and interpret a logistic regression model, as well as how to obtain the predicted probability or fit of the model and calculate its confidence intervals. R code used for all examples; some Stata is provided as a contrast.
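
As a rough illustration of the calculation this abstract describes (it is not taken from the listed document, which uses R), the Python sketch below computes a predicted probability and an approximate Wald confidence interval from a fitted logistic model. The interval is built on the logit scale and then transformed back, which is one common approach. The function names, coefficients, and covariance matrix are hypothetical values chosen only for the example.

```python
import math

def inv_logit(eta):
    """Map a linear predictor (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def predict_with_ci(beta, cov, x, z=1.959964):
    """Predicted probability and approximate Wald interval for one covariate row.

    beta : fitted coefficients (hypothetical values in the example below)
    cov  : variance-covariance matrix of the coefficients, as nested lists
    x    : covariate values, with a leading 1.0 for the intercept
    z    : standard normal quantile (about 1.96 for a 95% interval)
    """
    # Linear predictor: eta = x' beta
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Variance of the linear predictor: x' V x
    n = len(x)
    var = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    se = math.sqrt(var)
    # Build the interval on the logit scale, then back-transform,
    # so both limits stay inside (0, 1).
    return inv_logit(eta), inv_logit(eta - z * se), inv_logit(eta + z * se)

# Hypothetical fitted model: intercept and one predictor.
beta = [-1.0, 0.5]
cov = [[0.04, -0.01],
       [-0.01, 0.01]]
p, lower, upper = predict_with_ci(beta, cov, [1.0, 2.0])
print(f"p = {p:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Here the linear predictor is -1.0 + 0.5 × 2.0 = 0, so the predicted probability is 0.5 and the interval is symmetric around it; for other covariate values the back-transformed interval is generally asymmetric.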


Sas Macro: Testing Marginal Homogeneity In Clustered Matched-Pair Data, Zhao Yang Jan 2014

Zhao (Tony) Yang, Ph.D.

The SAS Macro and simulated data example are used to demonstrate the application of tests for marginal homogeneity in clustered matched-pair data.


Sas Macro: Weighted Kappa Statistic For Clustered Matched-Pair Ordinal Data, Zhao Yang Jan 2014

Zhao (Tony) Yang, Ph.D.

This SAS macro calculates the weighted kappa statistic and its corresponding non-parametric variance estimator for clustered matched-pair ordinal data.


Sas Macro: Kappa Statistic For Clustered Physician-Patients Polytomous Data, Zhao Yang Jan 2014

Zhao (Tony) Yang, Ph.D.

This SAS macro calculates the kappa statistic and its semi-parametric variance estimator for clustered physician-patients polytomous data. The proposed method depends on the assumption of conditional independence for the clustered physician-patients data structure.


On Likelihood Ratio Tests When Nuisance Parameters Are Present Only Under The Alternative, Cz Di, K-Y Liang Jan 2014

Chongzhi Di

In parametric models, when one or more parameters disappear under the null hypothesis, the likelihood ratio test statistic does not converge to chi-square distributions. Rather, its limiting distribution is shown to be equivalent to that of the supremum of a squared Gaussian process. However, the limiting distribution is analytically intractable for most examples, and approximation- or simulation-based methods must be used to calculate the p-values. In this article, we investigate conditions under which the asymptotic distributions have analytically tractable forms, based on the principal component decomposition of Gaussian processes. When these conditions are not satisfied, the principal …


Sas Macro: Kappa Statistic For Clustered Matched-Pair Data, Zhao Yang Jan 2013

Zhao (Tony) Yang, Ph.D.

The SAS macro was developed to calculate the kappa statistic for the clustered matched-pair data.


Generalized Estimating Equations, Second Edition.Pdf, James W. Hardin, Joseph M. Hilbe Dec 2012

Joseph M Hilbe

Generalized Estimating Equations, Second edition, updates the best-selling previous edition, which has been the standard text on the subject since it was published a decade ago. Combining theory and application, the text provides readers with a comprehensive discussion of GEE and related models. Numerous examples are employed throughout the text, along with the software code used to create, run, and evaluate the models being examined. Stata is used as the primary software for running and displaying modeling output; associated R code is also given to allow R users to replicate Stata examples. Specific examples of SAS usage are provided in …


R Code: A Non-Iterative Implementation Of Tango's Score Confidence Interval For A Paired Difference Of Proportions, Zhao Yang Jan 2012

Zhao (Tony) Yang, Ph.D.

For matched-pair binary data, a variety of approaches have been proposed for the construction of a confidence interval (CI) for the difference of marginal probabilities between two procedures. The score-based approximate CI has been shown to outperform other asymptotic CIs. Tango’s method provides a score CI by inverting a score test statistic using an iterative procedure. In the developed R code, we propose an efficient non-iterative method with closed-form expression to calculate Tango’s CIs. Examples illustrate the practical application of the new approach.


The Bivariate Rank-Based Concordance Index For Ordinal And Tied Data, Emanuela Raffinetti, Pier Alda Ferrari Jan 2012

Emanuela Raffinetti

No abstract provided.


Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di Jan 2012

Chongzhi Di

To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcome, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to the so-called “induced informative censoring” in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …


Multilevel Latent Class Models With Dirichlet Mixing Distribution, Chong-Zhi Di, Karen Bandeen-Roche Jan 2011

Chongzhi Di

Latent class analysis (LCA) and latent class regression (LCR) are widely used for modeling multivariate categorical outcomes in social sciences and biomedical studies. Standard analyses assume data of different respondents to be mutually independent, excluding application of the methods to familial and other designs in which participants are clustered. In this paper, we consider multilevel latent class models, in which sub-population mixing probabilities are treated as random effects that vary among clusters according to a common Dirichlet distribution. We apply the Expectation-Maximization (EM) algorithm for model fitting by maximum likelihood (ML). This approach works well, but is computationally intensive when …


Likelihood Ratio Testing For Admixture Models With Application To Genetic Linkage Analysis, Chong-Zhi Di, Kung-Yee Liang Jan 2011

Chongzhi Di

We consider likelihood ratio tests (LRT) and their modifications for homogeneity in admixture models. The admixture model is a special case of a two-component mixture model, where one component is indexed by an unknown parameter while the parameter value for the other component is known. It has been widely used in genetic linkage analysis under heterogeneity, in which the kernel distribution is binomial. For such models, it has long been recognized that testing for homogeneity is nonstandard and the LRT statistic does not converge to a conventional χ² distribution. In this paper, we investigate the asymptotic behavior of the LRT for …


Multilevel Functional Principal Component Analysis, Chong-Zhi Di, Ciprian M. Crainiceanu, Brian S. Caffo, Naresh M. Punjabi Jan 2009

Chongzhi Di

The Sleep Heart Health Study (SHHS) is a comprehensive landmark study of sleep and its impacts on health outcomes. A primary metric of the SHHS is the in-home polysomnogram, which includes two electroencephalographic (EEG) channels for each subject, at two visits. The volume and importance of these data present enormous challenges for analysis. To address these challenges, we introduce multilevel functional principal component analysis (MFPCA), a novel statistical methodology designed to extract core intra- and inter-subject geometric components of multilevel functional data. Though motivated by the SHHS, the proposed methodology is generally applicable, with potential relevance to many modern scientific …


Nonparametric Signal Extraction And Measurement Error In The Analysis Of Electroencephalographic Activity During Sleep, Ciprian M. Crainiceanu, Brian S. Caffo, Chong-Zhi Di, Naresh M. Punjabi Jan 2009

Chongzhi Di

We introduce methods for signal and associated variability estimation based on hierarchical nonparametric smoothing with application to the Sleep Heart Health Study (SHHS). SHHS is the largest electroencephalographic (EEG) collection of sleep-related data, which contains, at each visit, two quasi-continuous EEG signals for each subject. The signal features extracted from EEG data are then used in second-level analyses to investigate the relation between health, behavioral, or biometric outcomes and sleep. Using subject-specific signals estimated with known variability in a second-level regression becomes a nonstandard measurement error problem. We propose and implement methods that take into account cross-sectional and …