Open Access. Powered by Scholars. Published by Universities.®

Statistical Methodology Commons


Selected Works

2012


Articles 1 - 30 of 31

Full-Text Articles in Statistical Methodology

Obtaining Critical Values For Test Of Markov Regime Switching, Douglas G. Steigerwald, Valerie Bostwick Oct 2012


Douglas G. Steigerwald

For Markov regime-switching models, testing for the possible presence of more than one regime requires the use of a non-standard test statistic. Carter and Steigerwald (forthcoming, Journal of Econometric Methods) derive in detail the analytic steps needed to implement the test of Markov regime-switching proposed by Cho and White (2007, Econometrica). We summarize the implementation steps and address the computational issues that arise. A new command to compute regime-switching critical values, rscv, is introduced and presented in the context of empirical research.


A Doubling Technique For The Power Method Transformations, Mohan D. Pant, Todd C. Headrick Oct 2012


Mohan Dev Pant

Power method polynomials are used for simulating non-normal distributions with specified product moments or L-moments. The power method is capable of producing distributions with extreme values of skew (L-skew) and kurtosis (L-kurtosis). However, these distributions can be extremely peaked and thus not representative of real-world data. To obviate this problem, two families of distributions are introduced based on a doubling technique with symmetric standard normal and logistic power method distributions. The primary focus of the methodology is in the context of L-moment theory. As such, L-moment based systems of equations are derived for simulating univariate and multivariate non-normal distributions with …


An L-Moment-Based Analog For The Schmeiser-Deutsch Class Of Distributions, Todd C. Headrick, Mohan D. Pant Aug 2012


Mohan Dev Pant

This paper characterizes the conventional moment-based Schmeiser-Deutsch (S-D) class of distributions through the method of L-moments. The system can be used in a variety of settings such as simulation or modeling various processes. A procedure is also described for simulating S-D distributions with specified L-moments and L-correlations. The Monte Carlo results presented in this study indicate that the estimates of L-skew, L-kurtosis, and L-correlation associated with the S-D class of distributions are substantially superior to their corresponding conventional product-moment estimators in terms of relative bias—most notably when sample sizes are small.
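As a rough illustration of the sample quantities this abstract refers to, the sketch below computes the first two sample L-moments together with L-skew and L-kurtosis from unbiased probability-weighted moments. It is a minimal, generic estimator and not the authors' code; the function name `sample_l_moments` is hypothetical, and the S-D simulation procedure and the L-correlation are not covered.

```python
from math import comb


def sample_l_moments(data):
    """Return (l1, l2, L-skew, L-kurtosis) for a sample.

    Hypothetical helper: uses the standard unbiased sample
    probability-weighted moments b_0..b_3 of the ordered data.
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least four observations")

    # b[r] = (1/n) * sum_i [C(i, r) / C(n-1, r)] * x_(i+1), with i counted from 0
    b = [
        sum(comb(i, r) * x[i] for i in range(n)) / (n * comb(n - 1, r))
        for r in range(4)
    ]

    # Linear combinations of the PWMs give the first four sample L-moments
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]

    if l2 == 0:
        raise ValueError("L-scale is zero; the ratios are undefined")

    # L-skew and L-kurtosis are ratios to the L-scale l2
    return l1, l2, l3 / l2, l4 / l2
```

For an evenly spaced sample such as `[1, 2, 3, 4, 5]`, the L-skew returned by this sketch is zero up to floating-point error, as expected for symmetric data.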


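A Study Of Data Editing In Foreign Countries And A Multivariate Outlier Detection Method Using The Contaminated Normal Distribution Model (Masayoshi Takahashi, Selective Editing) [諸外国のデータエディティング及び混淆正規分布モデルによる多変量外れ値検出法についての研究], Masayoshi Takahashi Aug 2012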


Masayoshi Takahashi

No abstract provided.


Big Data And The Future, Sherri Rose Jul 2012


Sherri Rose

No abstract provided.


Targeted Maximum Likelihood Estimation For Dynamic Treatment Regimes In Sequential Randomized Controlled Trials, Paul Chaffee, Mark J. Van Der Laan Jun 2012


Paul H. Chaffee

Sequential Randomized Controlled Trials (SRCTs) are rapidly becoming essential tools in the search for optimized treatment regimes in ongoing treatment settings. Analyzing data for multiple time-point treatments with a view toward optimal treatment regimes is of interest in many types of afflictions: HIV infection, Attention Deficit Hyperactivity Disorder in children, leukemia, prostate cancer, renal failure, and many others. Methods for analyzing data from SRCTs exist but they are either inefficient or suffer from the drawbacks of estimating equation methodology. We describe an estimation procedure, targeted maximum likelihood estimation (TMLE), which has been fully developed and implemented in point treatment settings, …


A Logistic L-Moment-Based Analog For The Tukey G-H, G, H, And H-H System Of Distributions, Todd C. Headrick, Mohan D. Pant Jun 2012


Mohan Dev Pant

This paper introduces a standard logistic L-moment-based system of distributions. The proposed system is an analog to the standard normal conventional moment-based Tukey g-h, g, h, and h-h system of distributions. The system also consists of four classes of distributions and is referred to as (i) asymmetric γ-κ, (ii) log-logistic γ, (iii) symmetric κ, and (iv) asymmetric κL-κR. The system can be used in a variety of settings such as simulation or modeling events—most notably when heavy-tailed distributions are of interest. A procedure is also described for simulating γ-κ, γ, κ, and κL-κR distributions with specified L-moments and L-correlations. The …


A Method For Simulating Nonnormal Distributions With Specified L-Skew, L-Kurtosis, And L-Correlation, Todd C. Headrick, Mohan D. Pant May 2012


Mohan Dev Pant

This paper introduces two families of distributions referred to as the symmetric κ and asymmetric κL-κR distributions. The families are based on transformations of standard logistic pseudo-random deviates. The primary focus of the theoretical development is in the contexts of L-moments and the L-correlation. Also included is the development of a method for specifying distributions with controlled degrees of L-skew, L-kurtosis, and L-correlation. The method can be applied in a variety of settings such as Monte Carlo studies, simulation, or modeling events. It is also demonstrated that estimates of L-skew, L-kurtosis, and L-correlation are superior to conventional product-moment estimates of …


Simulating Non-Normal Distributions With Specified L-Moments And L-Correlations, Todd C. Headrick, Mohan D. Pant May 2012


Mohan Dev Pant

This paper derives a procedure for simulating continuous non-normal distributions with specified L-moments and L-correlations in the context of power method polynomials of order three. It is demonstrated that the proposed procedure has computational advantages over the traditional product-moment procedure in terms of solving for intermediate correlations. Simulation results also demonstrate that the proposed L-moment-based procedure is an attractive alternative to the traditional procedure when distributions with more severe departures from normality are considered. Specifically, estimates of L-skew and L-kurtosis are superior to the conventional estimates of skew and kurtosis in terms of both relative bias and relative standard error. …


Variances For Maximum Penalized Likelihood Estimates Obtained Via The EM Algorithm, Mark Segal, Peter Bacchetti, Nicholas Jewell Apr 2012


Mark R Segal

We address the problem of providing variances for parameter estimates obtained under a penalized likelihood formulation through use of the EM algorithm. The proposed solution represents a synthesis of two existent techniques. Firstly, we exploit the supplemented EM algorithm developed in Meng and Rubin (1991) that provides variance estimates for maximum likelihood estimates obtained via the EM algorithm. Their procedure relies on evaluating the Jacobian of the mapping induced by the EM algorithm. Secondly, we utilize a result from Green (1990) that provides an expression for the Jacobian of the mapping induced by the EM algorithm applied to a penalized …


Backcalculation Of HIV Infection Rates, Peter Bacchetti, Mark Segal, Nicholas Jewell Apr 2012


Mark R Segal

Backcalculation is an important method for reconstructing past rates of human immunodeficiency virus (HIV) infection and for estimating current prevalence of HIV infection and future incidence of acquired immunodeficiency syndrome (AIDS). This paper reviews the backcalculation techniques, focusing on the key assumptions of the method, including the necessary information regarding incubation, reporting delay, and models for the infection curve. A summary is given of the extent to which the appropriate external information is available and whether checks of the relevant assumptions are possible through use of data on AIDS incidence from surveillance systems. A likelihood approach to backcalculation is described …


Loss Function Based Ranking In Two-Stage, Hierarchical Models, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, Greg Ridgeway Mar 2012


Rongheng Lin

Several authors have studied the performance of optimal, squared error loss (SEL) estimated ranks. Though these are effective, in many applications interest focuses on identifying the relatively good (e.g., in the upper 10%) or relatively poor performers. We construct loss functions that address this goal and evaluate candidate rank estimates, some of which optimize specific loss functions. We study performance for a fully parametric hierarchical model with a Gaussian prior and Gaussian sampling distributions, evaluating performance for several loss functions. Results show that though SEL-optimal ranks and percentiles do not specifically focus on classifying with respect to a percentile cut …


On The Order Statistics Of Standard Normal-Based Power Method Distributions, Todd C. Headrick, Mohan D. Pant Mar 2012


Mohan Dev Pant

This paper derives a procedure for determining the expectations of order statistics associated with the standard normal distribution (Z) and its powers of order three and five (Z^3 and Z^5). The procedure is demonstrated for sample sizes of n ≤ 9. It is shown that Z^3 and Z^5 have expectations of order statistics that are functions of the expectations for Z and can be expressed in terms of explicit elementary functions for sample sizes of n ≤ 5. For sample sizes of n = 6, 7 the expectations of the order statistics for Z, Z^3, and Z^5 only require a …


A Doubling Method For The Generalized Lambda Distribution, Todd C. Headrick, Mohan D. Pant Feb 2012


Mohan Dev Pant

This paper introduces a new family of generalized lambda distributions (GLDs) based on a method of doubling symmetric GLDs. The focus of the development is in the context of L-moments and L-correlation theory. As such, included is the development of a procedure for specifying double GLDs with controlled degrees of L-skew, L-kurtosis, and L-correlations. The procedure can be applied in a variety of settings such as modeling events and Monte Carlo or simulation studies. Further, it is demonstrated that estimates of L-skew, L-kurtosis, and L-correlation are substantially superior to conventional product-moment estimates of skew, kurtosis, and Pearson correlation in terms …


Targeted Maximum Likelihood Estimation Of Natural Direct Effects, Wenjing Zheng, Mark Van Der Laan Jan 2012


Wenjing Zheng

In many causal inference problems, one is interested in the direct causal effect of an exposure on an outcome of interest that is not mediated by certain intermediate variables. Robins and Greenland (1992) and Pearl (2001) formalized the definition of two types of direct effects (natural and controlled) under the counterfactual framework. The efficient scores (under a nonparametric model) for the various natural effect parameters and their general robustness conditions, as well as an estimating equation based estimator using the efficient score, are provided in Tchetgen Tchetgen and Shpitser (2011b). In this article, we apply the targeted maximum likelihood framework …


Characterizing Tukey H And Hh-Distributions Through L-Moments And The L-Correlation, Todd C. Headrick, Mohan D. Pant Jan 2012


Mohan Dev Pant

This paper introduces the Tukey family of symmetric h and asymmetric hh-distributions in the contexts of univariate L-moments and the L-correlation. Included is the development of a procedure for specifying nonnormal distributions with controlled degrees of L-skew, L-kurtosis, and L-correlations. The procedure can be applied in a variety of settings such as modeling events (e.g., risk analysis, extreme events) and Monte Carlo or simulation studies. Further, it is demonstrated that estimates of L-skew, L-kurtosis, and L-correlation are substantially superior to conventional product-moment estimates of skew, kurtosis, and Pearson correlation in terms of both relative bias and efficiency when heavy-tailed distributions …
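For orientation, the transformation behind these families can be sketched as follows, assuming the usual textbook form q(Z) = Z·exp(hZ²/2) for the symmetric h case and separate tail parameters on each side of zero for the asymmetric hh case. The function names and the parameters `h_left` and `h_right` are hypothetical labels, and the sketch does not reproduce the paper's L-moment-based procedure for choosing the parameters.

```python
import math
import random


def tukey_hh(z, h_left, h_right):
    """Transform a standard normal deviate z into a Tukey hh deviate.

    Assumed form: z * exp(h * z**2 / 2), with h = h_left for z < 0 and
    h = h_right otherwise. Setting h_left == h_right gives the symmetric
    h-distribution; both equal to 0 returns z unchanged.
    """
    h = h_left if z < 0 else h_right
    return z * math.exp(h * z * z / 2.0)


def simulate_hh(n, h_left, h_right, seed=0):
    """Draw n pseudo-random hh deviates from standard normal inputs."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    return [tukey_hh(rng.gauss(0.0, 1.0), h_left, h_right) for _ in range(n)]
```

Larger values of either parameter stretch the corresponding tail, so choosing `h_right > h_left` in this sketch yields a heavier right tail.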


Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris Jan 2012


Jeffrey S. Morris

In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational …


Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do Jan 2012


Jeffrey S. Morris

Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current data integration approaches are limited in that they do not consider the fundamental biological relationships that exist among the data from the different platforms.

Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …


R Code: A Non-Iterative Implementation Of Tango's Score Confidence Interval For A Paired Difference Of Proportions, Zhao Yang Jan 2012


Zhao (Tony) Yang, Ph.D.

For matched-pair binary data, a variety of approaches have been proposed for the construction of a confidence interval (CI) for the difference of marginal probabilities between two procedures. The score-based approximate CI has been shown to outperform other asymptotic CIs. Tango’s method provides a score CI by inverting a score test statistic using an iterative procedure. In the developed R code, we propose an efficient non-iterative method with closed-form expression to calculate Tango’s CIs. Examples illustrate the practical application of the new approach.


The Bivariate Rank-Based Concordance Index For Ordinal And Tied Data, Emanuela Raffinetti, Pier Alda Ferrari Jan 2012


Emanuela Raffinetti

No abstract provided.


Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di Jan 2012


Chongzhi Di

To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcome, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to the so-called “induced informative censoring” in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …


Testing For Regime Switching: A Comment, Douglas Steigerwald, Andrew Carter Dec 2011


Douglas G. Steigerwald

An autoregressive model with Markov-regime switching is analyzed that reflects on the properties of the quasi-likelihood ratio test developed by Cho and White (2007). For such a model, we show that consistency of the quasi-maximum likelihood estimator for the population parameter values, on which consistency of the test is based, does not hold. We describe a condition that ensures consistency of the estimator and discuss the consistency of the test in the absence of consistency of the estimator.


Incorporating Network Structure In Integrative Analysis Of Cancer Prognosis Data, Shuangge Ma Dec 2011


Shuangge Ma

In high-throughput cancer genomic studies, markers identified from the analysis of single datasets may have unsatisfactory properties because of low sample sizes. Integrative analysis pools and analyzes raw data from multiple studies, and can effectively increase sample size and lead to improved marker identification results. In this study, we consider the integrative analysis of multiple high-throughput cancer prognosis studies. In the existing integrative analysis studies, the interplay among genes, which can be described using the network structure, has not been effectively accounted for. In network analysis, tightly-connected nodes (genes) are more likely to have related biological functions and similar regression …


Risk Factors Of Follicular Lymphoma, Shuangge Ma Dec 2011


Shuangge Ma

No abstract provided.


Health Insurance Coverage And Impact: A Survey In Three Cities In China, Shuangge Ma Dec 2011


Shuangge Ma

No abstract provided.


Integrative Analysis Of Multiple Cancer Genomic Datasets Under The Heterogeneity Model, Shuangge Ma Dec 2011


Shuangge Ma

No abstract provided.


Health Insurance Coverage, Medical Expenditure And Coping Strategy: Evidence From Taiwan, Shuangge Ma Dec 2011


Shuangge Ma

No abstract provided.


Impact Of Illness And Medical Expenditure On Household Consumptions: A Survey In Western China, Shuangge Ma Dec 2011


Shuangge Ma

No abstract provided.


Identification Of Gene-Environment Interactions In Cancer Prognosis Studies Using Penalization, Shuangge Ma Dec 2011


Shuangge Ma

High-throughput cancer studies have been extensively conducted, searching for genetic risk factors independently associated with prognosis beyond clinical and environmental risk factors. Many studies have shown that the gene-environment interactions may have important implications. Some of the existing methods, such as the commonly adopted single-marker analysis, may be limited in that they cannot accommodate the joint effects of a large number of genetic markers or use ineffective marker identification techniques. In this study, we analyze cancer prognosis studies, and adopt the AFT (accelerated failure time) model to describe survival. A weighted least squares approach, which has the lowest computational cost, …


Modeling Dependence Using Skew T Copulas: Bayesian Inference And Applications, Michael S. Smith, Quan Gan, Robert Kohn Dec 2011


Michael Stanley Smith

[THIS IS AN AUGUST 2010 REVISION THAT REPLACES ALL PREVIOUS VERSIONS.]

We construct a copula from the skew t distribution of Sahu, Dey & Branco (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose …