Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Statistical Theory

2006

Articles 1 - 30 of 50

Full-Text Articles in Statistics and Probability

A Note On Empirical Likelihood Inference Of Residual Life Regression, Ying Qing Chen, Yichuan Zhao Dec 2006

Yichuan Zhao

The mean residual life function, or life expectancy, is an important function for characterizing the distribution of residual life. The proportional mean residual life model of Oakes and Dasu (1990) is a regression tool for studying the association between life expectancy and its associated covariates. Although semiparametric inference procedures have been proposed in the literature, the accuracy of such procedures may be low when the censoring proportion is relatively large. In this paper, the semiparametric inference procedures are studied with an empirical likelihood ratio method. An empirical likelihood confidence region is constructed for the regression parameters. The proposed method is further compared …
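
For reference, a minimal statement of the Oakes and Dasu proportional mean residual life model discussed above, in generic notation (not necessarily the paper's): the mean residual life of a survival time T given covariates Z is

    m(t \mid Z) = E[\, T - t \mid T > t, Z \,],

and the model assumes

    m(t \mid Z) = m_0(t) \exp(\beta^\top Z),

where m_0 is an unspecified baseline mean residual life function and \beta is the regression parameter for which the empirical likelihood confidence region is constructed.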


Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Properties Of Monotonic Effects, Tyler J. Vanderweele, James M. Robins Nov 2006

COBRA Preprint Series

Various relationships are shown to hold between monotonic effects, weak monotonic effects, and the monotonicity of certain conditional expectations. These relationships are considered for both binary and non-binary variables. Counterexamples are provided to show that the results do not hold under less restrictive conditions. The ideas of monotonic effects are furthermore used to relate signed edges on a directed acyclic graph to qualitative effect modification.


Multiple Testing With An Empirical Alternative Hypothesis, James E. Signorovitch Nov 2006

Harvard University Biostatistics Working Paper Series

An optimal multiple testing procedure is identified for linear hypotheses under the general linear model, maximizing the expected number of false null hypotheses rejected at any significance level. The optimal procedure depends on the unknown data-generating distribution, but can be consistently estimated. Drawing information together across many hypotheses, the estimated optimal procedure provides an empirical alternative hypothesis by adapting to underlying patterns of departure from the null. Proposed multiple testing procedures based on the empirical alternative are evaluated through simulations and an application to gene expression microarray data. Compared to a standard multiple testing procedure, it is not unusual for …


Large Cluster Asymptotics For Gee: Working Correlation Models, Hyoju Chung, Thomas Lumley Oct 2006

UW Biostatistics Working Paper Series

This paper presents large cluster asymptotic results for generalized estimating equations. The complexity of the working correlation model is characterized in terms of the number of working correlation components to be estimated. When the cluster size is relatively large, we may encounter a situation where a high-dimensional working correlation matrix is modeled and estimated from the data. In the present asymptotic setting, the cluster size and the complexity of the working correlation model grow with the number of independent clusters. We show the existence, weak consistency and asymptotic normality of marginal regression parameter estimators using the results of empirical process theory and …
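
For background, the generalized estimating equations in their usual form, written in generic notation: the marginal regression parameter \beta solves

    \sum_{i=1}^{n} D_i^\top V_i^{-1} \{ Y_i - \mu_i(\beta) \} = 0, \qquad V_i \propto A_i^{1/2} R_i(\alpha) A_i^{1/2},

where D_i = \partial \mu_i / \partial \beta, A_i is the diagonal matrix of marginal variances, and R_i(\alpha) is the working correlation matrix. The "working correlation components to be estimated" referred to above are the entries of \alpha, whose number can grow quickly with cluster size under unstructured specifications.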


Bayesian Hidden Markov Modeling Of Array Cgh Data, Subharup Guha, Yi Li, Donna Neuberg Oct 2006

Harvard University Biostatistics Working Paper Series

Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about DNA copy number. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for …


Targeted Maximum Likelihood Learning, Mark J. Van Der Laan, Daniel Rubin Oct 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose one observes a sample of independent and identically distributed observations from a particular data generating distribution. Suppose that one has available an estimate of the density of the data generating distribution such as a maximum likelihood estimator according to a given or data adaptively selected model. Suppose that one is concerned with estimation of a particular pathwise differentiable Euclidean parameter. A substitution estimator evaluating the parameter of the density estimator is typically too biased and might not even converge at the parametric rate: that is, the density estimator was targeted to be a good estimator of the density and …


Spatial Cluster Detection For Censored Outcome Data, Andrea J. Cook, Diane Gold, Yi Li Sep 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Diagnosing Bias In The Inverse Probability Of Treatment Weighted Estimator Resulting From Violation Of Experimental Treatment Assignment, Yue Wang, Maya L. Petersen, David Bangsberg, Mark J. Van Der Laan Sep 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Inverse probability of treatment weighting (IPTW) is frequently used to estimate the causal effects of treatments and interventions. The consistency of the IPTW estimator relies not only on the well-recognized assumption of no unmeasured confounders (Sequential Randomization Assumption, or SRA), but also on the assumption of experimentation in the assignment of treatment (Experimental Treatment Assignment, or ETA). In finite samples, violations of the ETA assumption can occur simply by chance; certain treatments become rare or non-existent for certain strata of the population. Such practical violations of the ETA assumption occur frequently in real data, and can result in significant …
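
For reference, the IPTW estimator in its simplest point-treatment form, in generic notation: with outcome Y, treatment A, covariates W, and an estimate g_n(a \mid W) of the treatment mechanism P(A = a \mid W), the counterfactual mean E[Y_a] is estimated by

    \hat{\psi}_n(a) = \frac{1}{n} \sum_{i=1}^{n} \frac{ I(A_i = a)\, Y_i }{ g_n(a \mid W_i) }.

The ETA assumption requires g(a \mid W) to be bounded away from zero in every covariate stratum; when it nearly fails, some weights 1 / g_n(a \mid W_i) become extreme and the estimator can become unstable and biased.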


Extending Marginal Structural Models Through Local, Penalized, And Additive Learning, Daniel Rubin, Mark J. Van Der Laan Sep 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Marginal structural models (MSMs) allow one to form causal inferences from data, by specifying a relationship between a treatment and the marginal distribution of a corresponding counterfactual outcome. Following their introduction in Robins (1997), MSMs have typically been fit after assuming a semiparametric model, and then estimating a finite dimensional parameter. van der Laan and Dudoit (2003) proposed to instead view MSM fitting not as a task of semiparametric parameter estimation, but of nonparametric function approximation. They introduced a class of causal effect estimators based on mapping loss functions suitable for the unavailable counterfactual data to those suitable for the …
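
For orientation, a minimal MSM example in generic notation (not the authors' specification): for the counterfactual outcome Y_a under treatment a and baseline covariates V, one might posit

    E[\, Y_a \mid V \,] = m(a, V \mid \beta), \qquad \text{e.g. } m(a, V \mid \beta) = \beta_0 + \beta_1 a + \beta_2 V,

and the classical IPTW fit solves the weighted estimating equation

    \sum_{i=1}^{n} \frac{ h(A_i, V_i) }{ g_n(A_i \mid W_i) } \{ Y_i - m(A_i, V_i \mid \beta) \} = 0

for a user-chosen function h, typically \partial m / \partial \beta. The loss-based view described above instead treats m as an unknown function to be learned nonparametrically.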


Statistical Learning Of Origin-Specific Statically Optimal Individualized Treatment Rules, Mark J. Van Der Laan, Maya L. Petersen Sep 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Consider a longitudinal observational or controlled study in which one collects chronological data over time on n randomly sampled subjects. The time-dependent process one observes on each randomly sampled subject contains time-dependent covariates, time-dependent treatment actions, and an outcome process or single final outcome of interest. A statically optimal individualized treatment rule (as introduced in van der Laan, Petersen & Joffe (2005), Petersen & van der Laan (2006)) is an (unknown) treatment rule which at any point in time conditions on a user-supplied subset of the past, computes the future static treatment regimen that maximizes a (conditional) mean future outcome …


Comparing The Statistical Tests For Homogeneity Of Variances., Zhiqiang Mu Aug 2006

Electronic Theses and Dissertations

Testing the homogeneity of variances is an important problem in many applications since statistical methods of frequent use, such as ANOVA, assume equal variances for two or more groups of data. However, testing the equality of variances is a difficult problem due to the fact that many of the tests are not robust against non-normality. It is known that the kurtosis of the distribution of the source data can affect the performance of the tests for variance. We review the classical tests and their latest, more robust modifications, some other tests that have recently appeared in the literature, and use …
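
As a concrete illustration of the classical tests being compared, here is a hedged sketch (generic code, not the thesis's) applying Bartlett's, Brown-Forsythe/Levene's, and Fligner-Killeen tests from SciPy to three simulated groups, one of them heavy-tailed:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2006)
g1 = rng.normal(0, 1.0, 50)
g2 = rng.normal(0, 1.5, 50)
g3 = rng.standard_t(df=3, size=50)   # heavy-tailed group (high kurtosis)

# Bartlett's test: powerful under normality, sensitive to kurtosis.
print(stats.bartlett(g1, g2, g3))

# Levene's test with the Brown-Forsythe modification (center='median'):
# more robust to departures from normality.
print(stats.levene(g1, g2, g3, center='median'))

# Fligner-Killeen test: a rank-based, robust alternative.
print(stats.fligner(g1, g2, g3))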


Predicting Future Responses Based On Possibly Misspecified Working Models, Tianxi Cai, Lu Tian, Scott D. Solomon, L.J. Wei Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


The Combination Of Ecological And Case-Control Data, Sebastien Haneuse, Jon Wakefield Jul 2006

UW Biostatistics Working Paper Series

Ecological studies, in which data are available at the level of the group, rather than at the level of the individual, are susceptible to a range of biases due to their inability to characterize within-group variability in exposures and confounders. In order to overcome these biases, we propose a hybrid design in which ecological data are supplemented with a sample of individual-level case-control data. We develop the likelihood for this design and illustrate its benefits via simulation, both in bias reduction when compared to an ecological study, and in efficiency gains relative to a conventional case-control study. An interesting special …


Doubly Robust Censoring Unbiased Transformations, Daniel Rubin, Mark J. Van Der Laan Jun 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

We consider random design nonparametric regression when the response variable is subject to right censoring. Following the work of Fan and Gijbels (1994), a common approach to this problem is to apply what has been termed a censoring unbiased transformation to the data to obtain surrogate responses, and then enter these surrogate responses with covariate data into standard smoothing algorithms. Existing censoring unbiased transformations generally depend on either the conditional survival function of the response of interest, or that of the censoring variable. We show that a mapping introduced in another statistical context is in fact a censoring unbiased transformation …


A Method To Increase The Power Of Multiple Testing Procedures Through Sample Splitting, Daniel Rubin, Sandrine Dudoit, Mark J. Van Der Laan Jun 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis is associated with a test statistic, and large test statistics provide evidence against the null hypotheses. One proposal to provide probabilistic control of Type-I errors is the use of procedures ensuring that the expected number of false positives does not exceed a user-supplied threshold. Among such multiple testing procedures, we derive the "most powerful" method, meaning the test statistic cutoffs that maximize the expected number of true positives. Unfortunately, these optimal cutoffs depend on the true unknown data generating distribution, so could never be used …
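
A hedged sketch of the baseline idea, not the optimal procedure derived in the paper: with valid null p-values, rejecting every hypothesis with p <= k/m keeps the expected number of false positives at or below k, since E[V] <= m_0 (k/m) <= k.

import numpy as np

def pfer_reject(pvalues, k):
    """Reject hypotheses with p <= k/m, controlling E[false positives] <= k."""
    p = np.asarray(pvalues)
    return p <= k / p.size

rng = np.random.default_rng(0)
p_null = rng.uniform(size=800)           # p-values from true nulls
p_alt = rng.beta(0.5, 10.0, size=200)    # p-values shifted toward zero
pvals = np.concatenate([p_null, p_alt])
rejected = pfer_reject(pvals, k=5)
print(rejected.sum(), "rejections")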


Bayesian Reference Inference On The Ratio Of Poisson Rates., Changbin Guo May 2006

Electronic Theses and Dissertations

Bayesian reference analysis is a method of determining the prior under the Bayesian paradigm. It incorporates as little information as possible from the experiment. Estimation of the ratio of two independent Poisson rates is a common practical problem. In this thesis, the method of reference analysis is applied to derive the posterior distribution of the ratio of two independent Poisson rates, and then to construct point and interval estimates based on the reference posterior. In addition, the Frequentist coverage property of HPD intervals is verified through simulation.
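
A rough illustration of the interval construction and the coverage check, under simplifying assumptions: independent Jeffreys Gamma(1/2) priors are used here as a stand-in for the reference prior derived in the thesis, and an equal-tailed interval replaces the HPD interval.

import numpy as np

rng = np.random.default_rng(1)

def ratio_interval(x1, x2, t1=1.0, t2=1.0, draws=20000, level=0.95):
    # Posterior draws for each rate under a Gamma(x + 1/2) posterior (Jeffreys-type prior).
    lam1 = rng.gamma(x1 + 0.5, 1.0 / t1, draws)
    lam2 = rng.gamma(x2 + 0.5, 1.0 / t2, draws)
    alpha = 1.0 - level
    return np.quantile(lam1 / lam2, [alpha / 2, 1 - alpha / 2])

# Monte Carlo check of frequentist coverage when lambda1/lambda2 = 2.
true_ratio, lam2, n_rep, hits = 2.0, 3.0, 2000, 0
for _ in range(n_rep):
    x1 = rng.poisson(true_ratio * lam2)
    x2 = rng.poisson(lam2)
    lo, hi = ratio_interval(x1, x2)
    hits += (lo <= true_ratio <= hi)
print("empirical coverage:", hits / n_rep)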


Ancova: A Robust Omnibus Test Based On Selected Design Points, Rand R. Wilcox May 2006

Journal of Modern Applied Statistical Methods

Many robust analogs of the classic analysis of covariance method have been proposed. One approach, when comparing two independent groups, uses selected design points and then compares the groups at each design point using some robust method for comparing measures of location. So, if K design points are of interest, K tests are performed. There are rather obvious ways of performing, instead, an omnibus test that for all K points, no differences between the groups exist. One of the main results here is that several variations of these methods can perform very poorly in simulations. An alternative approach, based in …


The Effect On Type I Error And Power Of Various Methods Of Resolving Ties For Six Distribution-Free Tests Of Location, Bruce R. Fay May 2006

Journal of Modern Applied Statistical Methods

The impact on Type I error robustness and power for nine different methods of resolving ties was assessed for six distribution-free statistics with four empirical data sets using Monte Carlo techniques. These statistics share an underlying assumption of population continuity such that samples are assumed to have no equal data values (no zero difference scores, no tied ranks). The best results across all tests and combinations of simulation parameters were obtained by randomly resolving ties, although there were exceptions. The method of dropping ties and reducing the sample size performed poorly.


Limitations Of The Analysis Of Variance, Phillip I. Good, Clifford E. Lunneborg May 2006

Journal of Modern Applied Statistical Methods

Conditions under which the analysis of variance will yield inexact p-values or would be inferior in power to a permutation test are investigated. The findings for the one-way design are consistent with and extend those of Miller (1980).
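
A generic sketch of the permutation alternative mentioned above (not the authors' code): group labels are permuted to build a reference distribution for the one-way F statistic.

import numpy as np
from scipy import stats

def permutation_f_test(groups, n_perm=5000, seed=0):
    rng = np.random.default_rng(seed)
    data = np.concatenate(groups)
    sizes = [len(g) for g in groups]
    f_obs = stats.f_oneway(*groups).statistic
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(data)
        resampled = np.split(perm, np.cumsum(sizes)[:-1])
        count += stats.f_oneway(*resampled).statistic >= f_obs
    return f_obs, (count + 1) / (n_perm + 1)   # observed F and permutation p-value

rng = np.random.default_rng(42)
groups = [rng.exponential(1.0, 10), rng.exponential(1.0, 10), rng.exponential(1.5, 10)]
print(permutation_f_test(groups))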


Multiple Comparison Procedures, Trimmed Means And Transformed Statistics, Rhonda K. Kowalchuk, H. J. Keselman, Rand R. Wilcox, James Algina May 2006

Journal of Modern Applied Statistical Methods

A modification to testing pairwise comparisons that may provide better control of Type I errors in the presence of non-normality is to use a preliminary test for symmetry, which determines whether data should be trimmed symmetrically or asymmetrically. Several pairwise MCPs were investigated, employing a test of symmetry with a number of heteroscedastic test statistics that used trimmed means and Winsorized variances. Results showed better Type I error control than that achieved by competing robust statistics.
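
For readers unfamiliar with the building blocks named above, a small illustration of a 20% trimmed mean and a 20% Winsorized variance using SciPy (generic helpers, not the authors' MCP implementation):

import numpy as np
from scipy import stats
from scipy.stats import mstats

x = np.array([2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 3.6, 4.0, 9.8])  # one outlier

trimmed_mean = stats.trim_mean(x, proportiontocut=0.2)     # drop 20% in each tail
winsorized = mstats.winsorize(x, limits=(0.2, 0.2))        # pull tails in instead
winsorized_var = np.var(np.asarray(winsorized), ddof=1)

print("ordinary mean:", x.mean())
print("20% trimmed mean:", trimmed_mean)
print("20% Winsorized variance:", winsorized_var)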


Understanding Eurasian Convergence: Application Of Kohonen Self-Organizing Maps, Joel I. Deichmann, Abdolreza Eshghi, Dominique Haughton, Selin Sayek, Nicholas Teebagy, Heikki Topi May 2006

Journal of Modern Applied Statistical Methods

Kohonen self-organizing maps (SOMs) are employed to examine economic and social convergence of Eurasian countries based on a set of twenty-eight socio-economic measures. A core of European Union states is identified that provides a benchmark against which convergence of post-socialist transition economies may be judged. The Central European Visegrád countries and Baltics show the greatest economic convergence to Western Europe, while other states form clusters that lag behind. Initial conditions on the social dimension can either facilitate or constrain economic convergence, as discovered in Central Europe vis-à-vis the Central Asian Republics. Disquiet in the convergence literature is resolved by providing …


Entropy Criterion In Logistic Regression And Shapley Value Of Predictors, Stan Lipovetsky May 2006

Journal of Modern Applied Statistical Methods

An entropy criterion is used for constructing a binary response regression model with a logistic link. This approach yields a logistic model with coefficients proportional to the coefficients of linear regression. Based on this property, Shapley value estimation of the predictors’ contributions is applied to obtain robust coefficients of the linear aggregate adjusted to the logistic model. This procedure produces a logistic regression with interpretable coefficients that are robust to multicollinearity. Numerical results demonstrate theoretical and practical advantages of the entropy-logistic regression.


Choosing Smoothing Parameters For Exponential Smoothing: Minimizing Sums Of Squared Versus Sums Of Absolute Errors, Terry E. Dielman May 2006

Journal of Modern Applied Statistical Methods

Smoothing parameters in exponential smoothing can be chosen either by minimizing the sum of squared one-step-ahead forecast errors or by minimizing the sum of the absolute one-step-ahead forecast errors. In this article, the resulting forecast accuracy is used to compare these two options.
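
A minimal sketch of the comparison being made (generic code, not the article's): simple exponential smoothing with the smoothing constant chosen by a grid search under each loss.

import numpy as np

def one_step_errors(y, alpha):
    level = y[0]
    errors = []
    for obs in y[1:]:
        errors.append(obs - level)             # one-step-ahead forecast error
        level = level + alpha * (obs - level)  # exponential smoothing update
    return np.array(errors)

def choose_alpha(y, loss):
    grid = np.linspace(0.01, 0.99, 99)
    scores = [loss(one_step_errors(y, a)) for a in grid]
    return grid[int(np.argmin(scores))]

rng = np.random.default_rng(7)
y = 10 + np.cumsum(rng.normal(0, 0.3, 200)) + rng.normal(0, 1.0, 200)

alpha_sse = choose_alpha(y, lambda e: np.sum(e ** 2))     # sum of squared errors
alpha_sae = choose_alpha(y, lambda e: np.sum(np.abs(e)))  # sum of absolute errors
print("alpha (SSE):", alpha_sse, " alpha (SAE):", alpha_sae)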


Penalized Splines For Longitudinal Data With An Application In Aids Studies, Hua Liang, Yuanhui Xiao May 2006

Journal of Modern Applied Statistical Methods

A penalized spline approximation is proposed for nonparametric regression with longitudinal data. Standard linear mixed-effects modeling can be applied for the estimation. The approach is relatively simple, efficiently computed, and robust to the selection of smoothing parameters, a difficulty often encountered when local polynomial and smoothing spline techniques are used to analyze longitudinal data sets. The method is extended to time-varying coefficient mixed-effects models. The proposed methods are applied to data from an AIDS clinical study. Biological interpretations and clinical implications are discussed. Simulation studies are conducted to illustrate the proposed methods.
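
For context, the standard mixed-model representation of a penalized spline with a truncated-line basis, in generic notation (the usual formulation, not necessarily the authors' exact specification):

    y_i = \beta_0 + \beta_1 x_i + \sum_{k=1}^{K} u_k (x_i - \kappa_k)_+ + \varepsilon_i, \qquad u_k \sim N(0, \sigma_u^2), \quad \varepsilon_i \sim N(0, \sigma_\varepsilon^2),

where \kappa_1 < \dots < \kappa_K are fixed knots. Treating the u_k as random effects turns spline fitting into fitting a linear mixed-effects model, with the amount of smoothing governed by the variance ratio \sigma_\varepsilon^2 / \sigma_u^2.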


Analysis Of Type-Ii Progressively Hybrid Censored Competing Risks Data, Debasis Kundu, Avijit Joarder May 2006

Journal of Modern Applied Statistical Methods

A Type-II progressively hybrid censoring scheme for competing risks data is introduced, where the experiment terminates at a pre-specified time. The likelihood inference of the unknown parameters is derived under the assumptions that the lifetime distributions of the different causes are independent and exponentially distributed. The maximum likelihood estimators of the unknown parameters are obtained in exact forms. Asymptotic confidence intervals and two bootstrap confidence intervals are also proposed. Bayes estimates and credible intervals of the unknown parameters are obtained under the assumption of gamma priors on the unknown parameters. Different methods have been compared using Monte Carlo simulations. One …
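
As background, a generic statement of the exponential competing-risks likelihood (with the total time on test W written abstractly rather than in the paper's exact progressively hybrid censored form): with D_j failures attributed to cause j,

    L(\lambda_1, \lambda_2) \propto \lambda_1^{D_1} \lambda_2^{D_2} \exp\{ -(\lambda_1 + \lambda_2) W \},

so the maximum likelihood estimators take the closed form \hat{\lambda}_j = D_j / W whenever D_j \ge 1.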


The Efficiency Of Ols In The Presence Of Auto-Correlated Disturbances In Regression Models, Samir Safi, Alexander White May 2006

Journal of Modern Applied Statistical Methods

The ordinary least squares (OLS) estimates in the regression model are efficient when the disturbances have mean zero, constant variance, and are uncorrelated. In problems concerning time series, it is often the case that the disturbances are correlated. Using computer simulations, the robustness of various estimators is considered, including estimated generalized least squares. It was found that if the disturbance structure is autoregressive and the dependent variable is nonstochastic and linear or quadratic, OLS performs nearly as well as its competitors. For other forms of the dependent variable, rules of thumb are presented to guide practitioners in the choice …
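
A hedged simulation sketch in the spirit of the study described above (generic code, not the authors'): OLS is compared with a feasible GLS fit in the Cochrane-Orcutt style when the disturbances follow an AR(1) process and the regressor is a nonstochastic linear trend.

import numpy as np

def simulate_ar1(n, rho, sigma, rng):
    e = np.empty(n)
    e[0] = rng.normal(0, sigma / np.sqrt(1 - rho ** 2))  # stationary start
    for t in range(1, n):
        e[t] = rho * e[t - 1] + rng.normal(0, sigma)
    return e

rng = np.random.default_rng(11)
n, rho, beta = 100, 0.7, np.array([1.0, 0.5])
x = np.arange(n, dtype=float)                 # nonstochastic linear trend
X = np.column_stack([np.ones(n), x])

ols_est, gls_est = [], []
for _ in range(2000):
    y = X @ beta + simulate_ar1(n, rho, 1.0, rng)
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    # Feasible GLS: estimate rho from OLS residuals, then quasi-difference.
    r = y - X @ b_ols
    rho_hat = np.sum(r[1:] * r[:-1]) / np.sum(r[:-1] ** 2)
    ys = y[1:] - rho_hat * y[:-1]
    Xs = X[1:] - rho_hat * X[:-1]
    b_gls = np.linalg.lstsq(Xs, ys, rcond=None)[0]
    ols_est.append(b_ols[1])
    gls_est.append(b_gls[1])

print("OLS slope variance :", np.var(ols_est))
print("FGLS slope variance:", np.var(gls_est))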


Comparison Of Some Simple Estimators Of The Lognormal Parameters Based On Censored Samples, Baklizi Ayman, Mohammed Al-Haj Ebrahem May 2006

Journal of Modern Applied Statistical Methods

Point estimation of the parameters of the lognormal distribution with censored data is considered. The often-employed maximum likelihood estimator does not exist in closed form, and iterative methods that require very good starting points are needed. In this article, some techniques for finding closed-form estimators in this situation are presented and extended. An extensive simulation study is carried out to investigate and compare the performance of these techniques. The results show that some of them are highly efficient compared with the maximum likelihood estimator.


Two New Unbiased Point Estimates Of A Population Variance, Matthew E. Elam May 2006

Journal of Modern Applied Statistical Methods

Two new unbiased point estimates of an unknown population variance are introduced. They are compared to three known estimates using the mean-square error (MSE). A computer program, which is available for download at http://program.20m.com, is developed for performing calculations for the estimates.
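
The article's two new estimators are not reproduced here; as a hedged background sketch of the kind of MSE comparison described, the familiar unbiased estimator s^2 (divisor n-1) is compared by Monte Carlo with the n-divisor estimator for normal samples.

import numpy as np

rng = np.random.default_rng(3)
n, sigma2, reps = 10, 4.0, 50000
mse_unbiased = mse_ndiv = 0.0
for _ in range(reps):
    x = rng.normal(0.0, np.sqrt(sigma2), n)
    s2 = np.var(x, ddof=1)        # unbiased: divides by n-1
    s2_n = np.var(x, ddof=0)      # biased: divides by n
    mse_unbiased += (s2 - sigma2) ** 2
    mse_ndiv += (s2_n - sigma2) ** 2
print("MSE, divisor n-1:", mse_unbiased / reps)
print("MSE, divisor n  :", mse_ndiv / reps)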