Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Full-Text Articles in Statistical Models

Statistical Inference For Data Adaptive Target Parameters, Mark J. Van Der Laan, Alan E. Hubbard, Sara Kherad Pajouh Jun 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target we partition the sample into V equal-size subsamples, and use this partitioning to define V splits into an estimation sample (one of the V subsamples) and a corresponding complementary parameter-generating sample that is used to generate a target parameter. For each of the V parameter-generating samples, we apply an algorithm that maps the sample to a target parameter mapping which represents the statistical target parameter generated by that parameter-generating …
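The sample-splitting step described in this abstract can be illustrated with a short sketch. It only builds the V splits; the function name `v_fold_splits` and the interleaved fold assignment are assumptions made for illustration and are not taken from the paper. The subsamples are exactly equal in size only when n is divisible by V.

```python
def v_fold_splits(n, v):
    """Yield (estimation, parameter_generating) index lists for V-fold splitting.

    Hypothetical helper: fold k receives indices k, k+v, k+2v, ...
    """
    folds = [list(range(k, n, v)) for k in range(v)]
    for k in range(v):
        # One subsample serves as the estimation sample ...
        estimation = folds[k]
        # ... and its complement is the parameter-generating sample.
        parameter_generating = [i for j in range(v) if j != k for i in folds[j]]
        yield estimation, parameter_generating


splits = list(v_fold_splits(n=12, v=3))
for estimation, parameter_generating in splits:
    print(estimation, parameter_generating)
```

In an analysis along these lines, a data-adaptive algorithm would be applied to each parameter-generating sample, and the resulting target parameter would then be estimated on the matching estimation sample.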


Targeted Maximum Likelihood Estimation For Dynamic And Static Longitudinal Marginal Structural Working Models, Maya L. Petersen, Joshua Schwab, Susan Gruber, Nello Blaser, Michael Schomaker, Mark J. Van Der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

This paper describes a targeted maximum likelihood estimator (TMLE) for the parameters of longitudinal static and dynamic marginal structural models. We consider a longitudinal data structure consisting of baseline covariates, time-dependent intervention nodes, intermediate time-dependent covariates, and a possibly time dependent outcome. The intervention nodes at each time point can include a binary treatment as well as a right-censoring indicator. Given a class of dynamic or static interventions, a marginal structural model is used to model the mean of the intervention specific counterfactual outcome as a function of the intervention, time point, and possibly a subset of baseline covariates. Because …


Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. Van Der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …


Threshold Regression Models Adapted To Case-Control Studies, And The Risk Of Lung Cancer Due To Occupational Exposure To Asbestos In France, Antoine Chambaz, Dominique Choudat, Catherine Huber, Jean-Claude Pairon, Mark J. Van Der Laan Mar 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Asbestos has been known for many years as a powerful carcinogen. Our purpose is to quantify the relationship between an occupational exposure to asbestos and an increase of the risk of lung cancer. Furthermore, we wish to tackle the very delicate question of the evaluation, in subjects suffering from a lung cancer, of how much the amount of exposure to asbestos explains the occurrence of the cancer. For this purpose, we rely on a recent French case-control study. We build a large collection of threshold regression models, data-adaptively select a better model from it by multi-fold likelihood-based cross-validation, then fit the …


Causal Inference In Epidemiological Studies With Strong Confounding, Kelly L. Moore, Romain S. Neugebauer, Mark J. Van Der Laan, Ira B. Tager Oct 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

One of the identifiability assumptions of causal effects defined by marginal structural model (MSM) parameters is the experimental treatment assignment (ETA) assumption. Practical violations of this assumption frequently occur in data analysis, when certain exposures are rarely observed within some strata of the population. The inverse probability of treatment weighted (IPTW) estimator is particularly sensitive to violations of this assumption; however, we demonstrate that this is a problem for all estimators of causal effects. This is due to the fact that the ETA assumption is about information (or lack thereof) in the data. A new class of causal models, causal …


Confidence Intervals For Negative Binomial Random Variables Of High Dispersion, David Shilane, Alan E. Hubbard, S N. Evans Aug 2008

U.C. Berkeley Division of Biostatistics Working Paper Series

This paper considers the problem of constructing confidence intervals for the mean of a Negative Binomial random variable based upon sampled data. When the sample size is large, we traditionally rely upon a Normal distribution approximation to construct these intervals. However, we demonstrate that the sample mean of highly dispersed Negative Binomials exhibits a slow convergence to the Normal in distribution as a function of the sample size. As a result, standard techniques (such as the Normal approximation and bootstrap) that construct confidence intervals for the mean will typically be too narrow and significantly undercover in the case of high …


Supervised Distance Matrices: Theory And Applications To Genomics, Katherine S. Pollard, Mark J. Van Der Laan Jun 2008

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a new approach to studying the relationship between a very high dimensional random variable and an outcome. Our method is based on a novel concept, the supervised distance matrix, which quantifies pairwise similarity between variables based on their association with the outcome. A supervised distance matrix is derived in two stages. The first stage involves a transformation based on a particular model for association. In particular, one might regress the outcome on each variable and then use the residuals or the influence curve from each regression as a data transformation. In the second stage, a choice of distance …


Confidence Intervals For The Population Mean Tailored To Small Sample Sizes, With Applications To Survey Sampling, Michael Rosenblum, Mark J. Van Der Laan Jun 2008

U.C. Berkeley Division of Biostatistics Working Paper Series

The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes.

We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability …
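As a rough illustration of the idea, the sketch below inverts a two-sided Bernstein tail bound to obtain a confidence interval for the mean of observations known to lie in [lower, upper]. The function names, the default use of the worst-case variance bound (upper - lower)^2 / 4, and the clipping of the interval to the known range are assumptions made here for illustration; the paper's exact construction may differ.

```python
import math


def bernstein_half_width(n, lower, upper, alpha=0.05, variance_bound=None):
    """Half-width t solving 2*exp(-n t^2 / (2 s2 + 2 m t / 3)) = alpha.

    m = upper - lower bounds |X - mean|; s2 is an upper bound on the variance.
    """
    m = upper - lower
    if variance_bound is None:
        # Assumption: worst-case variance of a variable supported on [lower, upper].
        variance_bound = m * m / 4.0
    log_term = math.log(2.0 / alpha)
    c = m * log_term / (3.0 * n)
    # Positive root of n t^2 - (2 m L / 3) t - 2 s2 L = 0, with L = log(2/alpha).
    return c + math.sqrt(c * c + 2.0 * variance_bound * log_term / n)


def bernstein_interval(sample, lower, upper, alpha=0.05, variance_bound=None):
    """Confidence interval for the mean, clipped to the known range."""
    n = len(sample)
    mean = sum(sample) / n
    t = bernstein_half_width(n, lower, upper, alpha, variance_bound)
    return max(lower, mean - t), min(upper, mean + t)


sample = [0.2, 0.4, 0.6, 0.8] * 25  # illustrative data in [0, 1]
ci_low, ci_high = bernstein_interval(sample, lower=0.0, upper=1.0)
print(ci_low, ci_high)
```

Because the tail bound holds for every n, an interval built this way does not rely on the central limit theorem, at the price of being wider than the usual Normal-approximation interval.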


Using Regression Models To Analyze Randomized Trials: Asymptotically Valid Hypothesis Tests Despite Incorrectly Specified Models, Michael Rosenblum, Mark J. Van Der Laan Jan 2008

U.C. Berkeley Division of Biostatistics Working Paper Series

Regression models are often used to test for cause-effect relationships from data collected in randomized trials or experiments. This practice has deservedly come under heavy scrutiny, since commonly used models such as linear and logistic regression will often not capture the actual relationships between variables, and incorrectly specified models potentially lead to incorrect conclusions. In this paper, we focus on hypothesis tests of whether the treatment given in a randomized trial has any effect on the mean of the primary outcome, within strata of baseline variables such as age, sex, and health status. Our primary concern is ensuring that such …


Super Learner, Mark J. Van Der Laan, Eric C. Polley, Alan E. Hubbard Jul 2007

U.C. Berkeley Division of Biostatistics Working Paper Series

Previous articles (van der Laan and Dudoit (2003); van der Laan et al. (2006); Sinisi et al. (2007)) advertised and theoretically validated the use of cross-validation to select among many candidate estimators to compute a so-called super learner which outperforms any of the given candidate estimators. The theoretical basis was provided for this super learner based on oracle results for the cross-validation selector (e.g., van der Laan and Dudoit (2003); van der Laan et al. (2006)) and in Sinisi et al. (2007). In addition, these papers contained a practical demonstration of the adaptivity of this so-called super learner …


Super Learning: An Application To Prediction Of HIV-1 Drug Susceptibility, Sandra E. Sinisi, Maya L. Petersen, Mark J. Van Der Laan Apr 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Many statistical methods exist that can be used to learn a predictor based on observed data. Examples include decision trees, neural networks, support vector regression, least angle regression, Logic Regression, and the Deletion/Substitution/Addition algorithm. The optimal algorithm for prediction will vary depending on the underlying data-generating distribution. In this article, we introduce a "super learner," a prediction algorithm that applies any set of candidate learners and uses cross-validation to select among them. Theory shows that asymptotically the super learner performs essentially as well as or better than any of the candidate learners. We briefly present the theory behind the super learner, …
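The cross-validation selection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two toy candidate learners (`fit_mean`, `fit_line`), the squared-error loss, the interleaved fold assignment, and all function names are assumptions chosen to keep the example self-contained.

```python
def fit_mean(xs, ys):
    """Candidate 1 (illustrative): always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2 (illustrative): simple least-squares line."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: ybar + slope * (x - xbar)


def cv_risk(learner, xs, ys, v=5):
    """V-fold cross-validated mean squared error of one candidate learner."""
    n = len(xs)
    total = 0.0
    for fold in range(v):
        train = [i for i in range(n) if i % v != fold]
        valid = [i for i in range(n) if i % v == fold]
        predict = learner([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - predict(xs[i])) ** 2 for i in valid)
    return total / n


def cv_select(learners, xs, ys, v=5):
    """Pick the candidate with the smallest CV risk and refit it on all data."""
    risks = {name: cv_risk(fit, xs, ys, v) for name, fit in learners.items()}
    best = min(risks, key=risks.get)
    return best, risks, learners[best](xs, ys)


xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + 0.5 * (-1) ** i for i, x in enumerate(xs)]  # toy data
best, risks, predictor = cv_select({"mean": fit_mean, "line": fit_line}, xs, ys)
print(best, risks)
```

The selected learner is refit on the full sample before use, so the validation folds are used only for choosing among candidates.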


Causal Effect Models For Intention To Treat And Realistic Individualized Treatment Rules, Mark J. Van Der Laan Mar 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

An important class of models in causal inference are the so-called marginal structural models, which model the comparison between counterfactual outcome distributions corresponding with a static treatment intervention, conditional on user-supplied baseline covariates, based on observing a longitudinal data structure on a sample of n independent and identically distributed experimental units. Identification of a static treatment regimen specific outcome distribution based on observational data requires, beyond the so-called sequential randomization assumption, that each experimental unit has positive probability of following the static treatment regimen. The latter assumption is called the experimental treatment assignment assumption (ETA) (which is parameter specific). …


Multiple Tests Of Association With Biological Annotation Metadata, Sandrine Dudoit, Sunduz Keles, Mark J. Van Der Laan Mar 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a general and formal statistical framework for the multiple tests of associations between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known fixed gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating genome-wide transcript levels or DNA copy numbers to possibly censored biological and …


A Fine-Scale Linkage Disequilibrium Measure Based On Length Of Haplotype Sharing, Yan Wang, Lue Ping Zhao, Sandrine Dudoit Oct 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

High-throughput genotyping technologies for single nucleotide polymorphisms (SNP) have enabled the recent completion of the International HapMap Project (Phase I), which has stimulated much interest in studying genome-wide linkage disequilibrium (LD) patterns. Conventional LD measures, such as D' and r-square, are two-point measurements, and their relationship with physical distance is highly noisy. We propose a new LD measure, defined in terms of the correlation coefficient for shared haplotype lengths around two loci, thereby borrowing information from multiple loci. A U-statistic-based estimator of the new LD measure, which takes into consideration the dependence structure of the observed data, is developed and …


Population Intervention Models In Causal Inference, Alan E. Hubbard, Mark J. Van Der Laan Oct 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment variable or risk variable on the distribution of a disease in a population. These models, as originally introduced by Robins (e.g., Robins (2000a), Robins (2000b), van der Laan and Robins (2002)), model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates, and its dependence on treatment. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at …


Cross-Validated Bagged Prediction Of Survival, Sandra E. Sinisi, Romain Neugebauer, Mark J. Van Der Laan Sep 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

In this article, we show how to apply our previously proposed Deletion/Substitution/Addition algorithm in the context of right-censoring for the prediction of survival. Furthermore, we introduce how to incorporate bagging into the algorithm to obtain a cross-validated bagged estimator. The method is used for predicting the survival time of patients with diffuse large B-cell lymphoma based on gene expression variables.


Direct Effect Models, Mark J. Van Der Laan, Maya L. Petersen Aug 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

The causal effect of a treatment on an outcome is generally mediated by several intermediate variables. Estimation of the component of the causal effect of a treatment that is mediated by a given intermediate variable (the indirect effect of the treatment), and the component that is not mediated by that intermediate variable (the direct effect of the treatment) is often relevant to mechanistic understanding and to the design of clinical and public health interventions. Under the assumption of no-unmeasured confounders for treatment and the intermediate variable, Robins & Greenland (1992) define an individual direct effect as the counterfactual effect of …


Survival Point Estimate Prediction In Matched And Non-Matched Case-Control Subsample Designed Studies, Annette M. Molinaro, Mark J. Van Der Laan, Dan H. Moore, Karla Kerlikowske Aug 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Providing information about the risk of disease and clinical factors that may increase or decrease a patient's risk of disease is standard medical practice. Although case-control studies can provide evidence of strong associations between diseases and risk factors, clinicians need to be able to communicate to patients the age-specific risks of disease over a defined time interval for a set of risk factors.

An estimate of absolute risk cannot be determined from case-control studies because cases are generally chosen from a population whose size is not known (necessary for calculation of absolute risk) and where duration of follow-up is not …


Application Of A Multiple Testing Procedure Controlling The Proportion Of False Positives To Protein And Bacterial Data, Merrill D. Birkner, Alan E. Hubbard, Mark J. Van Der Laan Aug 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Simultaneously testing multiple hypotheses is important in high-dimensional biological studies. In these situations, one is often interested in controlling the Type-I error rate, such as the proportion of false positives to total rejections (TPPFP) at a specific level, alpha. This article will present an application of the E-Bayes/Bootstrap TPPFP procedure, presented in van der Laan et al. (2005), which controls the tail probability of the proportion of false positives (TPPFP), on two biological datasets. The two data applications include firstly, the application to a mass-spectrometry dataset of two leukemia subtypes, AML and ALL. The protein data measurements include intensity and …


Cross-Validating And Bagging Partitioning Algorithms With Variable Importance, Annette M. Molinaro, Mark J. Van Der Laan Aug 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

We present a cross-validated bagging scheme in the context of partitioning algorithms. To explore the benefits of the various bagging schemes, we compare via simulations the predictive ability of a single Classification and Regression Tree (CART) with several previously suggested bagging schemes and with our proposed approach. Additionally, a variable importance measure is explained and illustrated.


Test Statistics Null Distributions In Multiple Testing: Simulation Studies And Applications To Genomics, Katherine S. Pollard, Merrill D. Birkner, Mark J. Van Der Laan, Sandrine Dudoit Jul 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying differentially expressed or co-expressed genes in microarray experiments. We have developed generally applicable resampling-based single-step and stepwise multiple testing procedures (MTP) for control of a broad class of Type I error rates, defined as tail probabilities and expected values for arbitrary functions of the numbers of false positives and rejected hypotheses (Dudoit and van der Laan, 2005; Dudoit et al., 2004a,b; Pollard and van der Laan, 2004; van der Laan et al., 2005, 2004a,b). As argued in the early article of Pollard and van der …


Cross-Validated Bagged Learning, Mark J. Van Der Laan, Sandra E. Sinisi, Maya L. Petersen Jun 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Many applications aim to learn a high dimensional parameter of a data generating distribution based on a sample of independent and identically distributed observations. For example, the goal might be to estimate the conditional mean of an outcome given a list of input variables. In this prediction context, Breiman (1996a) introduced bootstrap aggregating (bagging) as a method to reduce the variance of a given estimator at little cost to bias. Bagging involves applying the estimator to multiple bootstrap samples, and averaging the result across bootstrap samples. In order to deal with the curse of dimensionality, typical practice has been to …
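The bagging step described above (apply the estimator to multiple bootstrap samples, then average) can be sketched as follows. The base estimator used here, a one-nearest-neighbour predictor, and the names `fit_nearest_neighbor` and `bagged_predictor` are illustrative assumptions; the sketch shows plain bagging only, not the cross-validated variant the paper goes on to propose.

```python
import random


def fit_nearest_neighbor(xs, ys):
    """Illustrative base estimator: predict the outcome of the closest x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def bagged_predictor(fit, xs, ys, n_boot=50, seed=0):
    """Average the base estimator over n_boot bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(xs)
    fits = []
    for _ in range(n_boot):
        # Bootstrap sample: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        fits.append(fit([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(f(x) for f in fits) / len(fits)


xs = list(range(10))
ys = [x % 3 for x in xs]  # toy outcomes
bagged = bagged_predictor(fit_nearest_neighbor, xs, ys)
print(bagged(4.2))
```

Averaging over resamples smooths the jumpy nearest-neighbour fit, which is the variance reduction that bagging is meant to deliver.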


Causal Inference In Longitudinal Studies With History-Restricted Marginal Structural Models, Romain Neugebauer, Mark J. Van Der Laan, Ira B. Tager Apr 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Causal Inference based on Marginal Structural Models (MSMs) is particularly attractive to subject-matter investigators because MSM parameters provide explicit representations of causal effects. We introduce History-Restricted Marginal Structural Models (HRMSMs) for longitudinal data for the purpose of defining causal parameters which may often be better suited for Public Health research. This new class of MSMs allows investigators to analyze the causal effect of a treatment on an outcome based on a fixed, shorter and user-specified history of exposure compared to MSMs. By default, the latter represents the treatment causal effect of interest based on a treatment history defined by the …


Survival Ensembles, Torsten Hothorn, Peter Buhlmann, Sandrine Dudoit, Annette M. Molinaro, Mark J. Van Der Laan Apr 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a unified and flexible framework for ensemble learning in the presence of censoring. For right-censored data, we introduce a random forest algorithm and a generic gradient boosting algorithm for the construction of prognostic models. The methodology is utilized for predicting the survival time of patients suffering from acute myeloid leukemia based on clinical and genetic covariates. Furthermore, we compare the diagnostic capabilities of the proposed censored data random forest and boosting methods applied to the recurrence free survival time of node positive breast cancer patients with previously published findings.


GLLAMM Manual, Sophia Rabe-Hesketh, Anders Skrondal, Andrew Pickles Oct 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

This manual describes a Stata program gllamm that can estimate Generalized Linear Latent and Mixed Models (GLLAMMs). GLLAMMs are a class of multilevel latent variable models for (multivariate) responses of mixed type including continuous responses, counts, duration/survival data, dichotomous, ordered and unordered categorical responses and rankings. The latent variables (common factors or random effects) can be assumed to be discrete or to have a multivariate normal distribution. Examples of models in this class are multilevel generalized linear models or generalized linear mixed models, multilevel factor or latent trait models, item response models, latent class models and multilevel structural equation models. …


Data Adaptive Estimation Of The Treatment Specific Mean, Yue Wang, Oliver Bembom, Mark J. Van Der Laan Oct 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

An important problem in epidemiology and medical research is the estimation of the causal effect of a treatment action at a single point in time on the mean of an outcome, possibly within strata of the target population defined by a subset of the baseline covariates. Current approaches to this problem are based on marginal structural models, i.e., parametric models for the marginal distribution of counterfactual outcomes as a function of treatment and effect modifiers. The various estimators developed in this context furthermore each depend on a high-dimensional nuisance parameter whose estimation currently also relies on parametric models. Since misspecification …


History-Adjusted Marginal Structural Models And Statically-Optimal Dynamic Treatment Regimes, Mark J. Van Der Laan, Maya L. Petersen Sep 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment. These models, introduced by Robins, model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at a final time point. However, the utility of these models for some applications has been limited by their inability to incorporate modification of the causal effect of treatment by time-varying covariates. …


Estimating A Survival Distribution With Current Status Data And High-Dimensional Covariates, Mark J. Van Der Laan, Aad Van Der Vaart Sep 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

We consider the inverse problem of estimating a survival distribution when the survival times are only observed to be in one of the intervals of a random bisection of the time axis. We are particularly interested in the case that high-dimensional and/or time-dependent covariates are available, and/or the survival events and censoring times are only conditionally independent given the covariate process. The method of estimation consists of regularizing the survival distribution by taking the primitive function or smoothing, estimating the regularized parameter by using estimating equations, and finally recovering an estimator for the parameter of interest.


Linear Life Expectancy Regression With Censored Data, Ying Qing Chen, Su-Chun Cheng Aug 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

Life expectancy, i.e., the mean residual life function, has been of important practical and scientific interest in characterising the distribution of residual life. Regression models are often needed to model the association between life expectancy and its covariates. In this article, we consider a linear mean residual life model and further develop some inference procedures in the presence of censoring. The new model and proposed inference procedure will be demonstrated by numerical examples and application to the well-known Stanford heart transplant data. Additional semiparametric efficiency calculation and information bound are also considered.


A Note On Empirical Likelihood Inference Of Residual Life Regression, Ying Qing Chen, Yichuan Zhao Jul 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

The mean residual life function, or life expectancy, is an important function for characterizing the distribution of residual life. The proportional mean residual life model by Oakes and Dasu (1990) is a regression tool to study the association between life expectancy and its associated covariates. Although semiparametric inference procedures have been proposed in the literature, the accuracy of such procedures may be low when the censoring proportion is relatively large. In this paper, the semiparametric inference procedures are studied with an empirical likelihood ratio method. An empirical likelihood confidence region is constructed for the regression parameters. The proposed method is further compared …