Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Full-Text Articles in Entire DC Network

Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

The University of Michigan Department of Biostatistics Working Paper Series

SUMMARY. We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model …


Analysis Of Case-Control Age-At-Onset Data Using A Modified Case-Cohort Method, Bin Nan, Xihong Lin Nov 2006

The University of Michigan Department of Biostatistics Working Paper Series

Case-control designs are widely used in rare disease studies. In a typical case-control study, data are collected from a sample of all available subjects who have experienced a disease (cases) and a sub-sample of subjects who have not experienced the disease (controls) in a study cohort. Cases are often oversampled in case-control studies. Logistic regression is a common tool to estimate the relative risks of the disease and a set of covariates. Very often in such a study, information on ages-at-onset of the disease for all cases and ages at survey of controls is known. Standard logistic regression analysis using …


Doubly Penalized Buckley-James Method For Survival Data With High-Dimensional Covariates, Sijian Wang, Bin Nan, Ji Zhu, David G. Beer Nov 2006

The University of Michigan Department of Biostatistics Working Paper Series

Recent interest in cancer research focuses on predicting patients' survival by investigating gene expression profiles based on microarray analysis. We propose a doubly penalized Buckley-James method for the semiparametric accelerated failure time model to relate high-dimensional genomic data to censored survival outcomes, which uses a mixture of L1-norm and L2-norm penalties. Similar to the elastic-net method for linear regression model with uncensored data, the proposed method performs automatic gene selection and parameter estimation, where highly correlated genes are able to be selected (or removed) together. The two-dimensional tuning parameter is determined by cross-validation and uniform design. …
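
To make the "mixture of L1-norm and L2-norm penalties" concrete, here is a minimal sketch of an elastic-net-style penalty on a coefficient vector. The function name `elastic_net_penalty` and the tuning parameters `lam1` and `lam2` are hypothetical, and the parameterization is an assumption; the excerpt does not give the paper's exact form or its Buckley-James imputation step.

```python
def elastic_net_penalty(beta, lam1, lam2):
    """Return lam1 * sum(|b|) + lam2 * sum(b^2) for coefficients beta.

    Assumed parameterization, for illustration only.
    """
    l1_term = sum(abs(b) for b in beta)   # L1 norm: encourages sparsity
    l2_term = sum(b * b for b in beta)    # squared L2 norm: groups correlated predictors
    return lam1 * l1_term + lam2 * l2_term


penalty = elastic_net_penalty([1.0, -2.0], lam1=0.5, lam2=0.25)
```

The pair (`lam1`, `lam2`) plays the role of the two-dimensional tuning parameter that the abstract says is chosen by cross-validation.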


A Note On Bias Due To Fitting Prospective Multivariate Generalized Linear Models To Categorical Outcomes Ignoring Retrospective Sampling Schemes, Bhramar Mukherjee, Ivy Liu Nov 2006

The University of Michigan Department of Biostatistics Working Paper Series

Outcome dependent sampling designs are commonly used in economics, market research and epidemiological studies. Case-control sampling design is a classic example of outcome dependent sampling, where exposure information is collected on subjects conditional on their disease status. In many situations, the outcome under consideration may have multiple categories instead of a simple dichotomization. For example, in a case-control study, there may be disease sub-classification among the “cases” based on progression of the disease, or in terms of other histological and morphological characteristics of the disease. In this note, we investigate the issue of fitting prospective multivariate generalized linear models to …


Exploiting Gene-Environment Independence For Analysis Of Case-Control Studies: An Empirical Bayes Approach To Trade Off Between Bias And Efficiency, Bhramar Mukherjee, Nilanjan Chatterjee Nov 2006

The University of Michigan Department of Biostatistics Working Paper Series

Standard prospective logistic regression analysis of case-control data often leads to very imprecise estimates of gene-environment interactions due to small numbers of cases or controls in cells of crossing genotype and exposure. In contrast, under the assumption of gene-environment independence, modern “retrospective” methods, including the “case-only” approach, can estimate the interaction parameters much more precisely, but they can be seriously biased when the underlying assumption of gene-environment independence is violated. In this article, we propose a novel approach to analyze case-control data that can relax the gene-environment independence assumption using an empirical Bayes framework. In the special case, involving a …
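
The contrast between the two estimators can be sketched for a binary genotype and a binary exposure. The code below is an illustration under stated assumptions, not the paper's empirical Bayes estimator: it compares the standard case-control interaction odds ratio with the case-only odds ratio, which estimates the same quantity when genotype and exposure are independent in the source population and the disease is rare. The function names and the `table[g][e]` count layout are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a * d) / (b * c) for a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def ge_odds_ratio(table):
    """Genotype-exposure odds ratio; table[g][e] is the count for g, e in {0, 1}."""
    return odds_ratio(table[1][1], table[1][0], table[0][1], table[0][0])


def case_only_interaction(cases):
    """Case-only estimate: the G-E odds ratio among cases alone."""
    return ge_odds_ratio(cases)


def case_control_interaction(cases, controls):
    """Standard estimate: G-E odds ratio in cases divided by that in controls."""
    return ge_odds_ratio(cases) / ge_odds_ratio(controls)


# Made-up counts, chosen so that G and E are exactly independent in controls.
cases = [[40, 20], [10, 20]]
controls = [[50, 25], [10, 5]]
```

With these made-up counts the two estimates coincide because the controls show no genotype-exposure association; when that independence fails, the case-only estimate is biased, which is the trade-off the abstract describes.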


Simultaneously Optimizing Dose And Schedule Of A New Cytotoxic Agent, Thomas M. Braun, Peter F. Thall, Hoang Nguyen, Marcos De Lima Aug 2006

The University of Michigan Department of Biostatistics Working Paper Series

Traditionally, phase I clinical trial designs determine a maximum tolerated dose of an experimental cytotoxic agent based on a fixed schedule, usually one course consisting of multiple administrations, while varying the dose per administration between patients. However, in actual medical practice patients often receive several courses of treatment, and some patients may receive one or more dose reductions due to low-grade (non-dose limiting) toxicity in previous courses. As a result, the overall risk of toxicity for each patient is a function of both the schedule and the dose used at each administration. We propose a new paradigm for Phase I …


Generalized Monotonic Functional Mixed Models With Application To Modeling Normal Tissue Complications, Matthew Schipper, Jeremy Taylor, Xihong Lin Aug 2006

The University of Michigan Department of Biostatistics Working Paper Series

Normal tissue complications are a common side effect of radiation therapy. They are the consequence of the dose of radiation received by the normal tissue surrounding the tumor site. It is not known what function of the dose distribution to the normal tissue drives the presence and severity of the complications. Regarding the density of the dose distribution as a curve, a summary measure is obtained by integrating a weighting function of dose (w(d)) over the dose density. For biological reasons the weight function should be monotonic. We propose to study the dose effect on a clinical outcome using a …
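
The summary measure described above, a weighting function w(d) integrated over the dose density, can be approximated numerically. The sketch below uses the trapezoidal rule on a grid of dose values; the function name `dose_summary`, the grid, and the example weight function are all hypothetical, and the paper's own estimation of w(d) is not shown in this excerpt.

```python
def dose_summary(doses, density, weight):
    """Approximate the integral of weight(d) * density(d) over the dose grid.

    doses   -- increasing grid of dose values
    density -- dose density evaluated at each grid point
    weight  -- callable w(d); should be monotonic per the abstract
    """
    total = 0.0
    for i in range(1, len(doses)):
        left = weight(doses[i - 1]) * density[i - 1]
        right = weight(doses[i]) * density[i]
        # Trapezoid area on the interval [doses[i-1], doses[i]]
        total += 0.5 * (left + right) * (doses[i] - doses[i - 1])
    return total


# Uniform dose density on [0, 2] with the monotonic weight w(d) = d.
summary = dose_summary([0.0, 1.0, 2.0], [0.5, 0.5, 0.5], lambda d: d)
```

With the identity weight the summary reduces to the mean dose, which makes it a convenient sanity check before trying other monotonic weight functions.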


Permutation Methods In Relative Risk Regression Models, Wenyu Jiang, Jack Kalbfleisch Jul 2006

The University of Michigan Department of Biostatistics Working Paper Series

In this paper, we develop a weighted permutation (WP) method to construct confidence intervals for regression parameters in relative risk regression models. The WP method is a generalized permutation approach. It constructs a resampled history which mimics the observed history for individuals under study. Inference procedures are based on studentized score statistics that are insensitive to the forms of the relative risk function. This makes the WP method appealing in the general framework of the relative risk regression model. First order accuracy of the WP method is established using the counting process approach with a partial likelihood filtration. A simulation …


Multiple Imputation In The Presence Of Outliers, Michael Elliott May 2006

The University of Michigan Department of Biostatistics Working Paper Series

We consider the problem of obtaining population-based inference in the presence of missing data and outliers in the context of estimating obesity prevalence and body-mass index (BMI) measures from the Healthy For Life Study. Identifying multiple outliers in a multivariate setting is problematic because of problems such as masking, in which groups of outliers inflate the covariance matrix in a fashion that prevents their identification when included, and swamping, in which outliers skew covariances in a fashion that makes non-outlying observations appear to be outliers. We develop a latent class model that assumes each observation belongs to one of $K$ …


Combining Information From Two Surveys To Estimate County-Level Prevalence Rates Of Cancer Risk Factors And Screening, Trivellore E. Raghunathan, Dawei Xie, Nathaniel Schenker, Van Parsons, William W. Davis, Kevin W. Dodd, Eric J. Feuer May 2006

The University of Michigan Department of Biostatistics Working Paper Series

Cancer surveillance requires estimates of the prevalence of cancer risk factors and screening for small areas such as counties. Two popular data sources are the Behavioral Risk Factor Surveillance System (BRFSS), a telephone survey conducted by state agencies, and the National Health Interview Survey (NHIS), an area probability sample survey conducted through face-to-face interviews. Both data sources have advantages and disadvantages. The BRFSS is a larger survey, and almost every county is included in the survey; but it has lower response rates as is typical with telephone surveys, and it does not include subjects who live in households with no …


Detecting Pulsatile Hormone Secretion Events: A Bayesian Approach, Tim Johnson Mar 2006

The University of Michigan Department of Biostatistics Working Paper Series

Many challenges arise in the analysis of pulsatile, or episodic, hormone concentration time series data. Among these challenges is the determination of the number and location of pulsatile events and the discrimination of events from noise. Analyses of these data are typically performed in two stages. In the first stage, the number and approximate location of the pulses are determined. In the second stage, a model (typically a deconvolution model) is fit to the data conditional on the number of pulses. Any error made in the first stage is carried over to the second stage. Furthermore, current methods, except two, …


Semiparametric Analysis For Correlated Recurrent And Terminal Events, Yining Ye, Jack Kalbfleisch, Doug E. Schaubel Mar 2006

The University of Michigan Department of Biostatistics Working Paper Series

In clinical and observational studies, recurrent event data (e.g. hospitalization) with a terminal event (e.g. death) are often encountered. In many instances, the terminal event is strongly correlated with the recurrent event process. In this article, we propose a semiparametric method to jointly model the recurrent and terminal event processes. The dependence is modeled by a shared gamma frailty that is included in both the recurrent event rate and terminal event hazard function. Marginal models are used to estimate the regression effects on the terminal and recurrent event processes and a Poisson model is used to estimate the dispersion of …