Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 7 of 7

Full-Text Articles in Statistics and Probability

Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Multiple Testing With An Empirical Alternative Hypothesis, James E. Signorovitch Nov 2006

Harvard University Biostatistics Working Paper Series

An optimal multiple testing procedure is identified for linear hypotheses under the general linear model, maximizing the expected number of false null hypotheses rejected at any significance level. The optimal procedure depends on the unknown data-generating distribution, but can be consistently estimated. Drawing information together across many hypotheses, the estimated optimal procedure provides an empirical alternative hypothesis by adapting to underlying patterns of departure from the null. Proposed multiple testing procedures based on the empirical alternative are evaluated through simulations and an application to gene expression microarray data. Compared to a standard multiple testing procedure, it is not unusual for …


Bayesian Hidden Markov Modeling Of Array CGH Data, Subharup Guha, Yi Li, Donna Neuberg Oct 2006

Harvard University Biostatistics Working Paper Series

Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for …


Spatial Cluster Detection For Censored Outcome Data, Andrea J. Cook, Diane Gold, Yi Li Sep 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Predicting Future Responses Based On Possibly Misspecified Working Models, Tianxi Cai, Lu Tian, Scott D. Solomon, L.J. Wei Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Evaluating Prediction Rules For T-Year Survivors With Censored Regression Models, Hajime Uno, Tianxi Cai, Lu Tian, L.J. Wei Mar 2006

Harvard University Biostatistics Working Paper Series

Suppose that we are interested in establishing simple, but reliable rules for predicting future t-year survivors via censored regression models. In this article, we present inference procedures for evaluating such binary classification rules based on various prediction precision measures quantified by the overall misclassification rate, sensitivity and specificity, and positive and negative predictive values. Specifically, under various working models we derive consistent estimators for the above measures via substitution and cross validation estimation procedures. Furthermore, we provide large sample approximations to the distributions of these nonsmooth estimators without assuming that the working model is correctly specified. Confidence intervals, for example, …
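The prediction precision measures named in this abstract can be illustrated with a minimal sketch. It assumes fully observed binary outcomes (1 = survived past t years, 0 = did not) and so ignores censoring entirely; it is not the authors' estimation procedure, and the function name `classification_measures` is hypothetical.

```python
from typing import Dict, Sequence


def classification_measures(
    predicted: Sequence[int], observed: Sequence[int]
) -> Dict[str, float]:
    """Empirical accuracy measures for a binary prediction rule.

    Assumption: every outcome is fully observed (no censoring), which is
    a simplification of the setting described in the abstract.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")

    # Count the four cells of the 2x2 confusion table.
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p == 1 and o == 1)
    tn = sum(1 for p, o in pairs if p == 0 and o == 0)
    fp = sum(1 for p, o in pairs if p == 1 and o == 0)
    fn = sum(1 for p, o in pairs if p == 0 and o == 1)
    n = tp + tn + fp + fn

    def ratio(num: int, den: int) -> float:
        # Return NaN when a measure is undefined (empty denominator).
        return num / den if den else float("nan")

    return {
        "misclassification_rate": ratio(fp + fn, n),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "positive_predictive_value": ratio(tp, tp + fp),
        "negative_predictive_value": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    predicted = [1, 1, 0, 0, 1, 0]
    observed = [1, 0, 0, 0, 1, 1]
    for name, value in classification_measures(predicted, observed).items():
        print(f"{name}: {value:.3f}")
```

With censored follow-up, some entries of `observed` would be unknown, so these plain proportions would no longer apply directly.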


Regression Analysis For The Partial Area Under The ROC Curve, Tianxi Cai, Lori E. Dodd Feb 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.