Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Survival Point Estimate Prediction In Matched And Non-Matched Case-Control Subsample Designed Studies, Annette M. Molinaro, Mark J. Van Der Laan, Dan H. Moore, Karla Kerlikowske Aug 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Providing information about the risk of disease and clinical factors that may increase or decrease a patient's risk of disease is standard medical practice. Although case-control studies can provide evidence of strong associations between diseases and risk factors, clinicians need to be able to communicate to patients the age-specific risks of disease over a defined time interval for a set of risk factors.

An estimate of absolute risk cannot be determined from case-control studies because cases are generally chosen from a population whose size is not known (necessary for calculation of absolute risk) and where duration of follow-up is not …


An Exploration Of Using Data Mining In Educational Research, Yonghong Jade Xu May 2005

Journal of Modern Applied Statistical Methods

Advances in technology have made large databases common in education, and traditional statistical methods have limitations when analyzing such large quantities of data. This article explores data mining by analyzing a data set with three models: multiple regression, data mining, and a combination of the two. It concludes that data mining is applicable to educational research.


Survival Ensembles, Torsten Hothorn, Peter Buhlmann, Sandrine Dudoit, Annette M. Molinaro, Mark J. Van Der Laan Apr 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a unified and flexible framework for ensemble learning in the presence of censoring. For right-censored data, we introduce a random forest algorithm and a generic gradient boosting algorithm for the construction of prognostic models. The methodology is utilized for predicting the survival time of patients suffering from acute myeloid leukemia based on clinical and genetic covariates. Furthermore, we compare the diagnostic capabilities of the proposed censored data random forest and boosting methods applied to the recurrence free survival time of node positive breast cancer patients with previously published findings.


Standardizing Markers To Evaluate And Compare Their Performances, Margaret S. Pepe, Gary M. Longton Jan 2005

UW Biostatistics Working Paper Series

Introduction: Markers that purport to distinguish subjects with a condition from those without a condition must be evaluated rigorously for their classification accuracy. A single approach to statistically evaluating and comparing markers is not yet established.

Methods: We suggest a standardization that uses the marker distribution in unaffected subjects as a reference. For an affected subject with marker value Y, the standardized placement value is the proportion of unaffected subjects with marker values that exceed Y.
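The placement value defined in the Methods paragraph can be illustrated with a minimal sketch. The function name `placement_values` and the sample marker values are hypothetical, and reading "exceed" as a strict inequality is an assumption; the abstract does not say how ties are handled.

```python
from bisect import bisect_right


def placement_values(unaffected, affected):
    """Return, for each affected marker value y, the proportion of
    unaffected marker values strictly greater than y.

    Assumption: "exceed" is read as strict inequality (ties do not count).
    """
    reference = sorted(unaffected)
    n = len(reference)
    if n == 0:
        raise ValueError("need at least one unaffected subject")
    # bisect_right gives the count of reference values <= y,
    # so n minus that count is the number of values > y.
    return [(n - bisect_right(reference, y)) / n for y in affected]


# Hypothetical marker values, for illustration only.
unaffected = [1.0, 2.0, 3.0, 4.0]
affected = [2.5, 5.0, 0.5]
print(placement_values(unaffected, affected))  # [0.5, 0.0, 1.0]
```

Under this reading, an affected subject whose marker is higher than every unaffected value gets a placement value of 0, and one whose marker is lower than every unaffected value gets 1.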

Results: We apply the standardization to two illustrative datasets. In patients with pancreatic cancer, placement values calculated for the CA 19-9 marker are smaller …