Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
Survival Point Estimate Prediction In Matched And Non-Matched Case-Control Subsample Designed Studies, Annette M. Molinaro, Mark J. Van Der Laan, Dan H. Moore, Karla Kerlikowske
U.C. Berkeley Division of Biostatistics Working Paper Series
Providing information about the risk of disease and clinical factors that may increase or decrease a patient's risk of disease is standard medical practice. Although case-control studies can provide evidence of strong associations between diseases and risk factors, clinicians need to be able to communicate to patients the age-specific risks of disease over a defined time interval for a set of risk factors.
An estimate of absolute risk cannot be determined from case-control studies because cases are generally chosen from a population whose size is not known (necessary for calculation of absolute risk) and where duration of follow-up is not …
An Exploration Of Using Data Mining In Educational Research, Yonghong Jade Xu
Journal of Modern Applied Statistical Methods
Advances in technology have popularized large databases in education. Traditional statistical methods have limitations for analyzing large quantities of data. This article discusses data mining by analyzing a data set with three models: multiple regression, data mining, and a combination of the two. It concludes that data mining is applicable in educational research.
Survival Ensembles, Torsten Hothorn, Peter Buhlmann, Sandrine Dudoit, Annette M. Molinaro, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
We propose a unified and flexible framework for ensemble learning in the presence of censoring. For right-censored data, we introduce a random forest algorithm and a generic gradient boosting algorithm for the construction of prognostic models. The methodology is utilized for predicting the survival time of patients suffering from acute myeloid leukemia based on clinical and genetic covariates. Furthermore, we compare the diagnostic capabilities of the proposed censored data random forest and boosting methods applied to the recurrence free survival time of node positive breast cancer patients with previously published findings.
Standardizing Markers To Evaluate And Compare Their Performances, Margaret S. Pepe, Gary M. Longton
UW Biostatistics Working Paper Series
Introduction: Markers that purport to distinguish subjects with a condition from those without a condition must be evaluated rigorously for their classification accuracy. A single approach to statistically evaluating and comparing markers is not yet established.
Methods: We suggest a standardization that uses the marker distribution in unaffected subjects as a reference. For an affected subject with marker value Y, the standardized placement value is the proportion of unaffected subjects with marker values that exceed Y.
Results: We apply the standardization to two illustrative datasets. In patients with pancreatic cancer, placement values calculated for the CA 19-9 marker are smaller …
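The placement value defined in the Methods paragraph above can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `placement_values` and the marker values are hypothetical, chosen only for illustration, and ties are handled with a strict "exceeds" comparison, which is an assumption based on the wording of the abstract.

```python
from bisect import bisect_right


def placement_values(unaffected, affected):
    """For each affected subject's marker value Y, return the proportion
    of unaffected subjects whose marker value strictly exceeds Y.

    Assumption: ties do not count as exceeding Y.
    """
    reference = sorted(unaffected)
    n = len(reference)
    if n == 0:
        raise ValueError("need at least one unaffected subject")
    # bisect_right counts reference values <= y, so n minus that count
    # is the number of reference values strictly greater than y.
    return [(n - bisect_right(reference, y)) / n for y in affected]


# Hypothetical toy data, not taken from the paper.
unaffected = [5.0, 8.0, 12.0, 20.0, 35.0]
affected = [10.0, 40.0, 20.0]
print(placement_values(unaffected, affected))  # [0.6, 0.0, 0.2]
```

Under this definition, a small placement value means the affected subject's marker lies high in the unaffected distribution, so few unaffected subjects exceed it.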