- Keyword
- Prediction (2)
- CART (1)
- Censored data (1)
- Classification (1)
- Classifier performance (1)
- Comparative genomic hybridization (1)
- Cross-validation (1)
- Density estimation (1)
- Diagnostic test (1)
- Disease screening (1)
- Loss function (1)
- Mixture model (1)
- Model selection (1)
- Multivariate outcome (1)
- Nested unsupervised analysis (1)
- Regression trees (1)
- Risk estimation (1)
- Survival analysis (1)
Full-Text Articles in Statistical Models
A Nested Unsupervised Approach To Identifying Novel Molecular Subtypes, Elizabeth Garrett, Giovanni Parmigiani
Johns Hopkins University, Dept. of Biostatistics Working Papers
In classification problems arising in genomics research it is common to study populations for which a broad class assignment is known (say, normal versus diseased) and one seeks to find undiscovered subclasses within one or both of the known classes. Formally, this problem can be thought of as an unsupervised analysis nested within a supervised one. Here we take the view that the nested unsupervised analysis can successfully utilize information from the entire data set for constructing and/or selecting useful predictors. Specifically, we propose a mixture model approach to the nested unsupervised problem, where the supervised information is used to …
Tree-Based Multivariate Regression And Density Estimation With Right-Censored Data, Annette M. Molinaro, Sandrine Dudoit, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
We propose a unified strategy for estimator construction, selection, and performance assessment in the presence of censoring. This approach is entirely driven by the choice of a loss function for the full (uncensored) data structure and can be stated in terms of the following three main steps. (1) Define the parameter of interest as the minimizer of the expected loss, or risk, for a full data loss function chosen to represent the desired measure of performance. Map the full data loss function into an observed (censored) data loss function having the same expected value and leading to an efficient estimator …
Semi-Parametric Regression For The Area Under The Receiver Operating Characteristic Curve, Lori E. Dodd, Margaret S. Pepe
UW Biostatistics Working Paper Series
Medical advances continue to provide new and potentially better means for detecting disease. Such is true in cancer, for example, where biomarkers are sought for early detection and where improvements in imaging methods may pick up the initial functional and molecular changes associated with cancer development. In other binary classification tasks, computational algorithms such as Neural Networks, Support Vector Machines and Evolutionary Algorithms have been applied to areas as diverse as credit scoring, object recognition, and peptide-binding prediction. Before a classifier becomes an accepted technology, it must undergo rigorous evaluation to determine its ability to discriminate between states. Characterization of …