Open Access. Powered by Scholars. Published by Universities.®
- Keyword
-
- (Quasi)separation (1)
- Classification (1)
- Cross-validation (1)
- Discriminant Analysis (1)
- Expectation-maximization algorithm (1)
-
- Firth's procedure (1)
- Gene expression (1)
- Gene expression data (1)
- High-dimensional data (1)
- Iteratively Reweighted Partial Least Squares (1)
- Knot selection (1)
- Likelihood (1)
- Logistic (1)
- MCMC (1)
- Mixed models (1)
- Mixture models (1)
- Multivariate smoothing (1)
- Receiver Operating Characteristic Curve (1)
- Spatially adaptive penalty (1)
- Thin-plate splines (1)
- Two-stage PLS (1)
Articles 1 - 4 of 4
Full-Text Articles in Statistical Models
Spatially Adaptive Bayesian P-Splines With Heteroscedastic Errors, Ciprian M. Crainiceanu, David Ruppert, Raymond J. Carroll
Spatially Adaptive Bayesian P-Splines With Heteroscedastic Errors, Ciprian M. Crainiceanu, David Ruppert, Raymond J. Carroll
Johns Hopkins University, Dept. of Biostatistics Working Papers
An increasingly popular tool for nonparametric smoothing are penalized splines (P-splines) which use low-rank spline bases to make computations tractable while maintaining accuracy as good as smoothing splines. This paper extends penalized spline methodology by both modeling the variance function nonparametrically and using a spatially adaptive smoothing parameter. These extensions have been studied before, but never together and never in the multivariate case. This combination is needed for satisfactory inference and can be implemented effectively by Bayesian \mbox{MCMC}. The variance process controlling the spatially-adaptive shrinkage of the mean and the variance of the heteroscedastic error process are modeled as log-penalized …
Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh
Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh
The University of Michigan Department of Biostatistics Working Paper Series
One of the benefits of profiling of cancer samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Such subgroups have typically been found in microarray data using hierarchical clustering. A major problem in interpretation of the output is determining the number of clusters. We approach the problem of determining disease subtypes using mixture models. A novel estimation procedure of the parameters in the mixture model is developed based on a combination of random projections and the expectation-maximization algorithm. Because the approach is probabilistic, our approach provides a measure for the number of true clusters …
Combining Predictors For Classification Using The Area Under The Roc Curve, Margaret S. Pepe, Tianxi Cai, Zheng Zhang
Combining Predictors For Classification Using The Area Under The Roc Curve, Margaret S. Pepe, Tianxi Cai, Zheng Zhang
UW Biostatistics Working Paper Series
We compare simple logistic regression with an alternative robust procedure for constructing linear predictors to be used for the two state classification task. Theoritical advantages of the robust procedure over logistic regression are: (i) although it assumes a generalized linear model for the dichotomous outcome variable, it does not require specification of the link function; (ii) it accommodates case-control designs even when the model is not logistic; and (iii) it yields sensible results even when the generalized linear model assumption fails to hold. Surprisingly, we find that the linear predictor derived from the logistic regression likelihood is very robust in …
Classification Using Generalized Partial Least Squares, Beiying Ding, Robert Gentleman
Classification Using Generalized Partial Least Squares, Beiying Ding, Robert Gentleman
Bioconductor Project Working Papers
The advances in computational biology have made simultaneous monitoring of thousands of features possible. The high throughput technologies not only bring about a much richer information context in which to study various aspects of gene functions but they also present challenge of analyzing data with large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, in …