Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 5 of 5
Full-Text Articles in Statistical Methodology
Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, Jing Ma
Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, Jing Ma
UW Biostatistics Working Paper Series
Analysis of two-way structured data, i.e., data with structures among both variables and samples, is becoming increasingly common in ecology, biology and neuro-science. Classical dimension-reduction tools, such as the singular value decomposition (SVD), may perform poorly for two-way structured data. The generalized matrix decomposition (GMD, Allen et al., 2014) extends the SVD to two-way structured data and thus constructs singular vectors that account for both structures. While the GMD is a useful dimension-reduction tool for exploratory analysis of two-way structured data, it is unsupervised and cannot be used to assess the association between such data and an outcome of interest. …
Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng
Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng
U.C. Berkeley Division of Biostatistics Working Paper Series
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions, with typically unknown and intricate correlation patterns among variables. Addressing these inference questions satisfactorily requires: (i) an intensive and thorough search of the parameter space to generate good candidate estimators, (ii) an approach for selecting an optimal estimator among these candidates, and (iii) a method for reliably assessing the performance of the resulting estimator. We propose a unified loss-based methodology for estimator construction, selection, and performance assessment with cross-validation. In this approach, the parameter of interest is defined as the risk minimizer for a suitable …
Tree-Based Multivariate Regression And Density Estimation With Right-Censored Data , Annette M. Molinaro, Sandrine Dudoit, Mark J. Van Der Laan
Tree-Based Multivariate Regression And Density Estimation With Right-Censored Data , Annette M. Molinaro, Sandrine Dudoit, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
We propose a unified strategy for estimator construction, selection, and performance assessment in the presence of censoring. This approach is entirely driven by the choice of a loss function for the full (uncensored) data structure and can be stated in terms of the following three main steps. (1) Define the parameter of interest as the minimizer of the expected loss, or risk, for a full data loss function chosen to represent the desired measure of performance. Map the full data loss function into an observed (censored) data loss function having the same expected value and leading to an efficient estimator …
Semi-Parametric Regression For The Area Under The Receiver Operating Characteristic Curve, Lori E. Dodd, Margaret S. Pepe
Semi-Parametric Regression For The Area Under The Receiver Operating Characteristic Curve, Lori E. Dodd, Margaret S. Pepe
UW Biostatistics Working Paper Series
Medical advances continue to provide new and potentially better means for detecting disease. Such is true in cancer, for example, where biomarkers are sought for early detection and where improvements in imaging methods may pick up the initial functional and molecular changes associated with cancer development. In other binary classification tasks, computational algorithms such as Neural Networks, Support Vector Machines and Evolutionary Algorithms have been applied to areas as diverse as credit scoring, object recognition, and peptide-binding prediction. Before a classifier becomes an accepted technology, it must undergo rigorous evaluation to determine its ability to discriminate between states. Characterization of …
The Analysis Of Placement Values For Evaluating Discriminatory Measures, Margaret S. Pepe, Tianxi Cai
The Analysis Of Placement Values For Evaluating Discriminatory Measures, Margaret S. Pepe, Tianxi Cai
UW Biostatistics Working Paper Series
The idea of using measurements such as biomarkers, clinical data, or molecular biology assays for classification and prediction is popular in modern medicine. The scientific evaluation of such measures includes assessing the accuracy with which they predict the outcome of interest. Receiver operating characteristic curves are commonly used for evaluating the accuracy of diagnostic tests. They can be applied more broadly, indeed to any problem involving classification to two states or populations (D = 0 or D = 1). We show that the ROC curve can be interpreted as a cumulative distribution function for the discriminatory measure Y in the …