Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 30 of 55

Full-Text Articles in Statistics and Probability

Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, Jing Ma Dec 2019

UW Biostatistics Working Paper Series

Analysis of two-way structured data, i.e., data with structures among both variables and samples, is becoming increasingly common in ecology, biology and neuroscience. Classical dimension-reduction tools, such as the singular value decomposition (SVD), may perform poorly for two-way structured data. The generalized matrix decomposition (GMD, Allen et al., 2014) extends the SVD to two-way structured data and thus constructs singular vectors that account for both structures. While the GMD is a useful dimension-reduction tool for exploratory analysis of two-way structured data, it is unsupervised and cannot be used to assess the association between such data and an outcome of interest. …


Statistical Inference For Networks Of High-Dimensional Point Processes, Xu Wang, Mladen Kolar, Ali Shojaie Dec 2019

UW Biostatistics Working Paper Series

Fueled in part by recent applications in neuroscience, high-dimensional Hawkes processes have become a popular tool for modeling the network of interactions among multivariate point process data. While evaluating the uncertainty of the network estimates is critical in scientific applications, existing methodological and theoretical work has focused only on estimation. To bridge this gap, this paper proposes a high-dimensional statistical inference procedure with theoretical guarantees for multivariate Hawkes processes. Key to this inference procedure is a new concentration inequality on the first- and second-order statistics for integrated stochastic processes, which summarize the entire history of the process. We apply this …


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Nonparametric Identifiability Of Finite Mixture Models With Covariates For Estimating Error Rate Without A Gold Standard, Zheyu Wang, Xiao-Hua Zhou Apr 2014

UW Biostatistics Working Paper Series

Finite mixture models provide a flexible framework to study unobserved entities and have arisen in many statistical applications. The flexibility of these models in adapting to various complicated structures makes it crucial to establish model identifiability when applying them in practice to ensure study validity and interpretation. However, research establishing the identifiability of finite mixture models is limited and is usually restricted to a few specific model configurations. Conditions for model identifiability in the general case have not been established. In this paper, we provide conditions for both local identifiability and global identifiability of a finite mixture model. The former …


Oracle And Multiple Robustness Properties Of Survey Calibration Estimator In Missing Response Problem, Kwun Chuen Gary Chan Dec 2010

UW Biostatistics Working Paper Series

In the presence of missing response, reweighting the complete case subsample by the inverse of nonmissing probability is both intuitive and easy to implement. However, inverse probability weighting is not efficient in general and is not robust against misspecification of the missing probability model. Calibration was developed by survey statisticians for improving efficiency of inverse probability weighting estimators when population totals of auxiliary variables are known and when inclusion probability is known by design. In missing data problems, we can calibrate auxiliary variables in the complete case subsample to the full sample. However, the inclusion probability is unknown in general …


Modification And Improvement Of Empirical Likelihood For Missing Response Problem, Kwun Chuen Gary Chan Dec 2010

UW Biostatistics Working Paper Series

An empirical likelihood (EL) estimator was proposed by Qin and Zhang (2007) for a missing response problem under a missing at random assumption. They showed by simulation studies that the finite sample performance of the EL estimator is better than that of some existing estimators. However, the empirical likelihood estimator does not have a uniformly smaller asymptotic variance than other estimators in general. We consider several modifications to the empirical likelihood estimator and show that the proposed estimator dominates the empirical likelihood estimator and several other existing estimators in terms of asymptotic efficiencies. The proposed estimator also attains the minimum asymptotic variance among …


Modification And Improvement Of Empirical Likelihood For Missing Response Problem, Gary Chan Dec 2010

UW Biostatistics Working Paper Series

An empirical likelihood (EL) estimator was proposed by Qin and Zhang (2007) for a missing response problem under a missing at random assumption. They showed by simulation studies that the finite sample performance of the EL estimator is better than that of some existing estimators. However, the empirical likelihood estimator does not have a uniformly smaller asymptotic variance than other estimators in general. We consider several modifications to the empirical likelihood estimator and show that the proposed estimator dominates the empirical likelihood estimator and several other existing estimators in terms of asymptotic efficiencies. The proposed estimator also attains the minimum asymptotic variance among …


Efficient Measurement Error Correction With Spatially Misaligned Data, Adam A. Szpiro, Lianne Sheppard, Thomas Lumley Dec 2010

UW Biostatistics Working Paper Series

Association studies in environmental statistics often involve exposure and outcome data that are misaligned in space. A common strategy is to employ a spatial model such as universal kriging to predict exposures at locations with outcome data and then estimate a regression parameter of interest using the predicted exposures. This results in measurement error because the predicted exposures do not correspond exactly to the true values. We characterize the measurement error by decomposing it into Berkson-like and classical-like components. One correction approach is the parametric bootstrap, which is effective but computationally intensive since it requires solving a nonlinear optimization problem …


On Two-Stage Hypothesis Testing Procedures Via Asymptotically Independent Statistics, James Dai, Charles Kooperberg, Michael L. LeBlanc, Ross Prentice Sep 2010

UW Biostatistics Working Paper Series

Kooperberg and LeBlanc (2008) proposed a two-stage testing procedure to screen for significant interactions in genome-wide association (GWA) studies by a soft threshold on marginal associations (MA), though its theoretical properties and generalization have not been elaborated. In this article, we discuss conditions that are required to achieve strong control of the Family-Wise Error Rate (FWER) by such procedures for low or high-dimensional hypothesis testing. We provide proof of asymptotic independence of marginal association statistics and interaction statistics in linear regression, logistic regression, and Cox proportional hazard models in a randomized clinical trial (RCT) with a rare event. In case-control …


On Two-Stage Hypothesis Testing Procedures Via Asymptotically Independent Statistics, James Y. Dai, Charles Kooperberg, Michael LeBlanc, Ross L. Prentice Aug 2010

UW Biostatistics Working Paper Series

Kooperberg and LeBlanc (2008) proposed a two-stage testing procedure to screen for significant interactions in genome-wide association (GWA) studies by a soft threshold on marginal associations (MA), though its theoretical properties and generalization have not been elaborated. In this article, we discuss conditions that are required to achieve strong control of the Family-Wise Error Rate (FWER) by such procedures for low or high-dimensional hypothesis testing. We provide proof of asymptotic independence of marginal association statistics and interaction statistics in linear regression, logistic regression, and Cox proportional hazard models in a randomized clinical trial (RCT) with a rare event. In case-control studies nested within …


Model-Robust Regression And A Bayesian 'Sandwich' Estimator, Adam A. Szpiro, Kenneth M. Rice, Thomas Lumley May 2010

UW Biostatistics Working Paper Series

The published version of this paper in Annals of Applied Statistics (Vol. 4, No. 4 (2010), 2099–2113) is available from the journal web site at http://dx.doi.org/10.1214/10-AOAS362.

We present a new Bayesian approach to model-robust linear regression that leads to uncertainty estimates with the same robustness properties as the Huber-White sandwich estimator. The sandwich estimator is known to provide asymptotically correct frequentist inference, even when standard modeling assumptions such as linearity and homoscedasticity in the data-generating mechanism are violated. Our derivation provides a compelling Bayesian justification for using this simple and popular tool, and it also clarifies what is being estimated …


Asymptotic Properties Of The Sequential Empirical ROC And PPV Curves, Joseph S. Koopmeiners, Ziding Feng May 2010

UW Biostatistics Working Paper Series

The receiver operating characteristic (ROC) curve, the positive predictive value (PPV) curve and the negative predictive value (NPV) curve are three common measures of performance for a diagnostic biomarker. The independent increments covariance structure assumption is common in the group sequential study design literature. Showing that summary measures of the ROC, PPV and NPV curves have an independent increments covariance structure will provide the theoretical foundation for designing group sequential diagnostic biomarker studies. The ROC, PPV and NPV curves are often estimated empirically to avoid assumptions about the distributional form of the biomarkers. In this paper we derive asymptotic theory …


Pragmatic Estimation Of A Spatio-Temporal Air Quality Model With Irregular Monitoring Data, Paul D. Sampson, Adam A. Szpiro, Lianne Sheppard, Johan Lindström, Joel D. Kaufman Nov 2009

UW Biostatistics Working Paper Series

Statistical analyses of the health effects of air pollution have increasingly used GIS-based covariates for prediction of ambient air quality in “land-use” regression models. More recently these regression models have accounted for spatial correlation structure in combining monitoring data with land-use covariates. The current paper builds on these concepts to address spatio-temporal prediction of ambient concentrations of particulate matter with aerodynamic diameter less than 2.5 μm (PM2.5) on the basis of a model representing spatially varying seasonal trends and spatial correlation structures. Our hierarchical methodology provides a pragmatic approach that fully exploits regulatory and other supplemental monitoring data which jointly …


Measures To Summarize And Compare The Predictive Capacity Of Markers, Wen Gu, Margaret Pepe Feb 2009

UW Biostatistics Working Paper Series

The predictive capacity of a marker in a population can be described using the population distribution of risk (Huang et al., 2007; Pepe et al., 2008a; Stern, 2008). Virtually all standard statistical summaries of predictability and discrimination can be derived from it (Gail and Pfeiffer, 2005). The goal of this paper is to develop methods for making inference about risk prediction markers using summary measures derived from the risk distribution. We describe some new clinically motivated summary measures and give new interpretations to some existing statistical measures. Methods for estimating these summary measures are described along with distribution theory that …


Trading Bias For Precision: Decision Theory For Intervals And Sets, Kenneth M. Rice, Thomas Lumley, Adam A. Szpiro Aug 2008

UW Biostatistics Working Paper Series

Interval- and set-valued decisions are an essential part of statistical inference. Despite this, the justification behind them is often unclear, leading in practice to a great deal of confusion about exactly what is being presented. In this paper we review and attempt to unify several competing methods of interval-construction, within a formal decision-theoretic framework. The result is a new emphasis on interval-estimation as a distinct goal, and not as an afterthought to point estimation. We also see that representing intervals as trade-offs between measures of precision and bias unifies many existing approaches -- as well as suggesting interpretable criteria to …


Accounting For Errors From Predicting Exposures In Environmental Epidemiology And Environmental Statistics, Adam A. Szpiro, Lianne Sheppard, Thomas Lumley Jun 2008

UW Biostatistics Working Paper Series

PLEASE NOTE THAT AN UPDATED VERSION OF THIS RESEARCH IS AVAILABLE AS WORKING PAPER 350 IN THE UNIVERSITY OF WASHINGTON BIOSTATISTICS WORKING PAPER SERIES (http://www.bepress.com/uwbiostat/paper350).

In environmental epidemiology and related problems in environmental statistics, it is typically not practical to directly measure the exposure for each subject. Environmental monitoring is employed with a statistical model to assign exposures to individuals. The result is a form of exposure misspecification that can result in complicated errors in the health effect estimates if the exposure is naively treated as known. The exposure error is neither “classical” nor “Berkson”, so standard regression calibration methods …


Model-Robust Bayesian Regression And The Sandwich Estimator, Adam A. Szpiro, Kenneth M. Rice, Thomas Lumley Dec 2007

UW Biostatistics Working Paper Series

PLEASE NOTE THAT AN UPDATED VERSION OF THIS RESEARCH IS AVAILABLE AS WORKING PAPER 338 IN THE UNIVERSITY OF WASHINGTON BIOSTATISTICS WORKING PAPER SERIES (http://www.bepress.com/uwbiostat/paper338).

In applied regression problems there is often sufficient data for accurate estimation, but standard parametric models do not accurately describe the source of the data, so associated uncertainty estimates are not reliable. We describe a simple Bayesian approach to inference in linear regression that recovers least-squares point estimates while providing correct uncertainty bounds by explicitly recognizing that standard modeling assumptions need not be valid. Our model-robust development parallels frequentist estimating equations and leads to intervals …


Estimating Sensitivity And Specificity From A Phase 2 Biomarker Study That Allows For Early Termination, Margaret S. Pepe PhD Dec 2007

UW Biostatistics Working Paper Series

Development of a disease screening biomarker involves several phases. In phase 2 its sensitivity and specificity are compared with established thresholds for minimally acceptable performance. Since we anticipate that most candidate markers will not prove to be useful and availability of specimens and funding is limited, early termination of a study is appropriate if accumulating data indicate that the marker is inadequate. Yet, for markers that complete phase 2, we seek estimates of sensitivity and specificity to proceed with the design of subsequent phase 3 studies.

We suggest early stopping criteria and estimation procedures that adjust for bias caused by …


Evaluating The ROC Performance Of Markers For Future Events, Margaret Pepe, Yingye Zheng, Yuying Jin May 2007

UW Biostatistics Working Paper Series

Receiver operating characteristic (ROC) curves play a central role in the evaluation of biomarkers and tests for disease diagnosis. Predictors for event time outcomes can also be evaluated with ROC curves, but the time lag between marker measurement and event time must be acknowledged. We discuss different definitions of time-dependent ROC curves in the context of real applications. Several approaches have been proposed for estimation. We contrast retrospective and prospective methods with regard to assumptions and flexibility, including their capacities to incorporate censored data, competing risks and different sampling schemes. Applications to two datasets are presented.


Large Cluster Asymptotics For GEE: Working Correlation Models, Hyoju Chung, Thomas Lumley Oct 2006

UW Biostatistics Working Paper Series

This paper presents large cluster asymptotic results for generalized estimating equations. The complexity of the working correlation model is characterized in terms of the number of working correlation components to be estimated. When the cluster size is relatively large, we may encounter a situation where a high-dimensional working correlation matrix is modeled and estimated from the data. In the present asymptotic setting, the cluster size and the complexity of the working correlation model grow with the number of independent clusters. We show the existence, weak consistency and asymptotic normality of marginal regression parameter estimators using the results of empirical process theory and …


The Combination Of Ecological And Case-Control Data, Sebastien Haneuse, Jon Wakefield Jul 2006

UW Biostatistics Working Paper Series

Ecological studies, in which data are available at the level of the group, rather than at the level of the individual, are susceptible to a range of biases due to their inability to characterize within-group variability in exposures and confounders. In order to overcome these biases, we propose a hybrid design in which ecological data are supplemented with a sample of individual-level case-control data. We develop the likelihood for this design and illustrate its benefits via simulation, both in bias reduction when compared to an ecological study, and in efficiency gains relative to a conventional case-control study. An interesting special …


The Two-Sample Problem For Failure Rates Depending On A Continuous Mark: An Application To Vaccine Efficacy, Peter B. Gilbert, Ian W. McKeague, Yanqing Sun Mar 2006

UW Biostatistics Working Paper Series

The efficacy of an HIV vaccine to prevent infection is likely to depend on the genetic variation of the exposing virus. This paper addresses the problem of using data on the HIV sequences that infect vaccine efficacy trial participants to 1) test for vaccine efficacy more powerfully than procedures that ignore the sequence data; and 2) evaluate the dependence of vaccine efficacy on the divergence of infecting HIV strains from the HIV strain that is contained in the vaccine. Because hundreds of amino acid sites in each HIV genome are sequenced, it is natural to treat the divergence (defined in …


Alleviating Linear Ecological Bias And Optimal Design With Subsample Data, Adam Glynn, Jon Wakefield, Mark Handcock, Thomas Richardson Dec 2005

UW Biostatistics Working Paper Series

In this paper, we illustrate that combining ecological data with subsample data in situations in which a linear model is appropriate provides three main benefits. First, by including the individual level subsample data, the biases associated with linear ecological inference can be eliminated. Second, by supplementing the subsample data with ecological data, the information about parameters will be increased. Third, we can use readily available ecological data to design optimal subsampling schemes, so as to further increase the information about parameters. We present an application of this methodology to the classic problem of estimating the effect of a college degree …


Empirical Likelihood Inference For The Area Under The ROC Curve, Gengsheng Qin, Xiao-Hua Zhou Dec 2005

UW Biostatistics Working Paper Series

For a continuous-scale diagnostic test, the most commonly used summary index of the receiver operating characteristic (ROC) curve is the area under the curve (AUC) that measures the accuracy of the diagnostic test. In this paper we propose an empirical likelihood approach for the inference of AUC. We first define an empirical likelihood ratio for AUC and show that its limiting distribution is a scaled chi-square distribution. We then obtain an empirical likelihood based confidence interval for AUC using the scaled chi-square distribution. This empirical likelihood inference for AUC can be extended to stratified samples and the resulting limiting distribution …


Interval Estimation For The Ratio And Difference Of Two Lognormal Means, Yea-Hung Chen, Xiao-Hua Zhou Dec 2005

UW Biostatistics Working Paper Series

Health research often gives rise to data that follow lognormal distributions. In two sample situations, researchers are likely to be interested in estimating the difference or ratio of the population means. Several methods have been proposed for providing confidence intervals for these parameters. However, it is not clear which techniques are most appropriate, or how their performance might vary. Additionally, methods for the difference of means have not been adequately explored. We discuss in the present article five methods of analysis. These include two methods based on the log-likelihood ratio statistic and a generalized pivotal approach. Additionally, we provide and …


Inferences In Censored Cost Regression Models With Empirical Likelihood, Xiao-Hua Zhou, Gengsheng Qin, Huazhen Lin, Gang Li Dec 2005

UW Biostatistics Working Paper Series

In many studies of health economics, we are interested in the expected total cost over a certain period for a patient with given characteristics. Problems can arise if cost estimation models do not account for distributional aspects of costs. Two such problems are 1) the skewed nature of the data and 2) censored observations. In this paper we propose an empirical likelihood (EL) method for constructing a confidence region for the vector of regression parameters and a confidence interval for the expected total cost of a patient with the given covariates. We show that this new method has good theoretical …


Confidence Intervals For Predictive Values Using Data From A Case Control Study, Nathaniel David Mercaldo, Xiao-Hua Zhou, Kit F. Lau Dec 2005

UW Biostatistics Working Paper Series

The accuracy of a binary-scale diagnostic test can be represented by sensitivity (Se), specificity (Sp) and positive and negative predictive values (PPV and NPV). Although Se and Sp measure the intrinsic accuracy of a diagnostic test that does not depend on the prevalence rate, they do not provide information on the diagnostic accuracy of a particular patient. To obtain this information we need to use PPV and NPV. Since PPV and NPV are functions of both the intrinsic accuracy and the prevalence of the disease, constructing confidence intervals for PPV and NPV for a particular patient in a population with …


Optimal Feature Selection For Nearest Centroid Classifiers, With Applications To Gene Expression Microarrays, Alan R. Dabney, John D. Storey Nov 2005

UW Biostatistics Working Paper Series

Nearest centroid classifiers have recently been successfully employed in high-dimensional applications. A necessary step when building a classifier for high-dimensional data is feature selection. Feature selection is typically carried out by computing univariate statistics for each feature individually, without consideration for how a subset of features performs as a whole. For subsets of a given size, we characterize the optimal choice of features, corresponding to those yielding the smallest misclassification rate. Furthermore, we propose an algorithm for estimating this optimal subset in practice. Finally, we investigate the applicability of shrinkage ideas to nearest centroid classifiers. We use gene-expression microarrays for …


A New Approach To Intensity-Dependent Normalization Of Two-Channel Microarrays, Alan R. Dabney, John D. Storey Nov 2005

UW Biostatistics Working Paper Series

A two-channel microarray measures the relative expression levels of thousands of genes from a pair of biological samples. In order to reliably compare gene expression levels between and within arrays, it is necessary to remove systematic errors that distort the biological signal of interest. The standard for accomplishing this is smoothing "MA-plots" to remove intensity-dependent dye bias and array-specific effects. However, MA methods require strong assumptions. We review these assumptions and derive several practical scenarios in which they fail. The "dye-swap" normalization method has been much less frequently used because it requires two arrays per pair of samples. We show …