Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 137

Full-Text Articles in Physical Sciences and Mathematics

Alleviating Linear Ecological Bias And Optimal Design With Subsample Data, Adam Glynn, Jon Wakefield, Mark Handcock, Thomas Richardson Dec 2005

UW Biostatistics Working Paper Series

In this paper, we illustrate that combining ecological data with subsample data in situations in which a linear model is appropriate provides three main benefits. First, by including the individual level subsample data, the biases associated with linear ecological inference can be eliminated. Second, by supplementing the subsample data with ecological data, the information about parameters will be increased. Third, we can use readily available ecological data to design optimal subsampling schemes, so as to further increase the information about parameters. We present an application of this methodology to the classic problem of estimating the effect of a college degree …


Empirical Likelihood Inference For The Area Under The Roc Curve, Gengsheng Qin, Xiao-Hua Zhou Dec 2005

UW Biostatistics Working Paper Series

For a continuous-scale diagnostic test, the most commonly used summary index of the receiver operating characteristic (ROC) curve is the area under the curve (AUC) that measures the accuracy of the diagnostic test. In this paper we propose an empirical likelihood approach for the inference of AUC. We first define an empirical likelihood ratio for AUC and show that its limiting distribution is a scaled chi-square distribution. We then obtain an empirical likelihood based confidence interval for AUC using the scaled chi-square distribution. This empirical likelihood inference for AUC can be extended to stratified samples and the resulting limiting distribution …
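
The nonparametric AUC estimate underlying this kind of analysis is the Mann-Whitney statistic; a minimal sketch of that point estimate (illustrative only, not the paper's empirical-likelihood interval procedure):

```python
import numpy as np

def empirical_auc(diseased, healthy):
    """Nonparametric AUC estimate: the Mann-Whitney statistic
    estimating P(X > Y), with ties counted as 1/2."""
    d = np.asarray(diseased, dtype=float)
    h = np.asarray(healthy, dtype=float)
    # Compare every diseased score against every healthy score.
    greater = (d[:, None] > h[None, :]).sum()
    ties = (d[:, None] == h[None, :]).sum()
    return (greater + 0.5 * ties) / (d.size * h.size)
```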


Interval Estimation For The Ratio And Difference Of Two Lognormal Means, Yea-Hung Chen, Xiao-Hua Zhou Dec 2005

UW Biostatistics Working Paper Series

Health research often gives rise to data that follow lognormal distributions. In two sample situations, researchers are likely to be interested in estimating the difference or ratio of the population means. Several methods have been proposed for providing confidence intervals for these parameters. However, it is not clear which techniques are most appropriate, or how their performance might vary. Additionally, methods for the difference of means have not been adequately explored. We discuss in the present article five methods of analysis. These include two methods based on the log-likelihood ratio statistic and a generalized pivotal approach. Additionally, we provide and …


Inferences In Censored Cost Regression Models With Empirical Likelihood, Xiao-Hua Zhou, Gengsheng Qin, Huazhen Lin, Gang Li Dec 2005

UW Biostatistics Working Paper Series

In many studies of health economics, we are interested in the expected total cost over a certain period for a patient with given characteristics. Problems can arise if cost estimation models do not account for distributional aspects of costs. Two such problems are 1) the skewed nature of the data and 2) censored observations. In this paper we propose an empirical likelihood (EL) method for constructing a confidence region for the vector of regression parameters and a confidence interval for the expected total cost of a patient with the given covariates. We show that this new method has good theoretical …


Confidence Intervals For Predictive Values Using Data From A Case Control Study, Nathaniel David Mercaldo, Xiao-Hua Zhou, Kit F. Lau Dec 2005

UW Biostatistics Working Paper Series

The accuracy of a binary-scale diagnostic test can be represented by sensitivity (Se), specificity (Sp) and positive and negative predictive values (PPV and NPV). Although Se and Sp measure the intrinsic accuracy of a diagnostic test that does not depend on the prevalence rate, they do not provide information on the diagnostic accuracy of a particular patient. To obtain this information we need to use PPV and NPV. Since PPV and NPV are functions of both the intrinsic accuracy and the prevalence of the disease, constructing confidence intervals for PPV and NPV for a particular patient in a population with …
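
The dependence of PPV and NPV on both intrinsic accuracy and prevalence follows from Bayes' theorem; a sketch of the point estimates (the paper's contribution is confidence intervals around these, which this sketch does not attempt):

```python
def predictive_values(se, sp, prevalence):
    """PPV and NPV from sensitivity, specificity, and disease
    prevalence via Bayes' theorem (point estimates only)."""
    p = prevalence
    ppv = se * p / (se * p + (1.0 - sp) * (1.0 - p))
    npv = sp * (1.0 - p) / (sp * (1.0 - p) + (1.0 - se) * p)
    return ppv, npv
```

Note how a test with Se = Sp = 0.9 has PPV 0.9 at 50% prevalence but only 0.5 at 10% prevalence, which is why prevalence cannot be ignored.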


Issues Of Processing And Multiple Testing Of Seldi-Tof Ms Proteomic Data, Merrill D. Birkner, Alan E. Hubbard, Mark J. Van Der Laan, Christine F. Skibola, Christine M. Hegedus, Martyn T. Smith Dec 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined technical repeats (2 per subject) of intensity versus m/z (mass/charge) of bone marrow cell lysate for two groups of childhood leukemia patients: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). As others have noted, the type of data processing as well as experimental variability can have a disproportionate impact on the list of "interesting" proteins (see Baggerly et al. (2004)). We propose a list of processing and multiple testing techniques: 1) correction for background drift; 2) filtering using smooth regression and cross-validated bandwidth …


Quantile-Function Based Null Distribution In Resampling Based Multiple Testing, Mark J. Van Der Laan, Alan E. Hubbard Nov 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Simultaneously testing a collection of null hypotheses about a data generating distribution based on a sample of independent and identically distributed observations is a fundamental and important statistical problem involving many applications. Methods based on marginal null distributions (i.e., marginal p-values) are attractive since the marginal p-values can be based on a user supplied choice of marginal null distributions and they are computationally trivial, but they, by necessity, are known to either be conservative or to rely on assumptions about the dependence structure between the test-statistics. Resampling based multiple testing (Westfall and Young, 1993) involves sampling from a joint null …


Optimal Feature Selection For Nearest Centroid Classifiers, With Applications To Gene Expression Microarrays, Alan R. Dabney, John D. Storey Nov 2005

UW Biostatistics Working Paper Series

Nearest centroid classifiers have recently been successfully employed in high-dimensional applications. A necessary step when building a classifier for high-dimensional data is feature selection. Feature selection is typically carried out by computing univariate statistics for each feature individually, without consideration for how a subset of features performs as a whole. For subsets of a given size, we characterize the optimal choice of features, corresponding to those yielding the smallest misclassification rate. Furthermore, we propose an algorithm for estimating this optimal subset in practice. Finally, we investigate the applicability of shrinkage ideas to nearest centroid classifiers. We use gene-expression microarrays for …


A New Approach To Intensity-Dependent Normalization Of Two-Channel Microarrays, Alan R. Dabney, John D. Storey Nov 2005

UW Biostatistics Working Paper Series

A two-channel microarray measures the relative expression levels of thousands of genes from a pair of biological samples. In order to reliably compare gene expression levels between and within arrays, it is necessary to remove systematic errors that distort the biological signal of interest. The standard for accomplishing this is smoothing "MA-plots" to remove intensity-dependent dye bias and array-specific effects. However, MA methods require strong assumptions. We review these assumptions and derive several practical scenarios in which they fail. The "dye-swap" normalization method has been much less frequently used because it requires two arrays per pair of samples. We show …


A General Imputation Methodology For Nonparametric Regression With Censored Data, Dan Rubin, Mark J. Van Der Laan Nov 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

We consider the random design nonparametric regression problem when the response variable is subject to a general mode of missingness or censoring. A traditional approach to such problems is imputation, in which the missing or censored responses are replaced by well-chosen values, and then the resulting covariate/response data are plugged into algorithms designed for the uncensored setting. We present a general methodology for imputation with the property of double robustness, in that the method works well if either a parameter of the full data distribution (covariate and response distribution) or a parameter of the censoring mechanism is well approximated. These …


Estimating A Treatment Effect With Repeated Measurements Accounting For Varying Effectiveness Duration, Ying Qing Chen, Jingrong Yang, Su-Chun Cheng Nov 2005

UW Biostatistics Working Paper Series

To assess treatment efficacy in clinical trials, certain clinical outcomes are repeatedly measured for the same subject over time and can be regarded as functions of time. The difference in their mean functions between the treatment arms usually characterises a treatment effect. Due to the potential existence of subject-specific treatment effectiveness lag and saturation times, erosion of the treatment effect in the difference may occur during the observation period. Instead of using ad hoc parametric or purely nonparametric time-varying coefficients in statistical modeling, we first propose to model the treatment effectiveness durations, which are the varying time intervals between the …


The Influence Of Reliability On Four Rules For Determining The Number Of Components To Retain, Gibbs Y. Kanyongo Nov 2005

Journal of Modern Applied Statistical Methods

Imperfectly reliable scores impact the performance of factor analytic procedures. A series of Monte Carlo studies was conducted to generate scores with known component structure from population matrices with varying levels of reliability. The scores were submitted to four procedures (the Kaiser rule, scree plot, parallel analysis, and modified Horn's parallel analysis) to determine whether each procedure accurately identifies the number of components at the different reliability levels. The performance of each procedure was judged by the percentage of times it was correct and by the mean number of components it extracted in each cell. Generally, the …


Large Sample And Bootstrap Intervals For The Gamma Scale Parameter Based On Grouped Data, Ayman Baklizi, Amjad Al-Nasser Nov 2005

Journal of Modern Applied Statistical Methods

Interval estimation of the scale parameter of the gamma distribution using grouped data is considered in this article. Exact intervals do not exist and approximate intervals are needed. Recently, Chen and Mi (2001) proposed alternative approximate intervals. In this article, some bootstrap- and jackknife-type intervals are proposed. The performance of these intervals is investigated and compared. The results show that some of the suggested intervals have satisfactory statistical performance in situations where the sample size is small with a heavy proportion of censoring.
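
For readers unfamiliar with the intervals being compared, a generic percentile-bootstrap sketch, using a method-of-moments scale estimator on ungrouped data as a stand-in (the article itself works with grouped, censored data and other interval types):

```python
import numpy as np

def moment_scale(x):
    # Method-of-moments estimate of the gamma scale parameter:
    # for Gamma(shape k, scale s), Var/Mean = (k*s^2)/(k*s) = s.
    x = np.asarray(x, dtype=float)
    return x.var() / x.mean()

def percentile_ci(x, estimator, level=0.95, n_boot=2000, seed=0):
    """Generic percentile-bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    stats = np.array([estimator(rng.choice(x, size=x.size, replace=True))
                      for _ in range(n_boot)])
    alpha = 1.0 - level
    return (np.quantile(stats, alpha / 2), np.quantile(stats, 1 - alpha / 2))
```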


Bootstrap Intervals Of The Parameters Of Lognormal Distribution Using Power Rule Model And Accelerated Life Tests, Mohammed Al-Haj Ebrahem Nov 2005

Journal of Modern Applied Statistical Methods

Assume that the lifetime of any unit follows a lognormal distribution with parameters μ and σ, and that the relationship between μ and the stress level V is given by the power rule model. Several types of bootstrap intervals for the parameters were studied; their performance was evaluated by simulation and compared in terms of attainment of the nominal confidence level, symmetry of lower and upper error rates, and expected width. Conclusions and recommendations are given.


A Method For Analyzing Unreplicated Experiments Using Information On The Intraclass Correlation Coefficient, Jamis J. Perrett Nov 2005

Journal of Modern Applied Statistical Methods

Many studies are performed on units that cannot be replicated; however, there is often an abundance of subsampling. By placing a reasonable upper bound on the intraclass correlation coefficient (ICC), it is possible to carry out classical tests of significance that have conservative levels of significance.


Jmasm24: Numerical Computing For Third-Order Power Method Polynomials (Excel), Todd C. Headrick Nov 2005

Journal of Modern Applied Statistical Methods

The power method polynomial transformation is a popular procedure used for simulating univariate and multivariate non-normal distributions. It requires software that solves simultaneous nonlinear equations. Potential users of the power method may not have access to commercial software packages (e.g., Mathematica, Fortran). Therefore, algorithms are presented in the more commonly available Excel 2003 spreadsheets. The algorithms solve for (1) coefficients for polynomials of order three, (2) intermediate correlations and Cholesky factorizations for multivariate data generation, and (3) the values of skew and kurtosis for determining if a transformation will produce a valid power method probability density function (pdf). The Excel …
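
The third-order power method transforms standard-normal draws Z through a cubic polynomial, Y = c0 + c1·Z + c2·Z² + c3·Z³; a minimal sketch of the transform step, taking the coefficients as given (solving for them is what the article's Excel algorithms do). Since E[Z²] = 1, a zero-mean result requires c0 = −c2.

```python
import numpy as np

def power_method_transform(z, c0, c1, c2, c3):
    """Fleishman-type third-order polynomial transform of
    standard-normal draws z; the coefficients determine the
    skew and kurtosis of the simulated distribution."""
    z = np.asarray(z, dtype=float)
    return c0 + c1 * z + c2 * z ** 2 + c3 * z ** 3

# With (c0, c1, c2, c3) = (0, 1, 0, 0) the transform is the
# identity, so the output remains standard normal.
```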


Misconceptions Leading To Choosing The T Test Over The Wilcoxon Mann-Whitney Test For Shift In Location Parameter, Shlomo S. Sawilowsky Nov 2005

Journal of Modern Applied Statistical Methods

There exist many misconceptions in choosing the t test over the Wilcoxon Rank-Sum test when testing for shift. Examples are given in three groups: (1) false statements, (2) true premises with false conclusions, and (3) true statements irrelevant to choosing between the t test and the Wilcoxon Rank-Sum test.


Second-Order Accurate Inference On Simple, Partial, And Multiple Correlations, Robert J. Boik, Ben Haaland Nov 2005

Journal of Modern Applied Statistical Methods

This article develops confidence interval procedures for functions of simple, partial, and squared multiple correlation coefficients. It is assumed that the observed multivariate data represent a random sample from a distribution that possesses finite moments, but there is no requirement that the distribution be normal. The coverage error of conventional one-sided large-sample intervals decreases at rate 1/√n as n increases, where n is an index of sample size. The coverage error of the proposed intervals decreases at rate 1/n as n increases. The results of a simulation study that evaluates the performance of the proposed intervals are …


Inference For P(Y < X), Vee Ming Ng Nov 2005

Journal of Modern Applied Statistical Methods

Some tests and confidence bounds for the reliability parameter R = P(Y < X) …


An Alternative To Warner’S Randomized Response Model, Sat Gupta, Javid Shabbir Nov 2005

Journal of Modern Applied Statistical Methods

A modification to Warner’s (1965) Randomized Response Model is suggested. The suggested model is more efficient than the original model.


Determining Parallel Analysis Criteria, Marley W. Watkins Nov 2005

Journal of Modern Applied Statistical Methods

Determining the number of factors to extract is a critical decision in exploratory factor analysis. Simulation studies have found the Parallel Analysis criterion to be accurate, but it is computationally intensive. Two freeware programs that implement Parallel Analysis on Macintosh and Windows operating systems are presented.
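
Horn's Parallel Analysis retains components whose sample eigenvalues exceed those expected from random data of the same dimensions; a minimal sketch of the criterion (the cited freeware programs implement this more carefully):

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Horn's parallel analysis: retain components whose sample
    eigenvalues exceed the mean eigenvalues obtained from random
    normal data of the same dimensions."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, descending.
    obs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Average eigenvalues over simulated random data sets.
    rand = np.zeros(p)
    for _ in range(n_sims):
        r = rng.standard_normal((n, p))
        rand += np.linalg.eigvalsh(np.corrcoef(r, rowvar=False))[::-1]
    rand /= n_sims
    return int(np.sum(obs > rand))
```

The repeated eigendecompositions over simulated data sets are the computational burden the abstract refers to.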


Change Point Estimation Of Bilevel Functions, Leming Qu, Yi-Cheng Tu Nov 2005

Journal of Modern Applied Statistical Methods

Reconstruction of a bilevel function, such as a bar code signal, in a partially blind deconvolution problem is an important task in industrial processes. Existing methods are based on either the local approach or the regularization approach with a total variation penalty. This article reformulates the problem explicitly in terms of the change points of the 0-1 step function. The bilevel function is then reconstructed by solving a nonlinear least squares problem subject to linear inequality constraints, with starting values provided by the local extrema of the derivative of the convolved signal from discrete noisy data. Simulation results show a considerable …


Applications Of Some Improved Estimators In Linear Regression, B. M. Golam Kibria Nov 2005

Journal of Modern Applied Statistical Methods

The problem of estimating the regression coefficients under multicollinearity in the restricted linear model is discussed. Some improved estimators are considered, including the unrestricted ridge regression estimator (URRE), restricted ridge regression estimator (RRRE), shrinkage restricted ridge regression estimator (SRRRE), preliminary test ridge regression estimator (PTRRE), and restricted Liu estimator (RLIUE). The estimators were compared based on the sampling variance-covariance criterion. The RRRE dominates the other ridge estimators whether or not the restriction holds. A numerical example is provided. The RRRE performed equivalently to or better than the RLIUE in the sense of having smaller sampling variance.


Simulation Of Non-Normal Autocorrelated Variables, H.E.T. Holgersson Nov 2005

Journal of Modern Applied Statistical Methods

All statistical methods rely on assumptions to some extent. Two assumptions frequently met in statistical analyses are those of normal distribution and independence. When examining robustness properties of such assumptions by Monte Carlo simulations it is therefore crucial that the possible effects of autocorrelation and non-normality are not confounded so that their separate effects may be investigated. This article presents a number of non-normal variables with non-confounded autocorrelation, thus allowing the analyst to specify autocorrelation or shape properties while keeping the other effect fixed.
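
A naive baseline helps show why non-confounding matters: filtering skewed innovations through an AR(1) recursion ties the marginal shape to the autocorrelation parameter (the marginal Gaussianizes as the autocorrelation grows), which is exactly the confounding the article's constructions avoid. A sketch of that baseline:

```python
import numpy as np

def ar1_nonnormal(n, rho, seed=0):
    """Naive AR(1) with centered-exponential (skewed, mean-zero)
    innovations. Illustrates the baseline problem only: here the
    marginal shape depends on rho, unlike the article's variables."""
    rng = np.random.default_rng(seed)
    eps = rng.exponential(1.0, size=n) - 1.0
    x = np.empty(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = rho * x[t - 1] + eps[t]
    return x
```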


Interval Estimation Of Risk Difference In Simple Compliance Randomized Trials, Kung-Jong Lui Nov 2005

Journal of Modern Applied Statistical Methods

Consider the simple compliance randomized trial, in which patients randomly assigned to the experimental treatment may switch to receive the standard treatment, while patients randomly assigned to the standard treatment are all assumed to receive their assigned treatment. Six asymptotic interval estimators for the risk difference in probabilities of response among patients who would accept the experimental treatment were developed. Monte Carlo methods were employed to evaluate and compare the finite-sample performance of these estimators. An example studying the effect of vitamin A supplementation on reducing mortality in preschool children was included to illustrate their practical use.


Ab/Ba Crossover Trials - Binary Outcome, James F. Reed Iii Nov 2005

Journal of Modern Applied Statistical Methods

On occasion, the response to treatment in an AB/BA crossover trial is measured on a binary variable: success or failure. It is assumed that response to treatment is measured on an outcome variable with a (+) representing a treatment success and a (-) representing a treatment failure. Traditionally, three tests have been used to compare treatment effects (McNemar's, Mainland-Gart, and Prescott's). An issue arises concerning treatment comparisons when there may be a residual (carryover) effect of a previous treatment affecting the current treatment. There is no general consensus as to which procedure is preferable. However, if both group and …


A Robust Exponentially Weighted Moving Average Control Chart For The Process Mean, Michael B. C. Khoo, S. Y. Sim Nov 2005

Journal of Modern Applied Statistical Methods

To date, numerous extensions of the exponentially weighted moving average (EWMA) chart have been made. A new robust EWMA chart for the process mean is proposed. It enables easier detection of outliers and increases sensitivity to other forms of out-of-control situations when outliers are present.
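
The baseline being extended is the standard EWMA statistic Z_t = λX_t + (1−λ)Z_{t−1} with time-varying control limits; a sketch of that standard chart (the article's robust variant modifies this baseline, and is not reproduced here):

```python
import numpy as np

def ewma_chart(x, target, sigma, lam=0.2, L=3.0):
    """Standard EWMA statistic with exact (time-varying)
    control limits for the process mean."""
    x = np.asarray(x, dtype=float)
    z = np.empty(x.size)
    prev = target
    for t, xt in enumerate(x):
        prev = lam * xt + (1.0 - lam) * prev
        z[t] = prev
    i = np.arange(1, x.size + 1)
    half_width = L * sigma * np.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i)))
    return z, target - half_width, target + half_width
```

Because the statistic averages over history, a single extreme observation can distort it, which motivates the robust modification the abstract describes.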


Correlation Between The Number Of Epileptic And Healthy Children In Family Size That Follows A Size-Biased Modified Power Series Distribution, Ramalingam Shanmugam, Anwar Hassan, Peer Bilal Ahmad Nov 2005

Journal of Modern Applied Statistical Methods

An expression for the correlation between the random number of epileptic and healthy children in a family whose size follows a size-biased Modified Power Series Distribution (SBMPSD) is obtained and illustrated. As special cases, results are extracted for the size-biased Modified Negative Binomial Distribution (SBGNBD), size-biased Modified Poisson Distribution (SBGPD), and size-biased Modified Logarithmic Series Distribution (SBGLSD).


Corrections For Type I Error In Social Science Research: A Disconnect Between Theory And Practice, Kenneth Lachlan, Patric R. Spence Nov 2005

Journal of Modern Applied Statistical Methods

Type I errors are a common problem in factorial ANOVA and ANOVA based analyses. Despite decades of literature offering solutions to the Type I error problems associated with multiple significance tests, simple solutions such as Bonferroni corrections have been largely ignored by social scientists. To examine this discontinuity between theory and practice, a content analysis was performed on 5 flagship social science journals. Results indicate that corrections for Type I error are seldom utilized, even in designs so complicated as to almost guarantee erroneous rejection of null hypotheses.
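
The correction in question is simple to apply; a sketch of the Bonferroni rule for m tests (reject only when p ≤ α/m):

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: reject H0_i only if p_i <= alpha/m,
    which controls the familywise Type I error rate at alpha."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]
```

For three tests at α = 0.05 the per-test threshold drops to about 0.0167, so a nominal p of 0.04 no longer counts as significant.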


Interaction Graphs For 4^r 2^(n-p) Fractional Factorial Designs, M. L. Aggarwal, S. Roy Chowdhury, Anita Bansal, Neena Mital Nov 2005

Journal of Modern Applied Statistical Methods

Interaction graphs have been developed for two-level and three-level fractional factorial designs under different design criteria. A catalogue is presented of all possible non-isomorphic interaction graphs for 4^r 2^(n-p) (r=1; n=2,…,10; p=1,…,8 and r=2; n=1,…,7; p=1,…,7) fractional factorial designs, and non-isomorphic interaction graphs for asymmetric fractional factorial designs under the concept of the combined array.