Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2005

Statistics and Probability

Articles 1 - 30 of 328

Full-Text Articles in Physical Sciences and Mathematics

Semiparametric Approaches For Joint Modeling Of Longitudinal And Survival Data With Time Varying Coefficients, Xiao Song, C.Y. Wang Dec 2005

UW Biostatistics Working Paper Series

We study joint modeling of survival and longitudinal data. There are two regression models of interest. The primary model is for survival outcomes, which are assumed to follow a time varying coefficient proportional hazards model. The second model is for longitudinal data, which are assumed to follow a random effects model. Based on the trajectory of a subject's longitudinal data, some covariates in the survival model are functions of the unobserved random effects. Estimated random effects are generally different from the unobserved random effects and hence this leads to covariate measurement error. To deal with covariate measurement error, we propose …


Nonparametric Control Chart For The Range, Arnold J. Stromberg Dec 2005

Statistics Faculty Patents

A method is provided for detecting or predicting an undesired deviation in variability of at least one parameter being monitored, wherein the variation in the parameter is incrementally recorded. The method comprises establishing the number of subsets of a dataset that have a range of the difference between any two datapoints within the dataset, and computing a control chart for the range based thereon. The method accurately detects changes in variability in real time. The true distribution of the data is reflected, and the desired result is achieved without requiring an inordinate number of computations.


Foreign Migration To The Cleveland-Akron-Lorain Metropolitan Area From 1995 To 2000, Mark Salling, Ellen Cyran Dec 2005

All Maxine Goodman Levin School of Urban Affairs Publications

This report is one of a series on migration to and from the region using the five percent Public Use Microdata Sample (PUMS) of the 2000 Census of Population and Housing and provides a description of foreign migrants moving to the Cleveland-Akron-Lorain (CAL) Consolidated Metropolitan Statistical Area (CMSA) from 1995 to 2000. The report identifies the countries of origin of migrants and compares the demographic, socioeconomic, and housing characteristics of the foreign migrants to the CAL with other groups, including foreign migrants to Ohio and the nation, and, at times, to domestic migrants to and from the CAL.


Alleviating Linear Ecological Bias And Optimal Design With Subsample Data, Adam Glynn, Jon Wakefield, Mark Handcock, Thomas Richardson Dec 2005

UW Biostatistics Working Paper Series

In this paper, we illustrate that combining ecological data with subsample data in situations in which a linear model is appropriate provides three main benefits. First, by including the individual level subsample data, the biases associated with linear ecological inference can be eliminated. Second, by supplementing the subsample data with ecological data, the information about parameters will be increased. Third, we can use readily available ecological data to design optimal subsampling schemes, so as to further increase the information about parameters. We present an application of this methodology to the classic problem of estimating the effect of a college degree …


Bayesian Analysis Of Cell-Cycle Gene Expression Data, Chuan Zhou, Jon Wakefield, Linda Breeden Dec 2005

UW Biostatistics Working Paper Series

The study of the cell cycle is important to our understanding of the basic mechanisms of life, yet progress has been slow due to the complexity of the process and our lack of ability to study it at high resolution. Recent advances in microarray technology have enabled scientists to study gene expression at the genome scale at a manageable cost, and there has been an increasing effort to identify cell-cycle regulated genes. In this chapter, we discuss the analysis of cell-cycle gene expression data, focusing on model-based Bayesian approaches. The majority of the models we describe …


Empirical Likelihood Inference For The Area Under The ROC Curve, Gengsheng Qin, Xiao-Hua Zhou Dec 2005

UW Biostatistics Working Paper Series

For a continuous-scale diagnostic test, the most commonly used summary index of the receiver operating characteristic (ROC) curve is the area under the curve (AUC) that measures the accuracy of the diagnostic test. In this paper we propose an empirical likelihood approach for the inference of AUC. We first define an empirical likelihood ratio for AUC and show that its limiting distribution is a scaled chi-square distribution. We then obtain an empirical likelihood based confidence interval for AUC using the scaled chi-square distribution. This empirical likelihood inference for AUC can be extended to stratified samples and the resulting limiting distribution …
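
As a point of reference for the summary index discussed in this abstract, the AUC is commonly estimated nonparametrically by the Mann-Whitney statistic: the proportion of (diseased, healthy) pairs in which the diseased subject has the higher test value, with ties counted as one half. The sketch below shows only that point estimate; it does not implement the empirical likelihood ratio or the confidence interval the authors propose, and the function and argument names are illustrative assumptions.

```python
def empirical_auc(diseased, healthy):
    """Mann-Whitney estimate of AUC: P(Y > X) + 0.5 * P(Y == X).

    `diseased` and `healthy` are sequences of continuous test values
    (hypothetical names chosen for this illustration).
    """
    if not diseased or not healthy:
        raise ValueError("both samples must be non-empty")
    total = 0.0
    for y in diseased:
        for x in healthy:
            if y > x:
                total += 1.0   # diseased value correctly ranked higher
            elif y == x:
                total += 0.5   # ties contribute one half
    return total / (len(diseased) * len(healthy))
```

A value of 0.5 corresponds to a test that ranks the two groups no better than chance, and 1.0 to perfect separation.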


Interval Estimation For The Ratio And Difference Of Two Lognormal Means, Yea-Hung Chen, Xiao-Hua Zhou Dec 2005

UW Biostatistics Working Paper Series

Health research often gives rise to data that follow lognormal distributions. In two sample situations, researchers are likely to be interested in estimating the difference or ratio of the population means. Several methods have been proposed for providing confidence intervals for these parameters. However, it is not clear which techniques are most appropriate, or how their performance might vary. Additionally, methods for the difference of means have not been adequately explored. We discuss in the present article five methods of analysis. These include two methods based on the log-likelihood ratio statistic and a generalized pivotal approach. Additionally, we provide and …


Inferences In Censored Cost Regression Models With Empirical Likelihood, Xiao-Hua Zhou, Gengsheng Qin, Huazhen Lin, Gang Li Dec 2005

UW Biostatistics Working Paper Series

In many studies of health economics, we are interested in the expected total cost over a certain period for a patient with given characteristics. Problems can arise if cost estimation models do not account for distributional aspects of costs. Two such problems are 1) the skewed nature of the data and 2) censored observations. In this paper we propose an empirical likelihood (EL) method for constructing a confidence region for the vector of regression parameters and a confidence interval for the expected total cost of a patient with the given covariates. We show that this new method has good theoretical …


Confidence Intervals For Predictive Values Using Data From A Case Control Study, Nathaniel David Mercaldo, Xiao-Hua Zhou, Kit F. Lau Dec 2005

UW Biostatistics Working Paper Series

The accuracy of a binary-scale diagnostic test can be represented by sensitivity (Se), specificity (Sp) and positive and negative predictive values (PPV and NPV). Although Se and Sp measure the intrinsic accuracy of a diagnostic test that does not depend on the prevalence rate, they do not provide information on the diagnostic accuracy of a particular patient. To obtain this information we need to use PPV and NPV. Since PPV and NPV are functions of both the intrinsic accuracy and the prevalence of the disease, constructing confidence intervals for PPV and NPV for a particular patient in a population with …
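
The dependence of PPV and NPV on both intrinsic accuracy and prevalence follows from Bayes' theorem. The sketch below computes the two point values from Se, Sp, and a prevalence supplied by the caller; it is an illustration of that standard relationship only, not the confidence-interval construction developed in the paper, and the function name is an assumption.

```python
def predictive_values(se, sp, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    Uses Bayes' theorem:
      PPV = Se*p / (Se*p + (1 - Sp)*(1 - p))
      NPV = Sp*(1 - p) / (Sp*(1 - p) + (1 - Se)*p)
    """
    for name, value in (("se", se), ("sp", sp), ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    ppv_den = se * prevalence + (1.0 - sp) * (1.0 - prevalence)
    npv_den = sp * (1.0 - prevalence) + (1.0 - se) * prevalence
    if ppv_den == 0.0 or npv_den == 0.0:
        raise ValueError("predictive value undefined for these inputs")
    ppv = se * prevalence / ppv_den
    npv = sp * (1.0 - prevalence) / npv_den
    return ppv, npv
```

For example, with Se = 0.90 and Sp = 0.95, lowering the prevalence from 0.5 to 0.1 reduces the PPV substantially while the NPV rises, which is why a case-control sample (whose case fraction is fixed by design) cannot supply the prevalence on its own.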


Model Checking For ROC Regression Analysis, Tianxi Cai, Yingye Zheng Dec 2005

Harvard University Biostatistics Working Paper Series

The Receiver Operating Characteristic (ROC) curve is a prominent tool for characterizing the accuracy of a continuous diagnostic test. To account for factors that might influence the test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may render invalid and misleading results. To date, practical model checking techniques suitable for validating existing ROC regression models are not yet available. In this paper, we develop cumulative residual based procedures to graphically and numerically assess the goodness-of-fit for some commonly used ROC regression models, and …


On The Use Of Non-Euclidean Isotropy In Geostatistics, Frank C. Curriero Dec 2005

Johns Hopkins University, Dept. of Biostatistics Working Papers

This paper investigates the use of non-Euclidean distances to characterize isotropic spatial dependence for geostatistical related applications. A simple example is provided to demonstrate that there are no guarantees that existing covariogram and variogram functions remain valid (i.e., positive definite or conditionally negative definite) when used with a non-Euclidean distance measure. Furthermore, satisfying the conditions of a metric is not sufficient to ensure the distance measure can be used with existing functions. Current literature is not clear on these topics. There are certain distance measures that when used with existing covariogram and variogram functions remain valid, an issue that is explored. …


Autologous Stem Cell Transplant: Factors Predicting The Yield Of CD34+ Cells, Elizabeth Anne Lawson Dec 2005

Theses and Dissertations

Stem cell transplant is often considered the last hope for survival for many cancer patients. The CD34+ cell content of a collection of stem cells has emerged as the most reliable indicator of the quantity of desired cells in a peripheral blood stem cell harvest and is used as a surrogate measure of the sample quality. Factors predicting the yield of CD34+ cells in a collection are not yet fully understood. Throughout the literature, there has been conflicting evidence with regard to age, gender, disease status, and prior radiation. In addition to the factors that have already been explored, …


Expert Testimony In Capital Sentencing: Juror Responses, John H. Montgomery, J. Richard Ciccone, Stephen P. Garvey, Theodore Eisenberg Dec 2005

Cornell Law Faculty Publications

The U.S. Supreme Court, in Furman v. Georgia (1972), held that the death penalty is constitutional only when applied on an individualized basis. The resultant changes in the laws in death penalty states fostered the involvement of psychiatric and psychologic expert witnesses at the sentencing phase of the trial, to testify on two major issues: (1) the mitigating factor of a defendant’s abnormal mental state and (2) the aggravating factor of a defendant’s potential for future violence. This study was an exploration of the responses of capital jurors to psychiatric/psychologic expert testimony during capital sentencing. The Capital Jury Project is …


Improved Peak Detection And Quantification Of Mass Spectrometry Data Acquired From Surface-Enhanced Laser Desorption And Ionization By Denoising Spectra With The Undecimated Discrete Wavelet Transform, Kevin R. Coombes, Spiros Tsavachidis, Jeffrey S. Morris, Keith A. Baggerly, Henry M. Kuerer Dec 2005

Jeffrey S. Morris

Background: Mass spectrometry, especially surface enhanced laser desorption and ionization (SELDI) is increasingly being used to find disease-related proteomic patterns in complex mixtures of proteins derived from tissue samples or from easily obtained biological fluids such as serum, urine, or nipple aspirate fluid. Questions have been raised about the reproducibility and reliability of peak quantifications using this technology. For example, Yasui and colleagues opted to replace continuous measures of the size of a peak by a simple binary indicator of its presence or absence in their analysis of a set of spectra from prostate cancer patients.

Methods: We collected nipple …


Pooling Information Across Different Studies And Oligonucleotide Microarray Chip Types To Identify Prognostic Genes For Lung Cancer, Jeffrey S. Morris, Guosheng Yin, Keith A. Baggerly, Chunlei Wu, Li Zhang Dec 2005

Jeffrey S. Morris

Our goal in this work is to pool information across microarray studies conducted at different institutions using two different versions of Affymetrix chips to identify genes whose expression levels offer information on lung cancer patients’ survival above and beyond the information provided by readily available clinical covariates. We combine information across chip types by identifying “matching probes” present on both chips, and then assembling them into new probesets based on Unigene clusters. This method yields comparable expression level quantifications across chips without sacrificing much precision or significantly altering the relative ordering of the samples. We fit a series of multivariable …


Accounting For Missing Data In End-Of-Life Research, Paula Diehr, Laura Lee Johnson Dec 2005

Paula Diehr

End-of-life studies are likely to have missing data because sicker persons are less likely to provide information and because measurements cannot be made after death. Ignoring missing data may result in data that are too favorable, because the sickest persons are effectively dropped from the analysis. In a comparison of two groups, the group with the most deaths and missing data will tend to have the most favorable data, which is not desirable. Results based on only the available data may not be generalizable to the original study population. If most of the missing data are absent because of death, …


Gradient Directed Regularization For Sparse Gaussian Concentration Graphs, With Applications To Inference Of Genetic Networks, Hongzhe Li, Jiang Gui Dec 2005

UPenn Biostatistics Working Papers

Large-scale microarray gene expression data provide the possibility of constructing genetic networks or biological pathways. Gaussian graphical models have been suggested to provide an effective method for constructing such genetic networks. However, most of the available methods for constructing Gaussian graphs do not account for the sparsity of the networks and are computationally more demanding or infeasible, especially in settings of high dimension and low sample size. We introduce a threshold gradient descent regularization procedure for estimating the sparse precision matrix in the setting of Gaussian graphical models and demonstrate its application to identifying genetic networks. Such a procedure is …


Obesity, Self-Complexity, And Compartmentalization: On The Implications Of Obesity For Self-Concept Organization, Bruce E. Blaine, C. E. Johnson Dec 2005

Statistics Faculty/Staff Publications

The relationship between obesity and structural aspects of the self-concept was examined in adult women. Participants were 119 adult women [age range: 18-73, M=26.9; body mass index (BMI) range: 16.2-54.7, M=27.3] who completed measures of self-esteem, self-complexity, and the spontaneous self-concept. BMI was associated with less complex and more compartmentalized self-knowledge and more frequent mention of weight-stereotypic traits as self-descriptive. The findings are discussed in the context of research on obesity-related stigma.


Distributed Blowing And Suction For The Purpose Of Streak Control In A Boundary Layer Subjected To A Favorable Pressure Gradient, Eric Forgoston, Anatoli Tumin, David E. Ashpis Dec 2005

Department of Applied Mathematics and Statistics Faculty Scholarship and Creative Works

An analysis of the optimal control by blowing and suction in order to generate streamwise velocity streaks is presented. The problem is examined using an iterative process that employs the Parabolized Stability Equations for an incompressible fluid along with its adjoint equations. In particular, distributions of blowing and suction are computed for both the normal and tangential velocity perturbations for various choices of parameters.


Three-Dimensional Wave Packet In A Hypersonic Boundary Layer, Eric Forgoston, Anatoli Tumin Dec 2005

Department of Applied Mathematics and Statistics Faculty Scholarship and Creative Works

A three-dimensional wave packet generated by a local disturbance in a hypersonic boundary layer flow is studied with the aid of the previously solved initial-value problem. The solution to this problem can be expanded in a biorthogonal eigenfunction system as a sum of discrete and continuous modes. A specific disturbance consisting of an initial temperature spot is considered, and the receptivity to this initial temperature spot is computed for both the two-dimensional and three-dimensional cases. Using previous analysis of the discrete and continuous spectrum, we numerically compute the inverse Fourier transform. The two-dimensional inverse Fourier transform is found for Mode …


Issues Of Processing And Multiple Testing Of SELDI-TOF MS Proteomic Data, Merrill D. Birkner, Alan E. Hubbard, Mark J. Van Der Laan, Christine F. Skibola, Christine M. Hegedus, Martyn T. Smith Dec 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined technical repeats (2 per subject) of intensity versus m/z (mass/charge) of bone marrow cell lysate for two groups of childhood leukemia patients: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). As others have noted, the type of data processing as well as experimental variability can have a disproportionate impact on the list of "interesting" proteins (see Baggerly et al. (2004)). We propose a list of processing and multiple testing techniques to correct for 1) background drift; 2) filtering using smooth regression and cross-validated bandwidth …


The Stochastic Dance Of Early HIV Infection, Stephen J. Merrill Dec 2005

Mathematics, Statistics and Computer Science Faculty Research and Publications

The stochastic nature of early HIV infection is described in a series of models, each of which captures aspects of the dance of HIV during the early stages of infection. It is to this highly variable target that the immune response must respond. The adaptability of the various components of the immune response is an important aspect of the system's operation, as the nature of the pathogens that the response will be required to respond to and the order in which those responses must be made cannot be known beforehand. As HIV infection has direct influence over cells responsible for …


A Reliability Case Study On Estimating Extremely Small Percentiles Of Strength Data For The Continuous Improvement Of Medium Density Fiberboard Product Quality, Weiwei Chen Dec 2005

Masters Theses

The objective of this thesis is to better estimate extremely small percentiles of strength distributions for measuring failure process in continuous improvement initiatives. These percentiles are of great interest for companies, oversight organizations, and consumers concerned with product safety and reliability. The thesis investigates the lower percentiles for the quality of medium density fiberboard (MDF). The international industrial standard for measuring quality for MDF is internal bond (IB, a tensile strength test). The results of the thesis indicated that the smaller percentiles are crucial, especially the first percentile and lower ones.

The thesis starts by introducing the background, study objectives, …


Accuracy Of The Newtom 3G™ In Measuring The Angle Of The Articular Eminence, Rehana Khan Dec 2005

Loma Linda University Electronic Theses, Dissertations & Projects

The purpose of this study was to determine the accuracy of the Newtom 3G™ in determining the angulation of the articular eminence. The benefits of conducting this study were to provide additional uses for the standard records that are taken for the purposes of orthodontic treatment, as well as to evaluate the Newtom 3G™ for accuracy in measuring the anatomy of the glenoid fossa. This study required 20 participants who volunteered to allow their records to be used. Records evaluated were the Newtom 3G™, impressions, and wax check bite registrations. The wax record was taken using the 'forced bite' technique to …


Autism And Parental Marital Satisfaction: The Role Of Adequacy Of Resources, Geneeta Kaliah Chambers Dec 2005

Loma Linda University Electronic Theses, Dissertations & Projects

The goal of the present study was to expand on the existing literature exploring families with children who have developmental disabilities, particularly autism. Previous studies have been constrained by univariate approaches that have failed to adequately capture the nuances of family functioning. Using an ecological/context approach, stemming from an ongoing research program conducted within a university-based treatment center, the present study attempted to improve on the conceptualization of interrelationships among family members and the role that contextual factors play within that dynamic. Specifically, the present study explored the influence of children’s level of autism on parents’ reports of their marital …


Biosecurity And The Role Of Statisticians, Ron Brookmeyer Nov 2005

Ron Brookmeyer

No abstract provided.


Quantile-Function Based Null Distribution In Resampling Based Multiple Testing, Mark J. Van Der Laan, Alan E. Hubbard Nov 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Simultaneously testing a collection of null hypotheses about a data generating distribution based on a sample of independent and identically distributed observations is a fundamental and important statistical problem involving many applications. Methods based on marginal null distributions (i.e., marginal p-values) are attractive since the marginal p-values can be based on a user supplied choice of marginal null distributions and they are computationally trivial, but they, by necessity, are known to either be conservative or to rely on assumptions about the dependence structure between the test-statistics. Resampling based multiple testing (Westfall and Young, 1993) involves sampling from a joint null …


Testing Primitive Polynomials For Generalized Feedback Shift Register Random Number Generators, Guinan Lian Nov 2005

Theses and Dissertations

The class of generalized feedback shift register (GFSR) random number generators was a promising method for random number generation in the 1980s, but was abandoned because of some flaws such as poor performance on certain tests for randomness. The poor performance may be due to the choice of primitive polynomials used in the generators, rather than inherent flaws in the method. The original GFSR generators were all based on primitive trinomials. This project examines several alternative choices of primitive polynomials with more than one "interior" term to address this problem and hopefully provide access to good random number generators.
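
To make the trinomial-based construction concrete, the following is a minimal sketch of a two-tap GFSR recurrence, X[n] = X[n-p] XOR X[n-q], applied bitwise to machine words. The generator name, the seeding scheme, and the toy parameters in the usage note are assumptions for illustration; they are not the polynomials examined in the project, which by its own description have more than one interior term. Conventions also differ on which trinomial a given pair of taps corresponds to: for p = 5 and q = 2 it is x^5 + x^2 + 1 or its reciprocal x^5 + x^3 + 1, both of which are primitive over GF(2).

```python
from collections import deque


def gfsr(seed_words, q, word_bits=32):
    """Yield words from the recurrence X[n] = X[n-p] XOR X[n-q].

    p is len(seed_words); q is the second tap (0 < q < p). Both the
    seeding scheme and the tap convention are illustrative assumptions.
    """
    p = len(seed_words)
    if not 0 < q < p:
        raise ValueError("q must satisfy 0 < q < p")
    mask = (1 << word_bits) - 1
    # state[i] holds X[n-p+i], so state[0] is X[n-p] and state[p-q] is X[n-q].
    state = deque((w & mask for w in seed_words), maxlen=p)
    if not any(state):
        raise ValueError("seed must contain at least one nonzero word")
    while True:
        new_word = state[0] ^ state[p - q]
        state.append(new_word)  # maxlen=p discards the oldest word
        yield new_word
```

With one-bit words, p = 5, and q = 2, the output is a binary sequence that repeats every 2^5 - 1 = 31 steps, the maximal period for a degree-5 recurrence. Practical generators use much larger p and carefully chosen seed words.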


Data Adaptive Pathway Testing, Merrill D. Birkner, Alan E. Hubbard, Mark J. Van Der Laan Nov 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

A majority of diseases are caused by a combination of factors; for example, composite genetic mutation profiles have been found in many cases to predict a deleterious outcome. There are several statistical techniques that have been used to analyze these types of biological data. This article implements a general strategy that uses data adaptive regression methods to build a specific pathway model, thus predicting a disease outcome by a combination of biological factors, and assesses the significance of this model, or pathway, by using a permutation based null distribution. We also provide several simulation comparisons with other techniques. In addition, …
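
The idea of a permutation based null distribution can be sketched generically: shuffle the outcome labels to break any association with the predictors, recompute the fit statistic each time, and compare the observed value against the shuffled ones. The code below is a generic illustration under assumed names; the `statistic` callable stands in for whatever model-fit measure is used and is not the data adaptive regression procedure of the article.

```python
import random


def permutation_p_value(statistic, features, outcomes, n_perm=999, seed=0):
    """One-sided permutation p-value for statistic(features, outcomes).

    `statistic` is a caller-supplied function (hypothetical here) where
    larger values indicate a stronger association.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    observed = statistic(features, outcomes)
    shuffled = list(outcomes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break the feature-outcome link
        if statistic(features, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)
```

If the model is refit inside `statistic` on every permutation, the null distribution also accounts for the adaptivity of the model-building step, at the cost of repeating the fit `n_perm` times.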