Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 27 of 27

Full-Text Articles in Physical Sciences and Mathematics

Statistical Inference On Lung Cancer Screening Using The National Lung Screening Trial Data., Farhin Rahman Aug 2023

Electronic Theses and Dissertations

This dissertation consists of three research projects on cancer screening probability modeling. In these projects, the three key modeling parameters (sensitivity, sojourn time, transition density) for cancer screening were estimated, and the long-term outcomes (including overdiagnosis as one outcome), the optimal screening time/age, the lead time distribution, and the probability of overdiagnosis at a future screening time were simulated to provide a statistical perspective on the effectiveness of cancer screening programs. In the first part of this dissertation, statistical inference was conducted for male and female smokers using the National Lung Screening Trial (NLST) chest X-ray data. A …


Group Testing Identification: Objective Functions, Implementation, And Multiplex Assays, Brianna D. Hitt Apr 2020

Department of Statistics: Dissertations, Theses, and Student Work

Group testing is the process of combining items into groups to test for a binary characteristic. One of its most widely used applications is infectious disease testing. In this context, specimens (e.g., blood, urine) are amalgamated into groups and tested. For groups that test positive, there are many algorithmic retesting procedures available to identify positive individuals. The appeal of group testing is that the overall number of tests needed is significantly less than for individual testing when disease prevalence is small and an appropriate algorithm is chosen. Group testing has a number of applications beyond infectious disease testing, such as …
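
The claim that pooling needs fewer tests at low prevalence can be checked with a small sketch. The abstract does not say which retesting algorithm is meant, so the code below assumes the simplest two-stage scheme (one pooled test, then individual retests of every member of a positive pool), a perfect assay, and independent individuals. The function names are hypothetical.

```python
def expected_tests_per_person(prevalence: float, group_size: int) -> float:
    """Expected tests per individual under two-stage pooling.

    Assumes a perfect assay and independent individuals; both are
    simplifying assumptions made only for this illustration.
    """
    if group_size < 2:
        return 1.0  # no pooling: one test per individual
    p_group_negative = (1.0 - prevalence) ** group_size
    # One pooled test shared by the group, plus an individual retest
    # for every member whenever the pool tests positive.
    return 1.0 / group_size + (1.0 - p_group_negative)


def best_group_size(prevalence: float, max_size: int = 50) -> int:
    """Group size (1..max_size) that minimizes expected tests per person."""
    return min(range(1, max_size + 1),
               key=lambda k: expected_tests_per_person(prevalence, k))


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.20, 0.50):
        k = best_group_size(p)
        print(p, k, round(expected_tests_per_person(p, k), 4))
```

Under these assumptions the savings shrink as prevalence rises, and at high prevalence individual testing (group size 1) is the cheapest option.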


The Incubation Period Of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation And Application, Stephen A. Lauer, Kyra H. Grantz, Qifang Bi, Forest K. Jones, Qulu Zheng, Hannah R. Meredith, Andrew S. Azman, Nicholas G. Reich, Justin Lessler Jan 2020

Biostatistics and Epidemiology Faculty Publications Series

Background:

A novel human coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was identified in China in December 2019. There is limited support for many of its key epidemiologic features, including the incubation period for clinical disease (coronavirus disease 2019 [COVID-19]), which has important implications for surveillance and control activities.

Objective:

To estimate the length of the incubation period of COVID-19 and describe its public health implications.

Design:

Pooled analysis of confirmed COVID-19 cases reported between 4 January 2020 and 24 February 2020.

Setting:

News reports and press releases from 50 provinces, regions, and countries outside Wuhan, Hubei province, China. …


Estimation Of The Three Key Parameters And The Lead Time Distribution In Lung Cancer Screening., Ruiqi Liu Aug 2017

Electronic Theses and Dissertations

This dissertation contains three research projects on cancer screening probability modeling. Cancer screening is the primary technique for early detection. The goal of screening is to catch the disease early before clinical symptoms appear. In these projects, the three key parameters and the lead time distribution were estimated to provide a statistical point of view on the effectiveness of cancer screening programs. In the first project, a cancer screening probability model was used to analyze the computed tomography (CT) scan group in the National Lung Screening Trial (NLST) data. Three key parameters were estimated using a Bayesian approach and Markov Chain Monte Carlo …


Combining Biomarkers By Maximizing The True Positive Rate For A Fixed False Positive Rate, Allison Meisner, Marco Carone, Margaret Pepe, Kathleen F. Kerr Jul 2017

UW Biostatistics Working Paper Series

Biomarkers abound in many areas of clinical research, and often investigators are interested in combining them for diagnosis, prognosis and screening. In many applications, the true positive rate for a biomarker combination at a prespecified, clinically acceptable false positive rate is the most relevant measure of predictive capacity. We propose a distribution-free method for constructing biomarker combinations by maximizing the true positive rate while constraining the false positive rate. Theoretical results demonstrate good operating characteristics for the resulting combination. In simulations, the biomarker combination provided by our method demonstrated improved operating characteristics in a variety of scenarios when compared with …


Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted PrEP Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Oct 2016

Laura B. Balzer

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the …


Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted PrEP Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Oct 2016

U.C. Berkeley Division of Biostatistics Working Paper Series

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the …


An Investigation Of Sensitivity Of An F Test In Locating Change Points In Linear Regression, Jing Sun Jan 2014

Electronic Theses and Dissertations

A change point is a statistical phenomenon that has many direct applications in climatology, bioinformatics, finance, oceanography and medical imaging. In this thesis, we investigate the sensitivity of the F-test for detecting change points in linear regression, using a two-phase linear regression model. It offers an effective method to detect "undocumented" change points using a form of an F-test. Using simulated data, we explore its sensitivity and accuracy with respect to different parameters in the model.


Testing The Assumption Of Non-Differential Misclassification In Case-Control Studies, Tze-San Lee, Qin Hui Nov 2013

Journal of Modern Applied Statistical Methods

One of the unresolved issues regarding misclassification in case-control studies is whether the misclassification rates are the same for both cases and controls. Currently, a common practice is to assume that the rates are the same, that is, the non-differential misclassification assumption. However, it has been suspected that this assumption may not be valid in practical applications. Unfortunately, no test is available so far to test the validity of the non-differential misclassification assumption. A method is presented to test the validity of the non-differential misclassification assumption in case-control studies with 2 × 2 tables when validation data are …


Borrowing Information Across Populations In Estimating Positive And Negative Predictive Values, Ying Huang, Youyi Fong, John Wei, Ziding Feng Oct 2012

UW Biostatistics Working Paper Series

A marker's capacity to predict risk of a disease depends on disease prevalence in the target population and its classification accuracy, i.e. its ability to discriminate diseased subjects from non-diseased subjects. The latter is often considered an intrinsic property of the marker; it is independent of disease prevalence and hence more likely to be similar across populations than risk prediction measures. In this paper, we are interested in evaluating the population-specific performance of a risk prediction marker in terms of positive predictive value (PPV) and negative predictive value (NPV) at given thresholds, when samples are available from the target population …
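
The dependence of PPV and NPV on prevalence follows from Bayes' rule. The sketch below is not the estimator proposed in the paper. It is the textbook identity for a marker dichotomized at a single threshold, with a hypothetical function name, included to show why the same sensitivity and specificity give different predictive values in different populations.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    Inputs are probabilities in (0, 1); sensitivity and specificity are
    treated as fixed properties of the marker (an assumption).
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Same accuracy, two populations with different disease prevalence.
    for prev in (0.5, 0.1, 0.01):
        print(prev, predictive_values(0.9, 0.9, prev))
```

With sensitivity and specificity both at 0.9, PPV falls from 0.9 at 50% prevalence to about 0.5 at 10% prevalence, while NPV rises.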


Fisher’s Exact Test For Misclassified Data, Tze-San Lee May 2011

Journal of Modern Applied Statistical Methods

Ecole Supérieure de Commerce de Tunis. Fisher’s exact test is adapted to handle the misclassified data arising from comparing two binomial populations. The bias-adjusted odds ratio is proposed to account for misclassification errors. Its expected power depends in a nonlinear way on the true sensitivity and specificity of the classification method. The data taken from the no conviction rate of criminality for two types of twin populations was used to illustrate how to calculate true sensitivity and specificity and the expected power of the adjusted odds ratio.


Cost And Accuracy Comparisons In Medical Testing Using Sequential Testing Strategies, Anwar Ahmed May 2010

Theses and Dissertations

The practice of sequential testing is followed by the evaluation of accuracy, but often not by the evaluation of cost. This research described and compared three sequential testing strategies: believe the negative (BN), believe the positive (BP) and believe the extreme (BE), the latter being a less-examined strategy. All three strategies were used to combine results of two medical tests to diagnose a disease or medical condition. Descriptions of these strategies were provided in terms of accuracy (using the maximum receiver operating curve or MROC) and cost of testing (defined as the proportion of subjects who need 2 tests to …


On A Comparison Between Two Measures Of Spatial Association, Faisal G. Khamis, Abdul Aziz Jemain, Kamarulzaman Ibrahim May 2010

Journal of Modern Applied Statistical Methods

Two measures of spatial association between two variables have been used by many researchers. These are the Wartenberg (1985) and Lee (2001) measures. Based on simulation for lattice data, the sensitivity of both measures was studied and compared with different choices of spatial structures, spatial weights and sample sizes using bias and mean square error. Different scenarios are used in terms of assumed numbers and sample sizes. Moran’s I is used to examine the spatial autocorrelation of such a variable with itself. Both the Wartenberg and Lee measures are found to be sensitive; however, Wartenberg’s measure is found to be somewhat …


When Sensitivity Is A Function Of Age And Time Spent In The Preclinical State In Periodic Cancer Screening, Dongfeng Wu, Ricolindo L. Cariño, Xiaoqin Wu May 2008

Journal of Modern Applied Statistical Methods

Probability models are extended for periodic cancer screening trials to model sensitivity when it is changing with an individual’s age and time spent in the preclinical state. Wu et al. (2005) showed that sensitivity is monotone increasing with age, but intuitively, sensitivity is also a function of the time one has spent in the preclinical stage. This allows us to infer sensitivity at a late stage, just before symptoms manifest. We developed the probability model and applied Bayesian inference to the HIP study group data. The methodology we developed is also applicable to other kinds of chronic diseases.


Generalized Linear Mixed-Effects Models For The Analysis Of Odor Detection Data, Sandra Hall, Matthew S. Mayo, Xu-Feng Niu, James C. Walker Nov 2007

Journal of Modern Applied Statistical Methods

Olfactory detection has become a science of interest. Seven individuals’ odor detection abilities are explored and an attempt is made to characterize all subjects with one generalized linear mixed effects model. Two methods of fitting the models were used and simulations were conducted to discover which method yielded the best results.


Evaluating The ROC Performance Of Markers For Future Events, Margaret Pepe, Yingye Zheng, Yuying Jin May 2007

UW Biostatistics Working Paper Series

Receiver operating characteristic (ROC) curves play a central role in the evaluation of biomarkers and tests for disease diagnosis. Predictors for event time outcomes can also be evaluated with ROC curves, but the time lag between marker measurement and event time must be acknowledged. We discuss different definitions of time-dependent ROC curves in the context of real applications. Several approaches have been proposed for estimation. We contrast retrospective versus prospective methods in regards to assumptions and flexibility, including their capacities to incorporate censored data, competing risks and different sampling schemes. Applications to two datasets are presented.


Simulation Procedure In Periodic Cancer Screening Trials, Ioana Barnicescu, Ricolindo L. Cariño Nov 2005

Journal of Modern Applied Statistical Methods

A general simulation procedure is described to validate model fitting algorithms for complex likelihood functions that are utilized in periodic cancer screening trials. Although screening programs have existed for a few decades, there are still many unsolved problems, such as how age or hormone affects the screening sensitivity, the sojourn time in the preclinical state, and the transition probability from the disease-free state to the preclinical state. Simulations are needed to check reliability or validity of the likelihood function combined with the associated effect functions. One bottleneck in the simulation procedure is the very time-consuming calculations of the maximum likelihood …


A Linear Regression Framework For Receiver Operating Characteristic (ROC) Curve Analysis, Zheng Zhang, Margaret S. Pepe May 2005

UW Biostatistics Working Paper Series

In the field of medical diagnostic testing, the receiver operating characteristic (ROC) curve has long been used as a standard statistical tool to assess the accuracy of tests that yield continuous results. Although previous research in this area focused mostly on estimating the ROC curve, recently it has been recognized that the accuracy of a given test may fluctuate depending on certain factors, which motivates modelling covariate effects on the ROC curve. Comparing the corresponding ROC curves between two or more tests is a special case of covariate effect modelling. In this manuscript, we introduce a linear regression framework to model …


New Confidence Intervals For The Difference Between Two Sensitivities At A Fixed Level Of Specificity, Gengsheng Qin, Yu-Sheng Hsu, Xiao-Hua Zhou Mar 2005

UW Biostatistics Working Paper Series

For two continuous-scale diagnostic tests, it is of interest to compare their sensitivities at a predetermined level of specificity. In this paper we propose three new intervals for the difference between two sensitivities at a fixed level of specificity. These intervals are easy to compute. We also conduct simulation studies to compare the relative performance of the new intervals with the existing normal approximation based interval proposed by Wieand et al. (1989). Our simulation results show that the newly proposed intervals perform better than the existing normal approximation based interval in terms of coverage accuracy and interval length.
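
For readers unfamiliar with the quantity being compared, the following sketch computes a plain empirical point estimate of sensitivity at a fixed specificity. It is not one of the intervals proposed in the paper. It assumes that larger marker values indicate disease, and the function name is hypothetical.

```python
import math


def sensitivity_at_specificity(non_diseased, diseased, specificity):
    """Empirical sensitivity at the cut-off achieving a target specificity.

    Assumes larger values indicate disease. Returns (cutoff, sensitivity).
    """
    ref = sorted(non_diseased)
    n = len(ref)
    # Smallest observed cut-off with at least `specificity` of the
    # non-diseased values at or below it (small guard against float error).
    idx = max(math.ceil(specificity * n - 1e-9) - 1, 0)
    cutoff = ref[idx]
    # A diseased subject tests positive when the value exceeds the cut-off.
    sens = sum(y > cutoff for y in diseased) / len(diseased)
    return cutoff, sens


if __name__ == "__main__":
    controls = range(1, 11)        # illustrative values, not real data
    cases = [5, 9, 11, 12]         # illustrative values, not real data
    print(sensitivity_at_specificity(controls, cases, 0.8))
```

Comparing two tests then amounts to taking the difference of two such estimates at the same specificity, which is the quantity the intervals target.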


Standardizing Markers To Evaluate And Compare Their Performances, Margaret S. Pepe, Gary M. Longton Jan 2005

UW Biostatistics Working Paper Series

Introduction: Markers that purport to distinguish subjects with a condition from those without a condition must be evaluated rigorously for their classification accuracy. A single approach to statistically evaluating and comparing markers is not yet established.

Methods: We suggest a standardization that uses the marker distribution in unaffected subjects as a reference. For an affected subject with marker value Y, the standardized placement value is the proportion of unaffected subjects with marker values that exceed Y.

Results: We apply the standardization to two illustrative datasets. In patients with pancreatic cancer, placement values calculated for the CA 19-9 marker are smaller …
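
The placement value defined in the Methods paragraph can be computed directly from two samples. The sketch below is a minimal illustration of that definition using an empirical reference distribution. The function name and the toy values are hypothetical and are not taken from the paper's datasets.

```python
from bisect import bisect_right


def placement_values(unaffected, affected):
    """Proportion of unaffected marker values exceeding each affected value."""
    ref = sorted(unaffected)
    n = len(ref)
    # bisect_right gives the count of reference values <= y,
    # so n minus that count is the number strictly greater than y.
    return [(n - bisect_right(ref, y)) / n for y in affected]


if __name__ == "__main__":
    unaffected = [1, 2, 3, 4]       # illustrative values only
    affected = [2.5, 4, 0]          # illustrative values only
    print(placement_values(unaffected, affected))
```

Small placement values mean an affected subject's marker lies high in the unaffected distribution, which indicates good discrimination for a marker that rises with disease.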


Binary Isotonic Regression Procedures, With Application To Cancer Biomarkers, Debashis Ghosh, Moulinath Banerjee, Pinaki Biswas May 2004

The University of Michigan Department of Biostatistics Working Paper Series

There is a lot of interest in the development and characterization of new biomarkers for screening large populations for disease. In much of the literature on diagnostic testing, increased levels of a biomarker correlate with increased disease risk. However, parametric forms are typically used to associate these quantities. In this article, we specify a monotonic relationship between biomarker levels with disease risk. This leads to consideration of a nonparametric regression model for a single biomarker. Estimation results using isotonic regression-type estimators and asymptotic results are given. We also discuss confidence set estimation in this setting and propose three procedures for …


Survival Model Predictive Accuracy And ROC Curves, Patrick Heagerty, Yingye Zheng Dec 2003

UW Biostatistics Working Paper Series

The predictive accuracy of a survival model can be summarized using extensions of the proportion of variation explained by the model, or R^2, commonly used for continuous response models, or using extensions of sensitivity and specificity which are commonly used for binary response models.

In this manuscript we propose new time-dependent accuracy summaries based on time-specific versions of sensitivity and specificity calculated over risk sets. We connect the accuracy summaries to a previously proposed global concordance measure which is a variant of Kendall's tau. In addition, we show how standard Cox regression output can be used to obtain estimates of …


Improved Confidence Intervals For The Sensitivity At A Fixed Level Of Specificity Of A Continuous-Scale Diagnostic Test, Xiao-Hua Zhou, Gengsheng Qin May 2003

UW Biostatistics Working Paper Series

For a continuous-scale test, it is of interest to construct a confidence interval for the sensitivity of the diagnostic test at the cut-off that yields a predetermined level of its specificity (e.g., 80%, 90%, or 95%). In this paper we proposed two new intervals for the sensitivity of a continuous-scale diagnostic test at a fixed level of specificity. We then conducted simulation studies to compare the relative performance of these two intervals with the best existing BCa bootstrap interval, proposed by Platt et al. (2000). Our simulation results showed that the newly proposed intervals are better than the BCa bootstrap …


Estimating The Accuracy Of Polymerase Chain Reaction-Based Tests Using Endpoint Dilution, Jim Hughes, Patricia Totten Mar 2003

UW Biostatistics Working Paper Series

PCR-based tests for various microorganisms or target DNA sequences are generally acknowledged to be highly "sensitive" yet the concept of sensitivity is ill-defined in the literature on these tests. We propose that sensitivity should be expressed as a function of the number of target DNA molecules in the sample (or specificity when the target number is 0). However, estimating this "sensitivity curve" is problematic since it is difficult to construct samples with a fixed number of targets. Nonetheless, using serially diluted replicate aliquots of a known concentration of the target DNA sequence, we show that it is possible to disentangle …


Semiparametric Receiver Operating Characteristic Analysis To Evaluate Biomarkers For Disease, Tianxi Cai, Margaret S. Pepe Jan 2003

UW Biostatistics Working Paper Series

The receiver operating characteristic (ROC) curve is a popular method for characterizing the accuracy of diagnostic tests when test results are not binary. Various methodologies for estimating and comparing ROC curves have been developed. One approach, due to Pepe, uses a parametric regression model with the baseline function specified up to a finite-dimensional parameter. In this article we extend the regression models by allowing arbitrary nonparametric baseline functions. We also provide asymptotic distribution theory and procedures for making statistical inference. We illustrate our approach with a dataset from a prostate cancer biomarker study. Simulation studies suggest that the extra flexibility inherent …


The Analysis Of Placement Values For Evaluating Discriminatory Measures, Margaret S. Pepe, Tianxi Cai Sep 2002

UW Biostatistics Working Paper Series

The idea of using measurements such as biomarkers, clinical data, or molecular biology assays for classification and prediction is popular in modern medicine. The scientific evaluation of such measures includes assessing the accuracy with which they predict the outcome of interest. Receiver operating characteristic curves are commonly used for evaluating the accuracy of diagnostic tests. They can be applied more broadly, indeed to any problem involving classification to two states or populations (D = 0 or D = 1). We show that the ROC curve can be interpreted as a cumulative distribution function for the discriminatory measure Y in the …


Assessing The Accuracy Of A New Diagnostic Test When A Gold Standard Does Not Exist, Todd A. Alonzo, Margaret S. Pepe Oct 1998

UW Biostatistics Working Paper Series

Often the accuracy of a new diagnostic test must be assessed when a perfect gold standard does not exist. Use of an imperfect test biases the accuracy estimates of the new test. This paper reviews existing approaches to this problem including discrepant resolution and latent class analysis. Deficiencies with these approaches are identified. A new approach is proposed that combines the results of several imperfect reference tests to define a better reference standard. We call this the composite reference standard (CRS). Using the CRS, accuracy can be assessed using multistage sampling designs. Maximum likelihood estimates of accuracy and expressions for …