Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

11,685 Full-Text Articles 17,192 Authors 3,366,482 Downloads 237 Institutions

All Articles in Statistics and Probability

11,685 full-text articles. Page 343 of 351.

Informatics And Statistics For Analyzing 2-D Gel Electrophoresis Images, Andrew W. Dowsey, Jeffrey S. Morris, Howard G. Gutstein, Guang Z. Yang 2010 Imperial College London

Jeffrey S. Morris

Whilst recent progress in ‘shotgun’ peptide separation by integrated liquid chromatography and mass spectrometry (LC/MS) has enabled its use as a sensitive analytical technique, proteome coverage and reproducibility are still limited and obtaining enough replicate runs for biomarker discovery is a challenge. For these reasons, recent research demonstrates the continuing need for protein separation by two-dimensional gel electrophoresis (2-DE). However, with traditional 2-DE informatics, the digitized images are reduced to symbolic data through spot detection and quantification before proteins are compared for differential expression by spot matching. Recently, a more robust and automated paradigm has emerged where gels are ...


Bayesian Random Segmentation Models To Identify Shared Copy Number Aberrations For Array CGH Data, Veerabhadran Baladandayuthapani, Yuan Ji, Rajesh Talluri, Luis E. Nieto-Barajas, Jeffrey S. Morris 2010 Texas A&M University

Jeffrey S. Morris

Array-based comparative genomic hybridization (aCGH) is a high-resolution high-throughput technique for studying the genetic basis of cancer. The resulting data consists of log fluorescence ratios as a function of the genomic DNA location and provides a cytogenetic representation of the relative DNA copy number variation. Analysis of such data typically involves estimation of the underlying copy number state at each location and segmenting regions of DNA with similar copy number states. Most current methods proceed by modeling a single sample/array at a time, and thus fail to borrow strength across multiple samples to infer shared regions of copy number ...


Code For Fitting Bdsacgh, Veera Baladandayuthapani 2010 UT MD Anderson Cancer Center

Veera Baladandayuthapani

No abstract provided.


R Package For Bayesian Ensemble Methods For Survival Prediction In Gene Expression Data, Veera Baladandayuthapani 2010 UT MD Anderson Cancer Center

Veera Baladandayuthapani

This is the R package for the methods described in "Bayesian ensemble methods for survival prediction in gene expression data" by Vinicius Bonato, Veerabhadran Baladandayuthapani, Kim-Anh Do, Bradley M. Broom, Erik P. Sulman, and Kenneth D. Aldape, submitted to Bioinformatics (2010).


Bayesian Random Segmentation Models To Identify Shared Copy Number Aberrations For Array CGH Data, Veera Baladandayuthapani 2010 UT MD Anderson Cancer Center

Veera Baladandayuthapani

No abstract provided.


Participation And Engagement In Sport: A Double Hurdle Approach For The United Kingdom, Babatunde Buraimo, Brad Humphreys, Rob Simmons 2010 University of Central Lancashire

Dr Babatunde Buraimo

This paper uses pooled cross-section data from four waves of the United Kingdom’s Taking Part Survey, 2005 to 2009, in order to investigate determinants of probability of participation and levels of engagement in sports. The two rival modelling approaches considered here are the double-hurdle approach and the Heckman sample selection model. The Heckman model proves to be deficient in several key respects. The double-hurdle approach offers more reliable estimates than the Heckman sample selection model, at least for this particular survey. The distinction is more than just statistical nuance as there are substantive differences in qualitative results from the ...


Estimating Smooth Distribution Function In The Presence Of Heteroscedastic Measurement Errors, Xiao-Feng Wang, Zhaozhi Fan, Bin Wang 2010 Cleveland Clinic Lerner Research Institute

Xiaofeng Wang

Measurement error occurs in many biomedical fields. The challenges arise when errors are heteroscedastic since we literally have only one observation for each error distribution. This paper concerns the estimation of smooth distribution function when data are contaminated with heteroscedastic errors. We study two types of methods to recover the unknown distribution function: a Fourier-type deconvolution method and a simulation extrapolation (SIMEX) method. The asymptotics of the two estimators are explored and the asymptotic pointwise confidence bands of the SIMEX estimator are obtained. The finite sample performances of the two estimators are evaluated through a simulation study. Finally, we illustrate ...


Simulating Multivariate G-And-H Distributions, Rhonda K. Kowalchuk, Todd C. Headrick 2010 Southern Illinois University Carbondale

Todd Christopher Headrick

The Tukey family of g-and-h distributions is often used to model univariate real-world data. There is a paucity of research demonstrating appropriate multivariate data generation using the g-and-h family of distributions with specified correlations. Therefore, the methodology and algorithms are presented to extend the g-and-h family from univariate to multivariate data generation. An example is provided along with a Monte Carlo simulation demonstrating the methodology. In addition, algorithms written in Mathematica 7.0 are available from the authors for implementing the procedure.


Statistical Simulation: Power Method Polynomials And Other Transformations, Todd C. Headrick 2010 Southern Illinois University Carbondale

Todd Christopher Headrick

Although power method polynomials based on the standard normal distributions have been used in many different contexts for the past 30 years, it was not until recently that the probability density function (pdf) and cumulative distribution function (cdf) were derived and made available. Focusing on both univariate and multivariate nonnormal data generation, Statistical Simulation: Power Method Polynomials and Other Transformations presents techniques for conducting a Monte Carlo simulation study. It shows how to use power method polynomials for simulating univariate and multivariate nonnormal distributions with specified cumulants and correlation matrices. The book first explores the methodology underlying the power method ...


Identification Of Ovarian Cancer Symptoms In Health Insurance Claims Data., Paula Diehr, Sean Devlin 2010 University of Washington

Paula Diehr

Background: Women with ovarian cancer have reported abdominal/pelvic pain, bloating, difficulty eating or feeling full quickly, and urinary frequency/urgency prior to diagnosis. We explored these findings in a general population using a dataset of insured women aged 40–64 and investigated the potential effectiveness of a routine review of claims data as a prescreen to identify women at high risk for ovarian cancer. Methods: Data from a large Washington State health insurer were merged with the Seattle-Puget Sound Surveillance, Epidemiology and End Results (SEER) cancer registry for 2000–2004. We estimated the prevalence of symptoms in the 36 ...


Targeted Maximum Likelihood Estimation Of The Parameter Of A Marginal Structural Model, Michael Rosenblum, Mark J. van der Laan 2010 Johns Hopkins University

Michael Rosenblum

Targeted maximum likelihood estimation is a versatile tool for estimating parameters in semiparametric and nonparametric models. We work through an example applying targeted maximum likelihood methodology to estimate the parameter of a marginal structural model. In the case we consider, we show how this can be easily done by clever use of standard statistical software. We point out differences between targeted maximum likelihood estimation and other approaches (including estimating function based methods). The application we consider is to estimate the effect of adherence to antiretroviral medications on virologic failure in HIV positive individuals.


Application Of Causal Inference Methods To Improve The Treatment Of HIV In Resource-Limited Settings., Maya Petersen 2010 University of California, Berkeley

Maya Petersen

No abstract provided.


Sphet: Spatial Models With Heteroskedastic Innovations In R, Gianfranco Piras 2010 Regional Research Institute, West Virginia University

Gianfranco Piras

No abstract provided.


Discrete Nonparametric Algorithms For Outlier Detection With Genomic Data, Debashis Ghosh 2010 Penn State University

Debashis Ghosh

In high-throughput studies involving genetic data such as from gene expression microarrays, differential expression analysis between two or more experimental conditions has been a very common analytical task. Much of the resulting literature on multiple comparisons has paid relatively little attention to the choice of test statistic. In this article, we focus on the issue of choice of test statistic based on a special pattern of differential expression. The approach here is based on recasting multiple comparisons procedures for assessing outlying expression values. A major complication is that the resulting p-values are discrete; some theoretical properties of sequential testing ...


Detecting Outlier Genes From High-Dimensional Data: A Fuzzy Approach, Debashis Ghosh 2010 Penn State University

Debashis Ghosh

A recent finding in cancer research has been the characterization of previously undiscovered chromosomal abnormalities in several types of solid tumors. This was found based on analyses of high-throughput data from gene expression microarrays and motivated the development of so-called ‘outlier’ tests for differential expression. One statistical issue was the potential discreteness of the test statistics. Using ideas from fuzzy set theory, we develop fuzzy outlier detection algorithms that have links to ideas in multiple comparisons. Two- and K-sample extensions are considered. The methodology is illustrated by application to two microarray studies.


Links Between Analysis Of Surrogate Endpoints And Endogeneity, Debashis Ghosh, Jeremy M. Taylor, Michael R. Elliott 2010 Penn State University

Debashis Ghosh

There has been substantive interest in the assessment of surrogate endpoints in medical research. These are measures which could potentially replace “true” endpoints in clinical trials and lead to studies that require less follow-up. Recent research in the area has focused on assessments using causal inference frameworks. Beginning with a simple model for associating the surrogate and true endpoints in the population, we approach the problem as one of endogenous covariates. An instrumental variables estimator and a general two-stage algorithm are proposed. Existing surrogacy frameworks are then evaluated in the context of the model. A numerical example is used to illustrate ...


Meta-Analysis For Surrogacy: Accelerated Failure Time Models And Semicompeting Risks Modelling, Debashis Ghosh, Jeremy M. Taylor, Daniel J. Sargent 2010 Penn State University

Debashis Ghosh

There has been great recent interest in the medical and statistical literature in the assessment and validation of surrogate endpoints as proxies for clinical endpoints in medical studies. More recently, authors have focused on using meta-analytical methods for quantification of surrogacy. In this article, we extend existing procedures for analysis based on the accelerated failure time model to this setting. An advantage of this approach relative to proportional hazards model is that it allows for analysis in the semi-competing risks setting, where we constrain the surrogate endpoint to occur before the true endpoint. A novel principal components procedure is ...


Spline-Based Models For Predictiveness Curves, Debashis Ghosh, Michael Sabel 2010 Penn State University

Debashis Ghosh

A biomarker is defined to be a biological characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention. The use of biomarkers in cancer has been advocated for a variety of purposes, which include use as surrogate endpoints, early detection of disease, proxies for environmental exposure and risk prediction. We deal with the latter issue in this paper. Several authors have proposed use of the predictiveness curve for assessing the capacity of a biomarker for risk prediction. For most situations, it is reasonable to assume monotonicity of ...


Combining Multiple Models With Survival Data: The Phase Algorithm, Debashis Ghosh, Zheng Yuan 2010 Penn State University

Debashis Ghosh

In many scientific studies, one common goal is to develop good prediction rules based on a set of available measurements. This paper proposes a model averaging methodology using proportional hazards regression models to construct new estimators of predicted survival probabilities. A screening step based on an adaptive searching algorithm is used to handle large numbers of covariates. The finite-sample properties of the proposed methodology are assessed using simulation studies. Application of the method to a cancer biomarker study is also given.


Author Guidelines For Reporting Scale Development And Validation Results In The Journal Of The Society For Social Work And Research, Peter Cabrera-Nguyen 2010 Washington University in St. Louis

Elián P. Cabrera-Nguyen

In this invited article, Cabrera-Nguyen provides guidelines for reporting scale development and validation results. Authors' attention to these guidelines will help ensure the research reported in JSSWR is rigorous and of high quality. This article provides guidance for those using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). In addition, the article provides helpful links to resources addressing structural equation modeling, multiple imputation for missing data, and a general resource for quantitative data analysis.

