Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

12,628 Full-Text Articles 19,901 Authors 6,908,637 Downloads 286 Institutions

All Articles in Statistics and Probability

Assessing The Effect Of Wal-Mart In Rural Utah Areas, Angela Nelson 2011 Brigham Young University - Provo

Theses and Dissertations

Walmart and other “big box” stores seek to expand in rural markets, possibly due to cheap land and lack of zoning laws. In August 2000, Walmart opened a store in Ephraim, a small rural town in central Utah. It is of interest to understand how Walmart's entrance into the local market changes the sales tax revenue base for Ephraim and for the surrounding municipalities. It is thought that small “Mom and Pop” stores go out of business because they cannot compete with Walmart's prices, leading to a decrease in variety, selection, convenience, and most importantly, sales tax revenue base in …


An Introduction To Bayesian Methodology Via WinBUGS And PROC MCMC, Heidi Lula Lindsey 2011 Brigham Young University - Provo

Theses and Dissertations

Bayesian statistical methods have long been computationally out of reach because the analysis often requires integration of high-dimensional functions. Recent advancements in computational tools to apply Markov Chain Monte Carlo (MCMC) methods are making Bayesian data analysis accessible for all statisticians. Two such computer tools are WinBUGS and SAS® 9.2's PROC MCMC. Bayesian methodology will be introduced through discussion of fourteen statistical examples with code and computer output to demonstrate the power of these computational tools in a wide variety of settings.


Using R To Create Synthetic Discrete Response Regression Models, Joseph Hilbe 2011 Arizona State University

Joseph M Hilbe

The creation of synthetic models allows a researcher to better understand models as well as the bias that can occur when the assumptions upon which a model is based are violated. This article provides R code that can be used or amended to create a variety of discrete response regression models.


Gene Set Analysis For Longitudinal Gene Expression Data, Ke Zhang, Haiyan Wang, Arne C. Bathke, Solomon W. Harrar, Hans-Peter Piepho, Youping Deng 2011 University of North Dakota

Statistics Faculty Publications

BACKGROUND: Gene set analysis (GSA) has become a successful tool to interpret gene expression profiles in terms of biological functions, molecular pathways, or genomic locations. GSA performs statistical tests for independent microarray samples at the level of gene sets rather than individual genes. Nowadays, an increasing number of microarray studies are conducted to explore the dynamic changes of gene expression in a variety of species and biological scenarios. In these longitudinal studies, gene expression is repeatedly measured over time such that a GSA needs to take into account the within-gene correlations in addition to possible between-gene correlations.

RESULTS: We provide …


How To Combine Independent Data Sets For The Same Quantity, Theodore P. Hill, Jack Miller 2011 Georgia Institute of Technology - Main Campus

Research Scholars in Residence

This paper describes a new mathematical method called conflation for consolidating data from independent experiments that measure the same physical quantity. Conflation is easy to calculate and visualize and minimizes the maximum loss in Shannon information in consolidating several independent distributions into a single distribution. A formal mathematical treatment of conflation has recently been published. For the benefit of experimenters wishing to use this technique, in this paper we derive the principal basic properties of conflation in the special case of normally distributed (Gaussian) data. Examples of applications to measurements of the fundamental physical constants and in high energy physics …
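For the Gaussian special case mentioned in the abstract, a minimal sketch follows. It assumes conflation is the normalized product of the input densities, a definition this truncated abstract does not spell out. Under that assumption the product of normal densities is again normal, with an inverse-variance-weighted mean. The function name `conflate_gaussians` and the sample values are hypothetical, chosen only for illustration.

```python
from math import sqrt


def conflate_gaussians(measurements):
    """Combine independent Gaussian measurements of the same quantity.

    measurements: iterable of (mean, std) pairs, each std > 0.
    Returns (mean, std) of the normalized product of the normal densities
    (assumed here to be the definition of conflation).
    """
    measurements = list(measurements)
    if not measurements:
        raise ValueError("need at least one measurement")
    if any(std <= 0 for _, std in measurements):
        raise ValueError("standard deviations must be positive")

    # Each input is weighted by its precision (inverse variance).
    weights = [1.0 / (std * std) for _, std in measurements]
    total = sum(weights)

    # Precision-weighted mean; the combined variance is 1 / (sum of precisions).
    mean = sum(w * m for w, (m, _) in zip(weights, measurements)) / total
    return mean, sqrt(1.0 / total)


# Two hypothetical experiments with equal uncertainty.
combined_mean, combined_std = conflate_gaussians([(10.0, 1.0), (12.0, 1.0)])
```

With equal uncertainties the combined mean is the plain average, and the combined standard deviation is smaller than either input, reflecting the extra information from the second experiment.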


HELIN Institutions' Collection Statistics From FY 10 To FY 11, Martha Rice Sanders 2011 HELIN Consortium

HELIN Collection Statistics

Statistical information about the total number of item and holdings (serials) records held by each HELIN member institution as of June 30, 2010, and June 30, 2011. Gives the percentage of growth for each institution. Additionally, a chart and statistics for the number of item records held by each HELIN member institution as of June 30, 2011. A chart of e-book collection totals and the libraries to which they belong. Finally, a chart of serials holdings for both paper (plus microform, etc.) and electronic journals, including the CRIARL libraries.


Comparing Hall Of Fame Baseball Players Using Most Valuable Player Ranks, Paul Kvam 2011 University of Richmond

Department of Math & Statistics Faculty Publications

We propose a rank-based statistical procedure for comparing performances of top major league baseball players who performed in different eras. The model is based on using the player ranks from voting results for the most valuable player awards in the American and National Leagues. The current voting procedure has remained the same since 1932, so the analysis regards only data for players whose career blossomed after that time. Because the analysis is based on quantiles, its basis is nonparametric and relies on a simple link function. Results are stratified by fielding position, and we compare 73 Hall of Fame players …


A Study On Facility Planning Using Discrete Event Simulation: Case Study Of A Grain Delivery Terminal., Sarah M. Asio 2011 University of Nebraska-Lincoln

Department of Industrial and Management Systems Engineering: Dissertations, Theses, and Student Research

The application of traditional approaches to the design of efficient facilities can be tedious and time consuming when uncertainty and a number of constraints exist. Queuing models and mathematical programming techniques are not able to capture the complex interaction between resources, the environment and space constraints for dynamic stochastic processes. In the following study discrete event simulation is applied to the facility planning process for a grain delivery terminal. The discrete event simulation approach has been applied to studies such as capacity planning and facility layout for a gasoline station and evaluating the resource requirements for a manufacturing facility. To …


A Bayesian Model Averaging Approach For Observational Gene Expression Studies, Xi Kathy Zhou, Fei Liu, Andrew J. Dannenberg 2011 Division of Biostatistics and Epidemiology, Department of Public Health, Weill Cornell Medical College

COBRA Preprint Series

Identifying differentially expressed (DE) genes associated with a sample characteristic is the primary objective of many microarray studies. As more and more studies are carried out with observational rather than well controlled experimental samples, it becomes important to evaluate and properly control the impact of sample heterogeneity on DE gene finding. Typical methods for identifying DE genes require ranking all the genes according to a pre-selected statistic based on a single model for two or more group comparisons, with or without adjustment for other covariates. Such single model approaches unavoidably result in model misspecification, which can lead to increased error …


When Does Combining Markers Improve Classification Performance And What Are Implications For Practice?, Aasthaa Bansal, Margaret Sullivan Pepe 2011 University of Washington

UW Biostatistics Working Paper Series

When an existing standard marker does not have sufficient classification accuracy on its own, new markers are sought with the goal of yielding a combination with better performance. The primary criterion for selecting new markers is that they have good performance on their own and preferably be uncorrelated with the standard. Most often linear combinations are considered. In this paper we investigate the increment in performance that is possible by combining a novel continuous marker with a moderately performing standard continuous marker under a variety of biologically motivated models for their joint distribution. We find that an uncorrelated continuous marker …


Hierarchical Probit Models For Ordinal Ratings Data, Allison M. Butler 2011 Brigham Young University - Provo

Theses and Dissertations

University students often complete evaluations of their courses and instructors. The evaluation tool typically contains questions about the course and the instructor on an ordinal Likert scale. We assess instructor effectiveness while adjusting for known confounders. We present a probit regression model with a latent variable to measure the instructor effectiveness accounting for student specific covariates, such as student grade in the course, high school and university GPA, and ACT score.


Targeted Maximum Likelihood Estimation Of Conditional Relative Risk In A Semi-Parametric Regression Model, Cathy Tuglus, Kristin E. Porter, Mark J. van der Laan 2011 University of California, Berkeley

U.C. Berkeley Division of Biostatistics Working Paper Series

The conditional relative risk is an important measure in medical and epidemiological studies when the outcome of interest is binary (i.e. disease vs. no disease). When the outcome is common, estimation of conditional relative risk and related parameters can be problematic, especially when the exposure or covariates are continuous. We propose a new estimation procedure based on targeted maximum likelihood methodology that targets the parameters relating to the conditional relative risk for common outcomes under a log-linear, or multiplicative, semi-parametric model. In this paper, we present three possible targeted maximum likelihood estimators for relative risk parameters implied by such a …


Probabilistic Assessment Of Drought Characteristics Using A Hidden Markov Model, Ganeshchandra Mallya, Shivam Tripathi, Sergey Kirshner, Rao S. Govindaraju 2011 Purdue University

2011 Symposium on Data-Driven Approaches to Droughts

Droughts are evaluated using drought indices that measure the departure of meteorological and hydrological variables such as precipitation and stream flow from their long-term averages. While there are many drought indices proposed in the literature, most of them use pre-defined thresholds for identifying drought classes ignoring the inherent uncertainties in characterizing droughts. In this study, a hidden Markov model (HMM) [1] is developed for probabilistic classification of drought states. The HMM captures space and time dependence in the data. The proposed model is applied to assess drought characteristics in Indiana using monthly precipitation and stream flow data. The comparison of …


Super Learner Based Conditional Density Estimation With Application To Marginal Structural Models, Ivan Diaz Munoz, Mark J. van der Laan 2011 University of California, Berkeley, School of Public Health - Division of Biostatistics

U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper we present a histogram-like estimator of a conditional density that uses super learner cross-validation to estimate the histogram probabilities, as well as the optimal number and position of the bins. This estimator is an alternative to kernel density estimators when the dimension of the problem is large. We demonstrate its applicability to estimation of Marginal Structural Model (MSM) parameters in which an initial estimator of the treatment mechanism is needed. MSM estimation based on the proposed density estimator results in less biased estimates, when compared to estimates based on a misspecified parametric model.


Comparing ROC Curves Derived From Regression Models, Venkatraman E. Seshan, Mithat Gonen, Colin B. Begg 2011 Memorial Sloan-Kettering Cancer Center

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

In constructing predictive models, investigators frequently assess the incremental value of a predictive marker by comparing the ROC curve generated from the predictive model including the new marker with the ROC curve from the model excluding the new marker. Many commentators have noticed empirically that a test of the two ROC areas often produces a non-significant result when a corresponding Wald test from the underlying regression model is significant. A recent article showed using simulations that the widely-used ROC area test [1] produces exceptionally conservative test size and extremely low power [2]. In this article we show why the ROC …


U.S. Cultural Involvement And Its Association With Co-Occurring Substance Abuse And Sexual Risk Behaviors Among Youth In The Dominican Republic, Elián P. Cabrera-Nguyen, Juan B. Peña 2011 Washington University in St. Louis

Elián P. Cabrera-Nguyen

We examined the relationship of US cultural involvement with substance abuse and sexual risk behavior profiles from our nationally representative sample of public high school students in the Dominican Republic. Using a novel methodological approach to control for selection bias, we examined explanations for the so-called Latino or Hispanic immigrant paradox. A latent class regression analysis with manifest and latent covariates found that US cultural involvement indicators were independent and robust predictors of increased risk of co-occurring substance abuse and sexual risk behaviors. Implications for prevention efforts targeting risk behaviors among Latino/a adolescents in the US and abroad are considered.


On Causal Mediation Analysis With A Survival Outcome, Eric J. Tchetgen Tchetgen 2011 Harvard University

Harvard University Biostatistics Working Paper Series

Suppose that, having established a marginal total effect of a point exposure on a time-to-event outcome, an investigator wishes to decompose this effect into its direct and indirect pathways, also known as natural direct and indirect effects, mediated by a variable known to occur after the exposure and prior to the outcome. This paper proposes a theory of estimation of natural direct and indirect effects in two important semiparametric models for a failure time outcome. The underlying survival model for the marginal total effect and thus for the direct and indirect effects, can either be a marginal structural Cox proportional …


Semiparametric Estimation Of Models For Natural Direct And Indirect Effects, Eric J. Tchetgen Tchetgen, Ilya Shpitser 2011 Harvard University

Harvard University Biostatistics Working Paper Series

In recent years, researchers in the health and social sciences have become increasingly interested in mediation analysis. Specifically, upon establishing a non-null total effect of an exposure, investigators routinely wish to make inferences about the direct (indirect) pathway of the effect of the exposure not through (through) a mediator variable that occurs subsequently to the exposure and prior to the outcome. Natural direct and indirect effects are of particular interest as they generally combine to produce the total effect of the exposure and therefore provide insight on the mechanism by which it operates to produce the outcome. A semiparametric theory …


Semiparametric Theory For Causal Mediation Analysis: Efficiency Bounds, Multiple Robustness, And Sensitivity Analysis, Eric J. Tchetgen Tchetgen, Ilya Shpitser 2011 Harvard University

Harvard University Biostatistics Working Paper Series

Whilst estimation of the marginal (total) causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years, investigators have also become increasingly interested in mediation analysis. Specifically, upon establishing a non-null total effect of the exposure, investigators routinely wish to make inferences about the direct (indirect) pathway of the effect of the exposure not through (through) a mediator variable that occurs subsequently to the exposure and prior to the outcome. Although powerful semiparametric methodologies have been developed to analyze observational studies, that …


Component Extraction Of Complex Biomedical Signal And Performance Analysis Based On Different Algorithm, Hemant Pasusangai Kasturiwale 2011 University of Mumbai, India

Johns Hopkins University, Dept. of Biostatistics Working Papers

Biomedical signals can arise from one or many sources, including the heart, brain, and endocrine systems. Multiple sources pose a challenge to researchers, because the signals may be contaminated with artifacts and noise. Biomedical time series signals include the electroencephalogram (EEG), the electrocardiogram (ECG), etc. The morphology of the cardiac signal is very important in most diagnostics based on the ECG. A diagnosis based on visual observation of the recorded ECG, EEG, etc. may not be accurate. To achieve better understanding, PCA (Principal Component Analysis) and ICA algorithms help in analyzing ECG signals. The immense scope in the field of biomedical-signal processing Independent Component Analysis( …

