Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

2010

Articles 1 - 22 of 22

Full-Text Articles in Statistical Models

A Bayesian Shared Component Model For Genetic Association Studies, Juan J. Abellan, Carlos Abellan, Juan R. Gonzalez Nov 2010

COBRA Preprint Series

We present a novel approach to genome association studies between single nucleotide polymorphisms (SNPs) and disease. We propose a Bayesian shared component model to separate the genotype information that is common to cases and controls from that which is specific to cases only. This allows detection of the SNPs that show the strongest association with the disease. The model can be applied to case-control studies with more than one disease. We illustrate the use of this model with a dataset of 23,418 SNPs from a case-control study by The Wellcome Trust Case Control Consortium (2007) …


Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel Nov 2010

COBRA Preprint Series

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.

We …


Geographic Factors Of Residential Burglaries - A Case Study In Nashville, Tennessee, Jonathan A. Hall Nov 2010

Masters Theses & Specialist Projects

This study examines geographic patterns and geographic factors of residential burglary in the Nashville, TN area over a twenty-year period at five-year intervals starting in 1988. The purpose of this study is to identify which geographic factors have affected residential burglary rates, and whether the geographic patterns of residential burglary changed over the study period. Several criminological theories guide this study, the most prominent being Social Disorganization Theory and Routine Activities Theory. Both of these theories focus on the relationships of place and crime. A number of spatial analysis methods are hence adopted …


The Statistical Properties Of The Survivor Interaction Contrast, Joseph W. Houpt, James T. Townsend Oct 2010

Psychology Faculty Publications

The Survivor Interaction Contrast (SIC) is a powerful tool for assessing the architecture and stopping rule of a model of mental processes. Despite its demonstrated utility, the methodology has lacked a method for statistical testing until now. In this paper we briefly describe the SIC then develop some basic statistical properties of the measure. These developments lead to a statistical test for rejecting certain classes of models based on the SIC. We verify these tests using simulated data, then demonstrate their use on data from a simple cognitive task.


Stratifying Subjects For Treatment Selection With Censored Event Time Data From A Comparative Study, Lihui Zhao, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, L. J. Wei Sep 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Mixed Effect Poisson Log-Linear Models For Clinical And Epidemiological Sleep Hypnogram Data, Bruce J. Swihart, Brian S. Caffo PhD, Ciprian Crainiceanu PhD, Naresh M. Punjabi PhD, MD Aug 2010

Johns Hopkins University, Dept. of Biostatistics Working Papers

Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of individuals and repeated measures within those individuals, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation …
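
The likelihood equivalence mentioned in the abstract above can be sketched for the simplest case of one constant hazard per subject. The notation ($t_i$, $\delta_i$, $x_i$, $\beta$) is assumed here for illustration and is not taken from the working paper; the piecewise constant case would apply the same argument within each time segment.

```latex
% Assumed notation: t_i = follow-up time, \delta_i \in \{0,1\} = event indicator,
% \lambda_i = \exp(x_i^\top \beta) = constant hazard for subject i.
\begin{align*}
\ell_{\mathrm{exp}}(\beta)
  &= \sum_i \bigl[\delta_i \log \lambda_i - \lambda_i t_i\bigr], \\
\ell_{\mathrm{Pois}}(\beta)
  &= \sum_i \bigl[\delta_i \log(\lambda_i t_i) - \lambda_i t_i - \log \delta_i!\bigr]
   = \ell_{\mathrm{exp}}(\beta) + \sum_i \delta_i \log t_i .
\end{align*}
```

The second line treats $\delta_i$ as Poisson with mean $\mu_i = \lambda_i t_i$, that is, $\log \mu_i = x_i^\top \beta + \log t_i$, where $\log t_i$ is the offset; $\log \delta_i! = 0$ because $\delta_i$ is 0 or 1. The extra term $\sum_i \delta_i \log t_i$ does not involve $\beta$, so both likelihoods lead to the same estimates of $\beta$.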


Estimating Teacher Effects Using Value-Added Models, Jennifer L. Green Aug 2010

Department of Statistics: Dissertations, Theses, and Student Work

Value-added modeling is an alternative approach to test-based accountability systems based on the proportions of students scoring at or above pre-determined proficiency levels. Value-added modeling techniques provide opportunities to estimate an individual teacher’s effect on student learning, while allowing control for the effect of non-educational factors beyond a school system’s control, such as socioeconomic status. However, numerous considerations arise when using value-added models to estimate teacher effects and when defining what the teacher effects really describe. Chapter 2 provides an introduction to value-added methodology by describing several value-added models available for estimating teacher effects and their respective …


Principled Sure Independence Screening For Cox Models With Ultra-High-Dimensional Covariates, Sihai Dave Zhao, Yi Li Jul 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Unified Approach To Modeling Multivariate Binary Data Using Copulas Over Partitions, Bruce J. Swihart, Brian Caffo, Ciprian Crainiceanu Jul 2010

Johns Hopkins University, Dept. of Biostatistics Working Papers

Many seemingly disparate approaches for marginal modeling have been developed in recent years. We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to those of the copula-based models proposed herein. These general copula models of underlying latent threshold random variables yield likelihood-based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data. Moreover, we propose a nomenclature and set of model relationships that substantially elucidate the complex area of marginalized models for binary data. A diverse collection of didactic mathematical and numerical examples is given to illustrate …


Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin Apr 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Nonparametric And Semiparametric Analysis Of Current Status Data Subject To Outcome Misclassification, Victor G. Sal Y Rosas, James P. Hughes Apr 2010

UW Biostatistics Working Paper Series

In this article, we present nonparametric and semiparametric methods to analyze current status data subject to outcome misclassification. Our methods use nonparametric maximum likelihood estimation (NPMLE) to estimate the distribution function of the failure time when sensitivity and specificity may vary among subgroups. A nonparametric test is proposed for two-sample hypothesis testing. In regression analysis, we apply the Cox proportional hazards model, and likelihood-ratio-based confidence intervals for the regression coefficients are proposed. Our methods are motivated and demonstrated by data collected from an infectious disease study in Seattle, WA.


An Analysis Of Nonignorable Nonresponse In A Survey With A Rotating Panel Design, Caterina Giusti, Roderick J. Little Mar 2010

The University of Michigan Department of Biostatistics Working Paper Series

Missing values for income questions are common in survey data. When the probabilities of nonresponse are assumed to depend on the observed information and not on the underlying unobserved amounts, the missing income values are missing at random (MAR), and methods such as sequential multiple imputation can be applied. However, the MAR assumption is often considered questionable in this context, since missingness of income is thought to be related to the value of income itself, after conditioning on available covariates. In this article we describe a sensitivity analysis based on a pattern-mixture model for deviations from MAR, in the context …


The Location Decisions Of Foreign Investors In China: Untangling The Effect Of Wages Using A Control Function Approach, Xuepeng Liu, Mary E. Lovely, Jan Ondrich Feb 2010

Faculty and Research Publications

Firm-level studies offer almost no support for the proposition that capital is attracted to low wages. We examine the location choices of 2,884 firms investing in China between 1993 and 1996 to offer two main contributions. First, we find that the location of labor-intensive activities is highly elastic to provincial wage differences. Generally, investors' wage sensitivity declines as the skill intensity of the industry increases. Second, we find that unobserved location-specific attributes exert a downward bias on estimated wage sensitivity. Using a control function approach, we estimate a downward bias of 50% to 90% in wage coefficients estimated …


Robustness Of Approaches To ROC Curve Modeling Under Misspecification Of The Underlying Probability Model, Sean Devlin, Elizabeth Thomas, Scott S. Emerson Jan 2010

UW Biostatistics Working Paper Series

The receiver operating characteristic (ROC) curve is a tool of particular use in disease status classification with a continuous medical test (marker). A variety of statistical regression models have been proposed for the comparison of ROC curves for different markers across covariate groups. Fully parametric modeling of the marker distribution has generally been found to be overly reliant on strong parametric assumptions. Pepe (2003) has instead developed parametric models for the ROC curve that induce a semi-parametric model for the marker distributions. The estimating equations proposed for use in these ROC-GLM models may differ from commonly used estimating …


Route Choice Behavior In Risky Networks With Real-Time Information, Michael D. Razo Jan 2010

Masters Theses 1911 - February 2014

This research investigates route choice behavior in networks with risky travel times and real-time information. A stated preference survey is conducted in which subjects use PC-based interactive maps to choose routes link-by-link in various scenarios. The scenarios include two types of maps: the first presents a choice between one stochastic route and one deterministic route, and the second offers real-time information and an available detour. The first type measures the basic risk attitude of the subject. The second type allows for strategic planning, and measures the effect of this opportunity on subjects' choice behavior.

Results from each subject are …


Dynamic Model Pooling Methodology For Improving Aberration Detection Algorithms, Brenton J. Sellati Jan 2010

Masters Theses 1911 - February 2014

Syndromic surveillance is defined generally as the collection and statistical analysis of data which are believed to be leading indicators for the presence of deleterious activities developing within a system. Conceptually, syndromic surveillance can be applied to any discipline in which it is important to know when external influences manifest themselves in a system by forcing it to depart from its baseline. Comparisons of syndromic surveillance systems have led to mixed results, where models that dominate in one performance metric are often sorely deficient in another. This results in a zero-sum trade-off where one performance metric must be afforded greater …


Economic Risk Assessment Using The Fractal Market Hypothesis, Jonathan Blackledge, Marek Rebow Jan 2010

Conference papers

This paper considers the Fractal Market Hypothesis (FMH) for assessing the risk(s) in developing a financial portfolio based on data that is available through the Internet from an increasing number of sources. Most financial risk management systems are still based on the Efficient Market Hypothesis, which often fails due to the inaccuracies of the statistical models that underpin the hypothesis, in particular, that financial data are based on stationary Gaussian processes. The FMH considered in this paper assumes that financial data are non-stationary and statistically self-affine so that a risk analysis can, in principle, be applied at any time scale …


Encryption Using Deterministic Chaos, Jonathan Blackledge, Nikolai Ptitsyn Jan 2010

Articles

The concepts of randomness, unpredictability, complexity and entropy form the basis of modern cryptography, and a cryptosystem can be interpreted as the design of a key-dependent bijective transformation that is unpredictable to an observer for a given computational resource. For any cryptosystem, including a Pseudo-Random Number Generator (PRNG), encryption algorithm or a key exchange scheme, for example, a cryptanalyst has access to the time series of a dynamic system and knows the PRNG function (the algorithm that is assumed to be based on some iterative process) which is taken to be in the public domain by virtue of the Kerckhoffs-Shannon …


A New Perspective On Visual Word Processing Efficiency, Joseph W. Houpt, James T. Townsend Jan 2010

Psychology Faculty Publications

As a fundamental part of our daily lives, visual word processing has received much attention in the psychological literature. Despite the well-established perceptual advantages of word and pseudoword context using accuracy, a comparable effect using response times has been elusive. Some researchers continue to question whether the advantage due to word context is perceptual. We use the capacity coefficient, a well-established, response-time-based measure of efficiency, to provide evidence of word processing as a particularly efficient perceptual process to complement those results from the accuracy domain.


Probability Models For Blackjack Poker, Charlie H. Cooke Jan 2010

Mathematics & Statistics Faculty Publications

For simplicity in calculation, previous analyses of blackjack poker have used models that assume sampling with replacement. In order to assess what degree of error this may induce, the purpose here is to calculate results for a typical hand where sampling without replacement is employed. It is seen that significant error can result when long runs are required to complete the hand. The hand examined is itself of particular interest, as regards both its outstanding expectations of high yield and certain implications for pair splitting of two nines against the dealer's seven. Theoretical and experimental methods are used in order …
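
The gap between the two sampling models described in the abstract above can be illustrated with a small calculation. This is a minimal sketch and not the paper's analysis: the event (a run of consecutive ten-valued cards), the single 52-card deck, and the function name `prob_all_tens` are all assumptions chosen for illustration.

```python
from fractions import Fraction


def prob_all_tens(draws, decks=1, replace=False):
    """Exact probability that `draws` consecutive cards are all ten-valued.

    Hypothetical helper: a standard deck is assumed to hold 16 ten-valued
    cards (10, J, Q, K in four suits) out of 52.
    """
    tens = 16 * decks
    total = 52 * decks
    p = Fraction(1)
    for k in range(draws):
        if replace:
            # With replacement: the deck composition never changes.
            p *= Fraction(tens, total)
        else:
            # Without replacement: k ten-valued cards have already left the deck.
            p *= Fraction(tens - k, total - k)
    return p


if __name__ == "__main__":
    # Relative error of the with-replacement model for runs of increasing length.
    for n in (2, 4, 6):
        with_r = prob_all_tens(n, replace=True)
        without_r = prob_all_tens(n, replace=False)
        print(n, float(with_r), float(without_r), float(with_r / without_r - 1))
```

In this toy setting the with-replacement model overstates the probability, and the relative error grows with the length of the run, which is consistent with the abstract's remark about long runs.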


The Joint Distribution Of Bivariate Exponential Under Linearly Related Model, Norou Diawara, Kumer Pial Das Jan 2010

Mathematics & Statistics Faculty Publications

In this paper, fundamental results of the joint distribution of the bivariate exponential distributions are established. The positive support multivariate distribution theory is important in reliability and survival analysis, and we applied it to the case where more than one failure or survival is observed in a given study. Usually, the multivariate distribution is restricted to those with marginal distributions of a specified and familiar lifetime family. The family of exponential distribution contains the absolutely continuous and discrete case models with a nonzero probability on a set of measure zero. Examples are given, and estimators are developed and applied to …


Linear Dependency For The Difference In Exponential Regression, Indika Sathish, Norou Diawara Jan 2010

Mathematics & Statistics Faculty Publications

In the field of reliability, much has been written on the analysis of related phenomena. Estimation of the difference of two population means has mostly been formulated under the no-correlation assumption. However, in many situations, there is a correlation involved. This paper addresses this issue. A sequential estimation method for linearly related lifetime distributions is presented. Estimates of the scale parameters of the exponential distribution are given under squared error loss using a sequential prediction method. Optimal stopping rules are discussed using concepts of mean criteria, and numerical results are presented.