Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

1,358 Full-Text Articles 2,008 Authors 853,222 Downloads 156 Institutions

All Articles in Statistical Models

Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do 2012 The University of Texas MD Anderson Cancer Center

Jeffrey S. Morris

Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current integration approaches are limited in that they do not consider the fundamental biological relationships that exist among the data from different platforms.

Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …


Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di 2012 University of Washington

Chongzhi Di

To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcomes, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging because of the so-called “induced informative censoring”, in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …


Comparing The Cohort Design And The Nested Case-Control Design In The Presence Of Both Time-Invariant And Time-Dependent Treatment And Competing Risks: Bias And Precision, Peter C. Austin 2012 Institute for Clinical Evaluative Sciences

Peter Austin

Purpose: Observational studies using electronic administrative health care databases are often used to estimate the effects of treatments and exposures. Traditionally, a cohort design has been used to estimate these effects, but increasingly studies are using a nested case-control (NCC) design. The relative statistical efficiency of these two designs has not been examined in detail.

Methods: We used Monte Carlo simulations to compare these two designs in terms of the bias and precision of effect estimates. We examined three different settings: (A): treatment occurred at baseline and there was a single outcome of interest; (B): treatment was time-varying and there …


Using Ensemble-Based Methods For Directly Estimating Causal Effects: An Investigation Of Tree-Based G-Computation, Peter C. Austin 2012 Institute for Clinical Evaluative Sciences

Peter Austin

Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one requires the potential outcome under each possible treatment. However, only the outcome under the actual treatment received is observed, whereas the potential outcomes under the other treatments are considered missing data. Some authors have proposed that parametric regression models be used to estimate potential outcomes. In this study, we examined the use of …


Regression Trees For Predicting Mortality In Patients With Cardiovascular Disease: What Improvement Is Achieved By Using Ensemble-Based Methods?, Peter C. Austin 2012 Institute for Clinical Evaluative Sciences

Peter Austin

In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1991-2001 and …


Generating Survival Times To Simulate Cox Proportional Hazards Models With Time-Varying Covariates., Peter C. Austin 2012 Institute for Clinical Evaluative Sciences

Peter Austin

Simulations and Monte Carlo methods serve an important role in modern statistical research. They allow for an examination of the performance of statistical procedures in settings in which analytic and mathematical derivations may not be feasible. A key element in any statistical simulation is the existence of an appropriate data-generating process: one must be able to simulate data from a specified statistical model. We describe data-generating processes for the Cox proportional hazards model with time-varying covariates when event times follow an exponential, Weibull, or Gompertz distribution. We consider three types of time-varying covariates: first, a dichotomous time-varying covariate that can …
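As a rough illustration of the data-generating idea in this abstract, the sketch below draws event times from a Cox proportional hazards model by inverting the survival function. It is only an assumed, simplified setting: the covariate is fixed over time (the paper itself addresses time-varying covariates, which need piecewise inversion not shown here), and the function names, parameter names, and parameter values are hypothetical.

```python
import math
import random


def survival_time(u, lp, dist="exponential", lam=1.0, shape=1.0):
    """Solve exp(-H0(t) * exp(lp)) = u for t (inverse-transform sampling).

    u     : uniform draw in (0, 1]
    lp    : linear predictor beta * x, assumed constant over time
    lam   : baseline scale parameter (hypothetical name)
    shape : Weibull shape or Gompertz shape parameter (hypothetical name)
    """
    target = -math.log(u) / (lam * math.exp(lp))  # equals H0(t) / lam
    if dist == "exponential":  # H0(t) = lam * t
        return target
    if dist == "weibull":      # H0(t) = lam * t**shape
        return target ** (1.0 / shape)
    if dist == "gompertz":     # H0(t) = (lam / shape) * (exp(shape * t) - 1)
        return math.log(1.0 + shape * target) / shape
    raise ValueError(f"unknown distribution: {dist}")


def simulate(n, beta, dist, lam, shape, seed=0):
    """Return n (x, time) pairs with a binary covariate x ~ Bernoulli(0.5)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        u = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u) is defined
        rows.append((x, survival_time(u, beta * x, dist, lam, shape)))
    return rows
```

Censoring is deliberately left out; a simulation study would typically draw a separate censoring time and record the minimum of the two.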


The Quotient Of The Beta-Weibull Distribution, Nonhle Channon Mdziniso 2012 Marshall University

Theses, Dissertations and Capstones

A new class of distributions recently developed involves the logit of the beta distribution. This class includes the beta-Normal (Eugene et al. [15]); beta-Gumbel (Nadarajah and Kotz [18]); beta-Exponential (Nadarajah and Kotz [19]); beta-Weibull (Famoye et al. [6]); beta-Rayleigh (Akinsete and Lowe [3]); beta-Laplace (Kozubowski and Nadarajah [20]); and beta-Pareto (Akinsete et al. [4]), among a few others. Many useful statistical properties arising from these distributions and their applications to real-life data have been discussed in the literature. One approach by which a new statistical distribution is generated is by the transformation of random variables having known …


GRTS And Graphs: Monitoring Natural Resources In Urban Landscapes, Todd R. Lookingbill, John Paul Schmit, Shawn L. Carter 2012 University of Richmond

Geography and the Environment Faculty Publications

Environmental monitoring programs are an important tool for providing land managers with a scientific basis for management decisions. However, many ecological processes operate on spatial scales that transcend management boundaries (Schonewald-Cox 1988). For example, adjacent lands may influence protected-area resources via edge effects, source-sink dynamics, or invasion processes (Jones et al. 2009). Hydrologic alterations outside management units also may have profound effects on the integrity of resources being managed (Pringle 2000). The impacts of climate change are presenting challenges to resource management at local-to-global scales (Karl et al. 2009). This potential disparity between ecological and political boundaries presents an interesting …


General Recognition Theory Extended To Include Response Times: Predictions For A Class Of Parallel Systems, James T. Townsend, Joseph W. Houpt, Noah H. Silbert 2012 Wright State University - Main Campus

Psychology Faculty Publications

General Recognition Theory (GRT; Ashby & Townsend, 1986) is a multidimensional theory of classification. Originally developed to study various types of perceptual independence, it has also been widely employed in diverse cognitive venues, such as categorization. The initial theory and applications have been static, that is, lacking a time variable and focusing on patterns of responses, such as confusion matrices. Ashby proposed a parallel, dynamic stochastic version of GRT with application to perceptual independence based on discrete linear systems theory with imposed noise (Ashby, 1989). The current study again focuses on cognitive/perceptual independence within an identification classification paradigm. We extend …


Alternatives To Mixture Model Analysis Of Correlated Binomial Data, N. Rao Chaganty, Roy Sabo, Yihao Deng 2012 Old Dominion University

Mathematics & Statistics Faculty Publications

While univariate instances of binomial data are readily handled with generalized linear models, cases of multivariate or repeated-measures binomial data are complicated by the possibility of correlated responses. Likelihood-based estimation can be applied by using mixture distribution models, though this approach can present computational challenges. The logistic transformation can be used to bypass these concerns and allow for alternative estimating procedures. One popular alternative is the generalized estimating equation (GEE) method, though systematic errors can lead to infeasible correlation estimates or nonconvergence problems. Our approach couples the quasi-least squares (QLS) method with a rarely used matrix factorization, …


Analysis Of Discrete Choice Probit Models With Structured Correlation Matrices, Bhaskara Ravi 2012 Old Dominion University

Mathematics & Statistics Theses & Dissertations

Discrete choice models are very popular in economics, and the conditional logit model, introduced in a seminal paper by McFadden (1974), is the most widely used model for analyzing consumer choice behavior. This model is based on the assumption that the unobserved factors, which determine the consumer choices, are independent and follow a Gumbel distribution, widely known as the Independence of Irrelevant Alternatives (IIA) assumption. Alternative models that relax the IIA assumption are the Generalized Extreme Value (GEV) models, which allow dependency between unobserved factors. However, GEV models do not incorporate all dependency patterns, other choice behaviors such as …


Analysis Of Binary Data Via Spatial-Temporal Autologistic Regression Models, Zilong Wang 2012 University of Kentucky

Theses and Dissertations--Statistics

Spatial-temporal autologistic models are useful models for binary data that are measured repeatedly over time on a spatial lattice. They can account for effects of potential covariates and spatial-temporal statistical dependence among the data. However, the traditional parametrization of the spatial-temporal autologistic model presents difficulties in interpreting model parameters across varying levels of statistical dependence, where its non-negative autocovariates could bias the realizations toward 1. In order to achieve interpretable parameters, a centered spatial-temporal autologistic regression model has been developed. Two efficient statistical inference approaches, the expectation-maximization pseudo-likelihood approach (EMPL) and the Monte Carlo expectation-maximization likelihood approach (MCEML), have been proposed. Also, Bayesian …


A Comparative Analysis Of Decision Trees Vis-À-Vis Other Computational Data Mining Techniques In Automotive Insurance Fraud Detection, Adrian Gepp, Kuldeep Kumar, J Holton Wilson, Sukanto Bhattacharya 2011 Bond University

Adrian Gepp

No abstract provided.


Modeling Dependence Using Skew T Copulas: Bayesian Inference And Applications, Michael S. Smith, Quan Gan, Robert Kohn 2011 Melbourne Business School

Michael Stanley Smith

[THIS IS AN AUGUST 2010 REVISION THAT REPLACES ALL PREVIOUS VERSIONS.]

We construct a copula from the skew t distribution of Sahu, Dey & Branco (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose …


Estimation Of Copula Models With Discrete Margins Via Bayesian Data Augmentation, Michael S. Smith, Mohamad A. Khaled 2011 Melbourne Business School

Michael Stanley Smith

Estimation of copula models with discrete margins is known to be difficult beyond the bivariate case. We show how this can be achieved by augmenting the likelihood with latent variables, and computing inference using the resulting augmented posterior. To evaluate this we propose two efficient Markov chain Monte Carlo sampling schemes. One generates the latent variables as a block using a Metropolis-Hastings step with a proposal that is close to its target distribution; the other generates them one at a time. Our method applies to all parametric copulas where the conditional copula functions can be evaluated, not just elliptical copulas …


Risk, Odds, And Their Ratios, Joseph Hilbe 2011 Arizona State University

Joseph M Hilbe

A brief monograph explaining the meaning of the terms risk, risk ratio, odds, and odds ratio, and how to calculate each, together with standard errors and confidence intervals. Stata code is provided showing how all of the terms can be calculated by hand, as well as by using logistic and Poisson models.
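For readers who want a feel for the quantities this monograph covers, here is a minimal by-hand sketch in Python rather than the monograph's own Stata code. It assumes a 2x2 table and uses the standard Wald intervals on the log scale; the function name, the cell labels `a`, `b`, `c`, `d`, and the example counts are all hypothetical.

```python
import math


def risk_and_odds(a, b, c, d, z=1.96):
    """Risk ratio and odds ratio from a 2x2 table, with Wald intervals.

    a, b : events and non-events among the exposed
    c, d : events and non-events among the unexposed
    z    : normal quantile for the interval (1.96 gives roughly 95%)
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)

    # Standard errors are computed on the log scale, then exponentiated.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def interval(estimate, se):
        log_est = math.log(estimate)
        return (math.exp(log_est - z * se), math.exp(log_est + z * se))

    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_ratio": risk_ratio,
        "risk_ratio_ci": interval(risk_ratio, se_log_rr),
        "odds_ratio": odds_ratio,
        "odds_ratio_ci": interval(odds_ratio, se_log_or),
    }


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the event.
result = risk_and_odds(20, 80, 10, 90)
```

The sketch does not handle zero cells, where the log-scale standard errors are undefined; a continuity correction or an exact method would be needed there.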


Flexible Distributed Lag Models Using Random Functions With Application To Estimating Mortality Displacement From Heat-Related Deaths, Roger D. Peng 2011 Johns Hopkins University

Johns Hopkins University, Dept. of Biostatistics Working Papers

No abstract provided.


Water Quality Models For Stormwater Runoff In Two Lincoln, Nebraska Urban Watersheds, Jake Fisher 2011 University of Nebraska-Lincoln

Department of Civil and Environmental Engineering: Dissertations, Theses, and Student Research

Water quality monitoring was conducted in two urban watersheds (Colonial Hills and Taylor Park) located in southeast Lincoln, NE over a three-year period from October 2008 through September 2011. In-line probes continuously measured turbidity, conductivity, dissolved oxygen, and water temperature, while other water quality constituents were analyzed in discrete water samples collected using grab and automatic sampling techniques. The water quality data were used to calculate event mean concentrations (EMCs) for sixteen storm events sampled over the duration of the project period. Three types of stormwater quality multiple linear regression models were developed for the estimation of …
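The event mean concentration mentioned in this abstract is conventionally the total pollutant mass carried by a storm divided by the total runoff volume, that is, a flow-weighted average of the sample concentrations. The sketch below shows that conventional calculation only; it is an assumption that the thesis computes EMCs this way, and the function name and sample values are hypothetical.

```python
def event_mean_concentration(concentrations, volumes):
    """Flow-weighted mean concentration for one storm event.

    concentrations : sample concentrations (e.g. mg/L), one per interval
    volumes        : runoff volume (e.g. L) represented by each sample
    """
    if len(concentrations) != len(volumes):
        raise ValueError("concentrations and volumes must have equal length")
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total runoff volume must be positive")
    # Total mass is the sum of concentration times volume over all intervals.
    total_mass = sum(c * v for c, v in zip(concentrations, volumes))
    return total_mass / total_volume


# Hypothetical storm: three samples in mg/L, each weighted by runoff volume.
emc = event_mean_concentration([10.0, 20.0, 30.0], [1.0, 2.0, 1.0])
```

Weighting by volume matters because a simple average of the samples would overstate the influence of low-flow periods.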


Development Of A Bayesian Joint Logistic Model To Better Study The Association Between Haplotypes And Disease, Anthony M. D'Amelio Jr 2011 The University of Texas Graduate School of Biomedical Sciences at Houston

Dissertations & Theses (Open Access)

In 2011, there will be an estimated 1,596,670 new cancer cases and 571,950 cancer-related deaths in the US. With the ever-increasing applications of cancer genetics in epidemiology, there is great potential to identify genetic risk factors that would help identify individuals with increased genetic susceptibility to cancer, which could be used to develop interventions or targeted therapies that could hopefully reduce cancer risk and mortality.

In this dissertation, I propose to develop a new statistical method to evaluate the role of haplotypes in cancer susceptibility and development. This model will be flexible enough to handle not only haplotypes of any …


Real Options Models In Real Estate, Jin Won Choi 2011 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Our aim in this thesis is to investigate the usefulness of real options analysis, taking case studies of problems in real estate. In the realm of real estate, we consider the following three problems. First, we consider the valuation and usefulness of presale contracts of condominiums, which can be viewed as similar to call options on condominiums. Second, we consider the valuation of farm land from the perspective of land developers, who may think of farm land as being similar to call options on subdivision lots. Third, we consider the valuation of opportunities to install solar panels on properties, in …

