Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Keyword
- Psychology (4)
- Statistical Models (4)
- General Biostatistics (3)
- Bayesian Model Averaging and Semiparametric Regression (2)
- Bayesian methods (2)
- Biomarkers (2)
- Copula Modeling (2)
- Fraud Detection (2)
- Generalized Linear Models and Extensions 3rd Edition (2)
- Proteomics (2)
- Regression methods (2)
- 2D Gel Electrophoresis (1)
- ARMA and ARIMA models (1)
- Applied statistics (1)
- Archimedean Copula (1)
- Architecture (1)
- Astrostatistics (1)
- Asymmetric Dependence (1)
- Bagging (1)
- Bayesian Cross-Validation (1)
- Bayesian Modeling (1)
- Bayesian Pair-Copula Selection (1)
- Bayesian approaches (1)
- Bayesian hierarchical model (1)
- Bayesian models (1)
- Bias (1)
- Boosted regression trees (1)
- Boosting (1)
- Case-control design (1)
- Causal inference (1)
Articles 1 - 24 of 24
Full-Text Articles in Physical Sciences and Mathematics
A Comparative Analysis Of Decision Trees Vis-À-Vis Other Computational Data Mining Techniques In Automotive Insurance Fraud Detection, Adrian Gepp, Kuldeep Kumar, J Holton Wilson, Sukanto Bhattacharya
Kuldeep Kumar
No abstract provided.
Nbr2 Errata And Comments, Joseph Hilbe
Joseph M Hilbe
Errata and Comments for Negative Binomial Regression, 2nd edition
Time Series, Unit Roots, And Cointegration: An Introduction, Lonnie K. Stevans
Lonnie K. Stevans
The econometric literature on unit roots took off after the publication of the paper by Nelson and Plosser (1982) that argued that most macroeconomic series have unit roots and that this is important for the analysis of macroeconomic policy. Yule (1926) suggested that regressions based on trending time series data can be spurious. This problem of spurious correlation was further pursued by Granger and Newbold (1974) and this also led to the development of the concept of cointegration (lack of cointegration implies spurious regression). The pathbreaking paper by Granger (1981), first presented at a conference at the University of Florida …
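The spurious-regression problem that Yule (1926) and Granger and Newbold (1974) identified is easy to reproduce. The following minimal sketch (the simulation setup and variable names are illustrative assumptions, not taken from the paper) regresses one independent random walk on another; despite there being no relationship between the two series, the fit routinely looks "significant":

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Two completely independent random walks (each series has a unit root).
x = np.cumsum(rng.standard_normal(n))
y = np.cumsum(rng.standard_normal(n))

# OLS regression of y on x (with intercept).
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r2 = 1 - resid.var() / y.var()

print(f"slope = {beta[1]:.3f}, R^2 = {r2:.3f}")
```

Differencing both series (or testing for cointegration) removes the unit roots and makes the apparent relationship vanish, which is the point of the unit-root literature the article surveys.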
Capacity Coefficient Variations, Joseph W. Houpt, Andrew Heathcote, Ami Eidels, Nathan Medeiros-Ward, Jason Watson, David Strayer
Joseph W. Houpt
The capacity coefficient has become an increasingly popular measure of efficiency under changes in workload. It has been used in applications ranging from psychophysical detection tasks to complex cognitive tasks, as well as in addressing questions in social and clinical psychology. The basic formulation compares response times to each stimulus property (or task) in isolation to response times with all stimulus properties (or tasks) at the same time. A number of variations on the basic capacity coefficient have been used, both in the experimental design and in the calculations, and many more are possible. Here we outline the theoretical reasons …
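The basic (OR/race) capacity coefficient compares integrated hazards: C_OR(t) = H_AB(t) / [H_A(t) + H_B(t)], where H(t) = -log S(t). A rough computational sketch follows; it uses a crude empirical-survivor estimate rather than the Nelson-Aalen estimator typically used in practice, and the simulated response-time data are an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

def neg_log_survivor(rt, t):
    # Crude integrated-hazard estimate H(t) = -log S(t) from the
    # empirical survivor function (illustration only; practical work
    # uses the Nelson-Aalen estimator).
    return -np.log(np.mean(rt > t))

# Independent exponential channels (rate 1 each).  In the redundant
# (OR / race) condition the first-terminating channel has rate 2,
# which is the unlimited-capacity parallel benchmark, so C_OR ~ 1.
rt_a = rng.exponential(1.0, n)
rt_b = rng.exponential(1.0, n)
rt_ab = np.minimum(rng.exponential(1.0, n), rng.exponential(1.0, n))

t = 0.5
c_or = neg_log_survivor(rt_ab, t) / (
    neg_log_survivor(rt_a, t) + neg_log_survivor(rt_b, t))
print(f"C_OR({t}) = {c_or:.3f}")
```

Values of C_OR below 1 indicate limited capacity (slowing under increased workload); values above 1 indicate super capacity.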
General Recognition Theory Extended To Include Response Times: Predictions For A Class Of Parallel Systems, Joseph W. Houpt, James T. Townsend, Noah H. Silbert
Joseph W. Houpt
No abstract provided.
International Astrostatistics Association, Joseph Hilbe
Joseph M Hilbe
Overview of the history, purpose, Council and officers of the International Astrostatistics Association (IAA)
A Study Of Data Editing In Other Countries And A Multivariate Outlier Detection Method Based On A Mixture Normal Distribution Model (Masayoshi Takahashi; Selective Editing), Masayoshi Takahashi
Masayoshi Takahashi
No abstract provided.
Big Data And The Future, Sherri Rose
Bayesian Approaches To Assessing Architecture And Stopping Rule, Joseph W. Houpt, A. Heathcote, A. Eidels, J. T. Townsend
Joseph W. Houpt
Much of scientific psychology and cognitive science can be viewed as a search to understand the mechanisms and dynamics of perception, thought and action. Two processing attributes of particular interest to psychologists are the architecture, or temporal relationships between sub-processes of the system, and the stopping rule, which dictates how many of the sub-processes must be completed for the system to finish. The Survivor Interaction Contrast (SIC) is a powerful tool for assessing the architecture and stopping rule of a mental process model. Thus far, statistical analysis of the SIC has been limited to null-hypothesis significance tests. In this talk …
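The Survivor Interaction Contrast referred to in the abstract is standardly defined (Townsend & Nozawa, 1995) from the four survivor functions obtained by factorially manipulating the salience (High/Low) of two sub-processes:

```latex
\mathrm{SIC}(t) \;=\; \bigl[S_{LL}(t) - S_{LH}(t)\bigr] \;-\; \bigl[S_{HL}(t) - S_{HH}(t)\bigr]
```

Different architectures and stopping rules (serial vs. parallel, exhaustive vs. self-terminating) predict qualitatively different signatures for SIC(t), which is what a Bayesian analysis of the SIC must discriminate among.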
Glme3_Ado_Do_Files, Joseph Hilbe
Glme3 Data And Ado/Do Files, Joseph Hilbe
Joseph M Hilbe
A listing of the data sets, Stata software commands, and do-files in the GLME3 book
Loss Function Based Ranking In Two-Stage, Hierarchical Models, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, Greg Ridgeway
Rongheng Lin
Several authors have studied the performance of optimal, squared error loss (SEL) estimated ranks. Though these are effective, in many applications interest focuses on identifying the relatively good (e.g., in the upper 10%) or relatively poor performers. We construct loss functions that address this goal and evaluate candidate rank estimates, some of which optimize specific loss functions. We study performance for a fully parametric hierarchical model with a Gaussian prior and Gaussian sampling distributions, evaluating performance for several loss functions. Results show that though SEL-optimal ranks and percentiles do not specifically focus on classifying with respect to a percentile cut …
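The SEL-optimal ranks discussed above are obtained by ranking units within each posterior draw, averaging those ranks over draws, and then ranking the averages. A small sketch with synthetic posterior samples (the simulation setup is my own, not the authors'):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior draws: S MCMC samples of K unit-level
# parameters with well-separated true means 0, 1, ..., K-1.
S, K = 4000, 8
true_means = np.arange(K, dtype=float)
draws = true_means + rng.standard_normal((S, K)) * 0.5

# Rank units within each draw (1 = smallest), average over draws to
# get the posterior mean rank, then rank those averages: the
# SEL-optimal ranks.
ranks_per_draw = draws.argsort(axis=1).argsort(axis=1) + 1
posterior_mean_rank = ranks_per_draw.mean(axis=0)
sel_ranks = posterior_mean_rank.argsort().argsort() + 1
print(sel_ranks)
```

With units this well separated the SEL-optimal ranks recover the true ordering; the paper's point is that when units overlap, SEL-optimal ranks need not classify well relative to a percentile cutoff, motivating loss functions targeted at that goal.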
Ranking Usrds Provider-Specific Smrs From 1998-2001, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, Greg Ridgeway
Rongheng Lin
Provider profiling (ranking, "league tables") is prevalent in health services research. Similarly, comparing educational institutions and identifying differentially expressed genes depend on ranking. Effective ranking procedures must be structured by a hierarchical (Bayesian) model and guided by a ranking-specific loss function; however, even optimal methods can perform poorly, and estimates must be accompanied by uncertainty assessments. We use the 1998-2001 Standardized Mortality Ratio (SMR) data from the United States Renal Data System (USRDS) as a platform to identify issues and approaches. Our analyses extend Liu et al. (2004) by combining evidence over multiple years via an AR(1) model; by considering estimates …
General Recognition Theory Extended To Include Response Times: Predictions For A Class Of Parallel Systems, James T. Townsend, Joseph W. Houpt, Noah H. Silbert
Joseph W. Houpt
General Recognition Theory (GRT; Ashby & Townsend, 1986) is a multidimensional theory of classification. Originally developed to study various types of perceptual independence, it has also been widely employed in diverse cognitive venues, such as categorization. The initial theory and applications have been static, that is, lacking a time variable and focusing on patterns of responses, such as confusion matrices. Ashby proposed a parallel, dynamic stochastic version of GRT with application to perceptual independence based on discrete linear systems theory with imposed noise (Ashby, 1989). The current study again focuses on cognitive/perceptual independence within an identification classification paradigm. We extend stochastic …
Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris
Jeffrey S. Morris
In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational …
Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do
Jeffrey S. Morris
Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current integration approaches that treat the data from each platform independently are limited in that they do not consider the fundamental biological relationships that exist among the data from the different platforms.
Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …
Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di
Chongzhi Di
To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcome, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to the so-called “induced informative censoring” in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …
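The Oakes and Dasu (1990) model referenced above specifies that covariates act multiplicatively on the mean residual life function, in analogy with the Cox model's multiplicative effect on the hazard:

```latex
m(t \mid Z) \;=\; \mathrm{E}\bigl[\,T - t \;\big|\; T > t,\, Z\,\bigr] \;=\; m_0(t)\,\exp(\beta^{\top} Z),
```

where $m_0(t)$ is a baseline mean residual life function and $\beta$ measures the covariate effects on the expected remaining lifetime.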
Comparing The Cohort Design And The Nested Case-Control Design In The Presence Of Both Time-Invariant And Time-Dependent Treatment And Competing Risks: Bias And Precision, Peter C. Austin
Peter Austin
Purpose: Observational studies using electronic administrative health care databases are often used to estimate the effects of treatments and exposures. Traditionally, a cohort design has been used to estimate these effects, but increasingly studies are using a nested case-control (NCC) design. The relative statistical efficiency of these two designs has not been examined in detail.
Methods: We used Monte Carlo simulations to compare these two designs in terms of the bias and precision of effect estimates. We examined three different settings: (A): treatment occurred at baseline and there was a single outcome of interest; (B): treatment was time-varying and there …
Using Ensemble-Based Methods For Directly Estimating Causal Effects: An Investigation Of Tree-Based G-Computation, Peter C. Austin
Peter Austin
Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one requires the potential outcome under each possible treatment. However, only the outcome under the actual treatment received is observed, whereas the potential outcomes under the other treatments are considered missing data. Some authors have proposed that parametric regression models be used to estimate potential outcomes. In this study, we examined the use of …
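The idea of ensemble-based G-computation can be sketched in a few lines: fit an ensemble regression of the outcome on treatment and covariates, predict each subject's potential outcomes under treatment and control, and average the difference. This toy simulation (my own setup, using a random forest as the ensemble; it is not the paper's specific implementation) recovers a known treatment effect despite confounding:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 5000

# Simulated observational data: confounder x affects both treatment
# assignment a and outcome y; the true average treatment effect is 2.0.
x = rng.standard_normal(n)
a = (rng.random(n) < 1 / (1 + np.exp(-x))).astype(float)
y = 2.0 * a + x + rng.standard_normal(n) * 0.5

# G-computation: model E[Y | A, X], then predict each subject's
# potential outcomes under a=1 and a=0 and average the difference.
model = RandomForestRegressor(n_estimators=200, min_samples_leaf=25,
                              random_state=0)
model.fit(np.column_stack([a, x]), y)

y1 = model.predict(np.column_stack([np.ones(n), x]))
y0 = model.predict(np.column_stack([np.zeros(n), x]))
ate = (y1 - y0).mean()
print(f"estimated ATE = {ate:.2f}")
```

A naive comparison of observed group means would be biased upward here, because treated subjects have systematically larger x.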
Regression Trees For Predicting Mortality In Patients With Cardiovascular Disease: What Improvement Is Achieved By Using Ensemble-Based Methods?, Peter C. Austin
Peter Austin
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1991-2001 and …
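The comparison the abstract describes can be sketched on synthetic data. In this illustration (entirely my own construction, not the cohorts analyzed in the paper) the outcome depends on an interaction between predictors, so a main-effects logistic model discriminates poorly while boosted trees do well:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 6000
X = rng.standard_normal((n, 2))

# Binary outcome driven by an interaction, which a main-effects
# logistic model cannot represent.
p = 1 / (1 + np.exp(-3 * X[:, 0] * X[:, 1]))
y = (rng.random(n) < p).astype(int)

X_tr, X_te, y_tr, y_te = X[:4000], X[4000:], y[:4000], y[4000:]

logit = LogisticRegression().fit(X_tr, y_tr)
boost = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

auc_logit = roc_auc_score(y_te, logit.predict_proba(X_te)[:, 1])
auc_boost = roc_auc_score(y_te, boost.predict_proba(X_te)[:, 1])
print(f"logistic AUC = {auc_logit:.2f}, boosted trees AUC = {auc_boost:.2f}")
```

The paper's empirical finding is more nuanced: on real clinical data, where strong main effects dominate, logistic regression is often competitive with the ensembles.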
Generating Survival Times To Simulate Cox Proportional Hazards Models With Time-Varying Covariates., Peter C. Austin
Peter Austin
Simulations and Monte Carlo methods serve an important role in modern statistical research. They allow for an examination of the performance of statistical procedures in settings in which analytic and mathematical derivations may not be feasible. A key element in any statistical simulation is the existence of an appropriate data-generating process: one must be able to simulate data from a specified statistical model. We describe data-generating processes for the Cox proportional hazards model with time-varying covariates when event times follow an exponential, Weibull, or Gompertz distribution. We consider three types of time-varying covariates: first, a dichotomous time-varying covariate that can …
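The core device in such data-generating processes is inverting the cumulative hazard: draw u ~ Uniform(0, 1) and solve H(T) = -log(u). A minimal sketch for the simplest case the abstract mentions, assuming an exponential baseline hazard and a dichotomous covariate that switches from 0 to 1 at a known time t0 (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2024)

def sim_time_tv(lam, beta, t0, u):
    """Invert the cumulative hazard H(T) = -log(u) for baseline rate
    `lam` and a binary covariate switching on at `t0` with hazard
    ratio exp(beta):
        H(t) = lam * t                                 for t <  t0
        H(t) = lam * t0 + lam * exp(beta) * (t - t0)   for t >= t0
    """
    target = -np.log(u)
    if target < lam * t0:          # event occurs before the switch
        return target / lam
    return t0 + (target - lam * t0) / (lam * np.exp(beta))

lam, beta, t0 = 0.5, np.log(2.0), 1.0
times = np.array([sim_time_tv(lam, beta, t0, u) for u in rng.random(10_000)])
print(f"median simulated time = {np.median(times):.2f}")
```

Because H is piecewise linear here, the inversion is closed-form; for Weibull or Gompertz baselines the same scheme applies with the corresponding cumulative hazard.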
A Comparative Analysis Of Decision Trees Vis-À-Vis Other Computational Data Mining Techniques In Automotive Insurance Fraud Detection, Adrian Gepp, Kuldeep Kumar, J Holton Wilson, Sukanto Bhattacharya
Adrian Gepp
No abstract provided.
Modeling Dependence Using Skew T Copulas: Bayesian Inference And Applications, Michael S. Smith, Quan Gan, Robert Kohn
Michael Stanley Smith
[THIS IS AN AUGUST 2010 REVISION THAT REPLACES ALL PREVIOUS VERSIONS.]
We construct a copula from the skew t distribution of Sahu, Dey & Branco (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose …
Estimation Of Copula Models With Discrete Margins Via Bayesian Data Augmentation, Michael S. Smith, Mohamad A. Khaled
Michael Stanley Smith
Estimation of copula models with discrete margins is known to be difficult beyond the bivariate case. We show how this can be achieved by augmenting the likelihood with latent variables, and computing inference using the resulting augmented posterior. To evaluate this we propose two efficient Markov chain Monte Carlo sampling schemes. One generates the latent variables as a block using a Metropolis-Hastings step with a proposal that is close to its target distribution, the other generates them one at a time. Our method applies to all parametric copulas where the conditional copula functions can be evaluated, not just elliptical copulas …
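The latent-variable idea is easiest to see for a Gaussian copula: a discrete observation y pins its latent normal z to the interval [Φ⁻¹(F(y−1)), Φ⁻¹(F(y))), and the one-at-a-time scheme draws each latent from its conditional normal truncated to that interval. The sketch below shows a single such sweep for a bivariate Gaussian copula with Poisson margins and a known, fixed correlation (in the full sampler the copula parameters would be updated too; this setup is my own illustration):

```python
import numpy as np
from scipy.stats import norm, poisson, truncnorm

rng = np.random.default_rng(5)

rho, mu, n = 0.6, 3.0, 200

# Simulate data from a bivariate Gaussian copula with Poisson(mu) margins.
z_true = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
y = poisson.ppf(norm.cdf(z_true), mu)

def bounds(y_col):
    # Interval on the latent-normal scale implied by the discrete margin.
    lo = norm.ppf(poisson.cdf(y_col - 1, mu))
    hi = norm.ppf(poisson.cdf(y_col, mu))
    return lo, hi

# One sweep of the one-at-a-time augmentation step: draw each latent
# from its conditional normal given the other margin, truncated to the
# interval implied by the observed count.
z = np.zeros((n, 2))
for j in (0, 1):
    other = 1 - j
    lo, hi = bounds(y[:, j])
    cond_mean = rho * z[:, other]
    cond_sd = np.sqrt(1 - rho ** 2)
    a, b = (lo - cond_mean) / cond_sd, (hi - cond_mean) / cond_sd
    z[:, j] = truncnorm.rvs(a, b, loc=cond_mean, scale=cond_sd,
                            random_state=rng)

# Augmented latents must reproduce the observed counts exactly.
print(np.all(poisson.ppf(norm.cdf(z), mu) == y))
```

Iterating this sweep jointly with parameter updates yields the augmented posterior the paper works with; the block scheme instead proposes all latents at once from an approximation to their joint conditional.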