Articles 1 - 30 of 75
Full-Text Articles in Statistical Models
Addition To Pglr Chap 6, Joseph M. Hilbe
Joseph M Hilbe
Addition to Chapter 6 in Practical Guide to Logistic Regression. Added section on Bayesian logistic regression using Stata.
Online Variational Bayes Inference For High-Dimensional Correlated Data, Sylvie T. Kabisa, Jeffrey S. Morris, David Dunson
Jeffrey S. Morris
High-dimensional data with hundreds of thousands of observations are becoming commonplace in many disciplines. The analysis of such data poses many computational challenges, especially when the observations are correlated over time and/or across space. In this paper we propose flexible hierarchical regression models for analyzing such data that accommodate serial and/or spatial correlation. We address the computational challenges involved in fitting these models by adopting an approximate inference framework. We develop an online variational Bayes algorithm that works by incrementally reading the data into memory one portion at a time. The performance of the method is assessed through simulation studies. …
Functional Car Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris
Jeffrey S. Morris
We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on …
Negative Binomial Regression, 2nd Ed, 2nd Print, Errata And Comments, Joseph Hilbe
Joseph M Hilbe
Errata and comments for the 2nd printing of NBR2, 2nd edition. All previous errata from the first printing have been corrected, and some new text has been added as well.
Bayesian Function-On-Function Regression For Multi-Level Functional Data, Mark J. Meyer, Brent A. Coull, Francesco Versace, Paul Cinciripini, Jeffrey S. Morris
Jeffrey S. Morris
Medical and public health research increasingly involves the collection of ever more complex and high-dimensional data. In particular, functional data, where the unit of observation is a curve or set of curves finely sampled over a grid, are frequently obtained. Moreover, researchers often sample multiple curves per person, resulting in repeated functional measures. A common question is how to analyze the relationship between two functional variables. We propose a general function-on-function regression model for repeatedly sampled functional data, presenting a simple model as well as a more extensive mixed model framework, along with multiple functional posterior …
Functional Regression, Jeffrey S. Morris
Jeffrey S. Morris
Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and …
Ordinal Probit Wavelet-Based Functional Models For Eqtl Analysis, Mark J. Meyer, Jeffrey S. Morris, Craig P. Hersh, Jarret D. Morrow, Christoph Lange, Brent A. Coull
Jeffrey S. Morris
Current methods for conducting expression Quantitative Trait Loci (eQTL) analysis are limited in scope to pairwise association testing between single nucleotide polymorphisms (SNPs) and expression probe sets in a region around a gene of interest, thus ignoring the inherent between-SNP correlation. To determine association, p-values are then typically adjusted using the plug-in False Discovery Rate. As many SNPs are interrogated in the region and multiple probe sets are taken, the current approach requires fitting a large number of models. We propose to remedy this by introducing a flexible function-on-scalar regression that models the genome as a functional outcome. The …
Asimmetria Del Rischio Sistematico Dei Titolo Immobiliari Americani: Nuove Evidenze Econometriche, Paola De Santis, Carlo Drago
Carlo Drago
In this work we find an increase in the systematic risk of American real-estate securities in 2007, followed by a return to initial values in 2009, and we highlight the possible presence of structural breaks. To assess this systematic risk, the Fama-French three-factor model was chosen, and we studied the relationship between the excess return of the REIT index, used as a proxy for the performance of American real-estate securities, and the excess return of the S&P500 index, representing the return of the market portfolio. The results confirm the presence of an "Asymmetric REIT Beta Puzzle", consistent with some previous studies …
Errata - Logistic Regression Models, Joseph Hilbe
Joseph M Hilbe
Errata for Logistic Regression Models, 4th Printing
Interpretation And Prediction Of A Logistic Model, Joseph M. Hilbe
Joseph M Hilbe
A basic overview of how to fit and interpret a logistic regression model, how to obtain the predicted probability or fit of the model, and how to calculate its confidence intervals. R code is used for all examples; some Stata code is provided as a contrast.
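The standard way to obtain a confidence interval for a predicted probability, as described above, is to form the interval on the linear-predictor scale and then map it through the inverse logit, which keeps the bounds inside (0, 1). A minimal Python sketch (the monograph itself uses R and Stata; the coefficients and covariance below are hypothetical, not from any fitted model):

```python
import math

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def predicted_prob_ci(beta, cov, x, z=1.96):
    """Predicted probability and Wald CI for a logistic model.

    The interval is formed on the linear-predictor scale
    (eta = x'beta, se via x' Cov x) and then mapped through the
    inverse logit. beta, cov, x are illustrative inputs only.
    """
    p = len(beta)
    eta = sum(x[j] * beta[j] for j in range(p))
    var = sum(x[i] * cov[i][j] * x[j] for i in range(p) for j in range(p))
    se = math.sqrt(var)
    return (inv_logit(eta), inv_logit(eta - z * se), inv_logit(eta + z * se))

# Hypothetical fit: intercept and one covariate
beta = [-1.2, 0.8]
cov = [[0.04, -0.01], [-0.01, 0.02]]   # made-up covariance of beta
fit, lo, hi = predicted_prob_ci(beta, cov, x=[1.0, 1.5])
print(round(fit, 3), round(lo, 3), round(hi, 3))
```

Transforming the endpoints, rather than computing a symmetric interval directly on the probability scale, is the common choice because the Wald approximation is more accurate for the linear predictor.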
An Asymptotically Minimax Kernel Machine, Debashis Ghosh
Debashis Ghosh
Recently, a class of machine learning-inspired procedures, termed kernel machine methods, has been extensively developed in the statistical literature. It has been shown to have large power for a wide class of problems and applications in genomics and brain imaging. Many authors have exploited an equivalence between kernel machines and mixed effects models and used the attendant estimation and inferential procedures. In this note, we construct a so-called 'adaptively minimax' kernel machine. Such a construction highlights the role of thresholding in the observation space and limits on the interpretability of such kernel machines.
On Likelihood Ratio Tests When Nuisance Parameters Are Present Only Under The Alternative, Cz Di, K-Y Liang
Chongzhi Di
In parametric models, when one or more parameters disappear under the null hypothesis, the likelihood ratio test statistic does not converge to a chi-square distribution. Rather, its limiting distribution is shown to be equivalent to that of the supremum of a squared Gaussian process. However, the limiting distribution is analytically intractable for most examples, and approximation- or simulation-based methods must be used to calculate the p-values. In this article, we investigate conditions under which the asymptotic distributions have analytically tractable forms, based on the principal component decomposition of Gaussian processes. When these conditions are not satisfied, the principal …
Beta Binomial Regression, Joseph M. Hilbe
Joseph M Hilbe
Monograph on how to construct, interpret, and evaluate beta, beta-binomial, and zero-inflated beta-binomial regression models. Stata and R code are used for the examples.
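The beta-binomial model above arises by letting a binomial success probability follow a Beta(a, b) distribution and marginalizing, which accommodates overdispersion that a plain binomial cannot. A minimal Python sketch of the resulting pmf (the monograph uses Stata and R; in a regression setting the mean a/(a+b) would be linked to covariates, which this sketch does not show):

```python
from math import lgamma, exp

def log_beta(a, b):
    # log of the Beta function via log-gamma, for numerical stability
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for a binomial whose success probability is Beta(a, b):
    C(n, k) * B(k + a, n - k + b) / B(a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

# The pmf sums to 1; with a = b the distribution is symmetric in k
n, a, b = 10, 2.0, 2.0
pmf = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
print(round(sum(pmf), 6))  # → 1.0
```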
諸外国における最新のデータエディティング事情~混淆正規分布モデルによる多変量外れ値検出法の検証~(高橋将宜、選択的エディティング、セレクティブエディティング), Masayoshi Takahashi
Masayoshi Takahashi
No abstract provided.
Caimans - Semantic Platform For Advance Content Mining (Sketch Wp), Salvo Reina
Salvo Reina
A middleware software platform was created for automatic classification of textual content. The requirements worksheet and the original flow sketches are published.
Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi
Jeffrey S. Morris
Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.
Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.
Design: A single-arm, phase II trial.
Patients: Twenty-seven patients with FAP.
Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.
Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were …
Bayesian Nonparametric Regression And Density Estimation Using Integrated Nested Laplace Approximations, Xiaofeng Wang
Xiaofeng Wang
Integrated nested Laplace approximation (INLA) is a recently proposed approximate Bayesian approach to fitting structured additive regression models with a latent Gaussian field. The INLA method, as an alternative to Markov chain Monte Carlo techniques, provides accurate approximations to posterior marginals and avoids time-consuming sampling. We show here that two classical nonparametric smoothing problems, nonparametric regression and density estimation, can be handled using INLA. Simulated examples and R functions are presented to illustrate the use of the methods. Some discussion of potential applications of INLA is given in the paper.
Using Methods From The Data-Mining And Machine-Learning Literature For Disease Classification And Prediction: A Case Study Examining Classification Of Heart Failure Subtypes, Peter C. Austin
Peter Austin
OBJECTIVE: Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine-learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines.
STUDY DESIGN AND SETTING: We compared the performance of these classification methods with that of conventional classification trees to classify patients with heart failure (HF) …
Predictive Accuracy Of Risk Factors And Markers: A Simulation Study Of The Effect Of Novel Markers On Different Performance Measures For Logistic Regression Models, Peter C. Austin
Peter Austin
The change in the c-statistic is frequently used to summarize the change in predictive accuracy when a novel risk factor is added to an existing logistic regression model. We explored the relationship between the absolute change in the c-statistic, Brier score, generalized R², and the discrimination slope when a risk factor was added to an existing model in an extensive set of Monte Carlo simulations. The increase in model accuracy due to the inclusion of a novel marker was proportional to both the prevalence of the marker and the odds ratio relating the marker to the outcome but inversely …
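The c-statistic discussed above is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. A minimal Python sketch of the rank-based computation, with made-up predicted risks standing in for a base model and a model with a novel marker added (the data are illustrative only, not from the study):

```python
def c_statistic(scores, labels):
    """Concordance (c-statistic / AUC): the probability that a randomly
    chosen case (label 1) receives a higher predicted risk than a
    randomly chosen control (label 0), with ties counting one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    concordant = sum(
        1.0 if c > d else 0.5 if c == d else 0.0
        for c in cases for d in controls
    )
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted risks before and after adding a novel marker;
# the difference in c between the two models is the quantity studied above.
labels = [0, 0, 0, 1, 1, 1]
base = [0.10, 0.40, 0.35, 0.35, 0.80, 0.90]
with_marker = [0.05, 0.30, 0.20, 0.60, 0.85, 0.95]
print(c_statistic(base, labels), c_statistic(with_marker, labels))
```

This all-pairs loop is quadratic; for large cohorts the same quantity is usually computed from ranks (equivalent to the Wilcoxon-Mann-Whitney statistic).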
Nbr2 Errata And Comments, Joseph Hilbe
Joseph M Hilbe
Errata and Comments for Negative Binomial Regression, 2nd edition
International Astrostatistics Association, Joseph Hilbe
Joseph M Hilbe
Overview of the history, purpose, Council, and officers of the International Astrostatistics Association (IAA).
諸外国のデータエディティング及び混淆正規分布モデルによる多変量外れ値検出法についての研究(高橋将宜、選択的エディティング、セレクティブエディティング), Masayoshi Takahashi
Masayoshi Takahashi
No abstract provided.
Glme3 Data And Ado/Do Files, Joseph Hilbe
Joseph M Hilbe
A listing of the data sets, Stata commands, and do-files in the GLME3 book.
Statistical Methods For Proteomic Biomarker Discovery Based On Feature Extraction Or Functional Modeling Approaches, Jeffrey S. Morris
Jeffrey S. Morris
In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational …
Integrative Bayesian Analysis Of High-Dimensional Multi-Platform Genomics Data, Wenting Wang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris, Bradley M. Broom, Ganiraju C. Manyam, Kim-Anh Do
Jeffrey S. Morris
Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current integration approaches are limited in that they do not consider the fundamental biological relationships that exist among the data from the different platforms.
Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses a hierarchical modeling technique to combine the data obtained from multiple platforms …
Proportional Mean Residual Life Model For Right-Censored Length-Biased Data, Gary Kwun Chuen Chan, Ying Qing Chen, Chongzhi Di
Chongzhi Di
To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcome, however, such a sampling strategy may be length-biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to the so-called “induced informative censoring” in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes and Dasu (1990) for analysis of censored length-biased survival data. Several nonstandard data structures, …
Comparing The Cohort Design And The Nested Case-Control Design In The Presence Of Both Time-Invariant And Time-Dependent Treatment And Competing Risks: Bias And Precision, Peter C. Austin
Peter Austin
Purpose: Observational studies using electronic administrative health care databases are often used to estimate the effects of treatments and exposures. Traditionally, a cohort design has been used to estimate these effects, but increasingly studies are using a nested case-control (NCC) design. The relative statistical efficiency of these two designs has not been examined in detail.
Methods: We used Monte Carlo simulations to compare these two designs in terms of the bias and precision of effect estimates. We examined three different settings: (A): treatment occurred at baseline and there was a single outcome of interest; (B): treatment was time-varying and there …
Using Ensemble-Based Methods For Directly Estimating Causal Effects: An Investigation Of Tree-Based G-Computation, Peter C. Austin
Peter Austin
Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one requires the potential outcome under each possible treatment. However, only the outcome under the actual treatment received is observed, whereas the potential outcomes under the other treatments are considered missing data. Some authors have proposed that parametric regression models be used to estimate potential outcomes. In this study, we examined the use of …
Regression Trees For Predicting Mortality In Patients With Cardiovascular Disease: What Improvement Is Achieved By Using Ensemble-Based Methods?, Peter C. Austin
Peter Austin
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1991-2001 and …