Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

1,362 Full-Text Articles 2,012 Authors 853,222 Downloads 156 Institutions

All Articles in Statistical Models

1,362 full-text articles. Page 43 of 53.

Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. van der Laan 2013 UC Berkeley, School of Public Health-Division of Biostatistics

U.C. Berkeley Division of Biostatistics Working Paper Series

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …


Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. van der Laan 2013 University of California, Berkeley

Laura B. Balzer

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …


Automating Large-Scale Simulation Calibration To Real-World Sensor Data, Richard Everett Edwards 2013 University of Tennessee, Knoxville

Doctoral Dissertations

Many key decisions and design policies are made using sophisticated computer simulations. However, these sophisticated computer simulations have several major problems. The two main issues are 1) gaps between the simulation model and the actual structure, and 2) limitations of the modeling engine's capabilities. This dissertation's goal is to address these simulation deficiencies by presenting a general automated process for tuning simulation inputs such that simulation output matches real world measured data. The automated process involves the following key components -- 1) Identify a model that accurately estimates the real world simulation calibration target from measured sensor data; 2) Identify …


Integrative Biomarker Identification And Classification Using High Throughput Assays, Pan Tong 2013 The University of Texas Graduate School of Biomedical Sciences at Houston

Dissertations & Theses (Open Access)

It is well accepted that tumorigenesis is a multi-step process involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA and SNP microarrays …


Does The SAT Predict Academic Achievement And Academic Choices At Macalester College?, Jing Wen 2013 Macalester College

Mathematics, Statistics, and Computer Science Honors Projects

This paper examines the predictive power of the Scholastic Aptitude Test (SAT) for Macalester students’ college success and academic choices. We use linear regression to study whether the SAT can predict students’ first-year or four-year grades. Using Kullback-Leibler divergence and classification trees, we also examine the SAT’s predictive ability for other aspects of students’ academic experience, such as major selection or academic division of study. After controlling for major and course level, we find that the SAT does not explain a large proportion of the variability in Macalester students’ college success. However, the SAT does provide some useful information …
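
As a rough illustration of the Kullback-Leibler comparison mentioned in the abstract above, the sketch below measures how far the distribution of academic divisions in one group of students sits from that in another. The two distributions and the names `high_sat` and `low_sat` are hypothetical and are not taken from the paper; the abstract does not say how the divergence was applied.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q), in nats, for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a term with p_i = 0 contributes nothing
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical shares of students across three academic divisions.
high_sat = [0.5, 0.3, 0.2]
low_sat = [0.3, 0.3, 0.4]

divergence = kl_divergence(high_sat, low_sat)
print(f"D(high || low) = {divergence:.4f} nats")
```

A divergence of zero would mean the two groups choose divisions in identical proportions; larger values indicate that the grouping variable carries more information about the choice.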


Modeling The Relationship Between Coal Mining And Respiratory Health In West Virginia, Jessica Welch 2013 University of Tennessee, Knoxville

Chancellor’s Honors Program Projects

No abstract provided.


Seasonal Decomposition For Geographical Time Series Using Nonparametric Regression, Hyukjun Gweon 2013 The University of Western Ontario

Electronic Thesis and Dissertation Repository

A time series often contains various systematic effects such as trends and seasonality. These different components can be determined and separated by decomposition methods. In this thesis, we discuss a time series decomposition process using nonparametric regression. A method based on both loess and harmonic regression is suggested and an optimal model selection method is discussed. We then compare the process with seasonal-trend decomposition by loess, STL (Cleveland, 1979). While STL works well when proper parameters are used, the method we introduce is also competitive: it makes parameter choice more automatic and less complex. The decomposition process often requires that …


A New Diagnostic Test For Regression, Yun Shi 2013 The University of Western Ontario

Electronic Thesis and Dissertation Repository

A new diagnostic test for regression and generalized linear models is discussed. The test checks whether residuals that are close together in the linear space of one of the covariates are correlated. This is a generalization of the famous problem of spurious correlation in time series regression. A full model building approach for the case of regression was developed in Mahdi (2011, Ph.D. Thesis, Western University, “Diagnostic Checking, Time Series and Regression”) using an iterative generalized least squares algorithm. Simulation experiments were reported that demonstrate the validity and utility of this approach but no actual applications were …


Economics And Attitude: The Effects Of Happiness On Economic Development, Chris Pace 2013 Stephen F Austin State University

Undergraduate Research Conference

No abstract provided.


Is Obesity Socially Contagious?, Ciani Jean Sparks 2013 California Polytechnic State University, San Luis Obispo

Statistics

The main objective of this paper is to analyze three different articles that discuss whether obesity could be socially contagious. According to the World Health Organization in 2013, obesity is the fifth leading risk for deaths around the world. This disease has dramatically increased in the last decade, which has led scientists to believe there are other factors contributing to the epidemic besides genetics. The first article I analyzed, written by Nicholas Christakis and James Fowler, provided a logistic regression model to estimate the odds of a person becoming obese. The model included the explanatory variables: age, sex, education, smoking …


A Bayesian Regression Tree Approach To Identify The Effect Of Nanoparticles Properties On Toxicity Profiles, Cecile Low-Kam, Haiyuan Zhang, Zhaoxia Ji, Tian Xia, Jeffrey I. Zinc, Andre Nel, Donatello Telesca 2013 UCLA Biostatistics - CNSI

COBRA Preprint Series

We introduce a Bayesian multiple regression tree model to characterize relationships between physico-chemical properties of nanoparticles and their in-vitro toxicity over multiple doses and times of exposure. Unlike conventional models that rely on data summaries, our model solves the low sample size issue and avoids arbitrary loss of information by combining all measurements from a general exposure experiment across doses, times of exposure, and replicates. The proposed technique integrates Bayesian trees for modeling threshold effects and interactions, and penalized B-splines for smoothing dose- and time-response surfaces. The resulting posterior distribution is sampled via a Markov Chain Monte Carlo algorithm. This …


Influenza Forecasting With Google Flu Trends, Andrea Freyer Dugas, Mehdi Jalalpour, Yulia Gel, Scott Levin, Fred Torcaso, Takeru Igusa, Richard E. Rothman 2013 Johns Hopkins University

Civil and Environmental Engineering Faculty Publications

Background: We developed a practical influenza forecast model based on real-time, geographically focused, and easily accessible data, designed to provide individual medical centers with advance warning of the expected number of influenza cases, thus allowing for sufficient time to implement interventions. Secondly, we evaluated the effects of incorporating a real-time influenza surveillance system, Google Flu Trends, and meteorological and temporal information on forecast accuracy.

Methods: Forecast models designed to predict one week in advance were developed from weekly counts of confirmed influenza cases over seven seasons (2004–2011) divided into seven training and out-of-sample verification sets. Forecasting procedures …


Persistence And Anti-Persistence: Theory And Software, Justin Quinn Veenstra 2013 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Persistent and anti-persistent time series processes show what is called hyperbolic decay. Such series play an important role in the study of many diverse areas such as geophysics and financial economics. They are also of theoretical interest. Fractional Gaussian noise (FGN) and fractionally differenced white noise are two widely known examples of time series models with hyperbolic decay. New closed form expressions are obtained for the spectral density functions of these models. Two lesser known time series models exhibiting hyperbolic decay are introduced and their basic properties are derived. A new algorithm for approximate likelihood estimation of the models using frequency …
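
To make the notion of hyperbolic decay concrete, the sketch below evaluates the standard autocovariance function of fractional Gaussian noise, gamma(k) = (sigma^2 / 2) (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), where H is the Hurst parameter. This is the textbook formula and is not taken from the thesis; the function name and the chosen values of H are illustrative assumptions.

```python
def fgn_autocovariance(lag, hurst, variance=1.0):
    """Autocovariance of fractional Gaussian noise at an integer lag.

    hurst is the Hurst parameter H with 0 < H < 1: H > 0.5 gives a
    persistent series, H < 0.5 an anti-persistent one, H = 0.5 white noise.
    """
    if not 0.0 < hurst < 1.0:
        raise ValueError("hurst must lie strictly between 0 and 1")
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * variance * (
        (k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h
    )


# Persistent case: autocovariances are positive and shrink like k**(2H - 2),
# a power law rather than the geometric decay of a stationary ARMA process.
for k in (1, 10, 100, 1000):
    print(k, fgn_autocovariance(k, hurst=0.8))
```

For large lags the ratio gamma(2k) / gamma(k) approaches 2^(2H - 2), which is one quick numerical check that the decay is hyperbolic.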


Global Quantitative Assessment Of The Colorectal Polyp Burden In Familial Adenomatous Polyposis Using A Web-Based Tool, Patrick M. Lynch, Jeffrey S. Morris, William A. Ross, Miguel A. Rodriguez-Bigas, Juan Posadas, Rossa Khalaf, Diane M. Weber, Valerie O. Sepeda, Bernard Levin, Imad Shureiqi 2013 The University of Texas M.D. Anderson Cancer Center

Jeffrey S. Morris

Background: Accurate measures of the total polyp burden in familial adenomatous polyposis (FAP) are lacking. Current assessment tools include polyp quantitation in limited-field photographs and qualitative total colorectal polyp burden by video.

Objective: To develop global quantitative tools of the FAP colorectal adenoma burden.

Design: A single-arm, phase II trial.

Patients: Twenty-seven patients with FAP.

Intervention: Treatment with celecoxib for 6 months, with before-treatment and after-treatment videos posted to an intranet with an interactive site for scoring.

Main Outcome Measurements: Global adenoma counts and sizes (grouped into categories: less than 2 mm, 2-4 mm, and greater than 4 mm) were …


Systems Factorial Technology With R, Joseph W. Houpt, Leslie M. Blaha, John P. McIntire, Paul R. Havig, James T. Townsend 2013 Wright State University - Main Campus

Joseph W. Houpt

Systems Factorial Technology (SFT) comprises a set of powerful nonparametric models and measures, together with a theory-driven experiment methodology termed the Double Factorial Paradigm (DFP), for assessing the cognitive information processing mechanisms supporting the processing of multiple sources of information in a given task. We provide an overview of the model-based measures of SFT together with a tutorial on designing a DFP experiment to take advantage of all SFT measures in a single experiment. Illustrative examples are given to highlight the breadth of applicability of these techniques across psychology. We further introduce and demonstrate a new package for performing SFT …


Bayesian Nonparametric Regression And Density Estimation Using Integrated Nested Laplace Approximations, Xiaofeng Wang 2013 Cleveland Clinic Lerner Research Institute

Xiaofeng Wang

Integrated nested Laplace approximations (INLA) are a recently proposed approximate Bayesian approach to fit structured additive regression models with a latent Gaussian field. The INLA method, as an alternative to Markov chain Monte Carlo techniques, provides accurate approximations of posterior marginals and avoids time-consuming sampling. We show here that two classical nonparametric smoothing problems, nonparametric regression and density estimation, can be solved using INLA. Simulated examples and R functions are presented to illustrate the use of the methods. Potential applications of INLA are also discussed.


Using Methods From The Data-Mining And Machine-Learning Literature For Disease Classification And Prediction: A Case Study Examining Classification Of Heart Failure Subtypes, Peter C. Austin 2013 Institute for Clinical Evaluative Sciences

Peter Austin

OBJECTIVE: Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine-learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines.

STUDY DESIGN AND SETTING: We compared the performance of these classification methods with that of conventional classification trees to classify patients with heart failure (HF) …


Predictive Accuracy Of Risk Factors And Markers: A Simulation Study Of The Effect Of Novel Markers On Different Performance Measures For Logistic Regression Models, Peter C. Austin 2013 Institute for Clinical Evaluative Sciences

Peter Austin

The change in c-statistic is frequently used to summarize the change in predictive accuracy when a novel risk factor is added to an existing logistic regression model. We explored the relationship between the absolute change in the c-statistic, Brier score, generalized R², and the discrimination slope when a risk factor was added to an existing model in an extensive set of Monte Carlo simulations. The increase in model accuracy due to the inclusion of a novel marker was proportional to both the prevalence of the marker and to the odds ratio relating the marker to the outcome but inversely …
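
Two of the performance measures named in the abstract above can be computed directly from predicted risks. The sketch below uses the usual definitions: the c-statistic as the proportion of event/non-event pairs in which the event case has the higher predicted risk (ties counted as one half), and the Brier score as the mean squared difference between predicted risk and outcome. The outcomes and predicted risks are made-up values for illustration, not results from the study.

```python
def c_statistic(y_true, y_prob):
    """Concordance probability over all event/non-event pairs (ties count 1/2)."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted risks and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes and predicted risks before and after adding a marker.
outcomes = [0, 0, 1, 0, 1, 1]
risk_base = [0.2, 0.4, 0.3, 0.5, 0.6, 0.7]
risk_with_marker = [0.1, 0.3, 0.5, 0.4, 0.7, 0.8]

delta_c = c_statistic(outcomes, risk_with_marker) - c_statistic(outcomes, risk_base)
delta_brier = brier_score(outcomes, risk_with_marker) - brier_score(outcomes, risk_base)
print(f"change in c-statistic: {delta_c:+.3f}")
print(f"change in Brier score: {delta_brier:+.3f}")
```

A higher c-statistic and a lower Brier score both indicate better predictions, so an informative marker should give a positive first change and a negative second one.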


A Superposed Log-Linear Failure Intensity Model For Repairable Artillery Systems, Byeong Min Mun, Suk Joo Bae, Paul Kvam 2013 University of Richmond

Department of Math & Statistics Faculty Publications

This article investigates complex repairable artillery systems that include several failure modes. We derive a superposed process based on a mixture of nonhomogeneous Poisson processes in a minimal repair model. This allows for a bathtub-shaped failure intensity that models artillery data better than currently used methods. The method of maximum likelihood is used to estimate model parameters and construct confidence intervals for the cumulative intensity of the superposed process. Finally, we propose an optimal maintenance policy for repairable systems with bathtub-shaped intensity and apply it to the artillery-failure data.
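
One way to picture how superposing log-linear intensities can produce a bathtub shape is sketched below. It assumes, purely for illustration, two failure modes with intensities exp(a1 + b1 t) and exp(a2 + b2 t), where b1 < 0 (early failures dying out) and b2 > 0 (wear-out). The abstract does not give the article's exact parameterization, and all parameter values and function names here are hypothetical.

```python
import math


def intensity(t, modes):
    """Superposed failure intensity: sum of exp(a + b * t) over failure modes."""
    return sum(math.exp(a + b * t) for a, b in modes)


def cumulative_intensity(t, modes):
    """Expected number of failures in [0, t], integrating each mode in closed form."""
    total = 0.0
    for a, b in modes:
        if b == 0.0:
            total += math.exp(a) * t  # constant-rate mode
        else:
            total += math.exp(a) * (math.exp(b * t) - 1.0) / b
    return total


def bathtub_trough(decreasing, increasing):
    """Time at which the sum of one decreasing and one increasing mode is lowest."""
    (a1, b1), (a2, b2) = decreasing, increasing
    if not (b1 < 0.0 < b2):
        raise ValueError("need b1 < 0 for the first mode and b2 > 0 for the second")
    # Solve b1 * exp(a1 + b1 * t) + b2 * exp(a2 + b2 * t) = 0 for t.
    return (math.log(-b1 / b2) + a1 - a2) / (b2 - b1)


# Hypothetical parameters: (a, b) for an early-failure and a wear-out mode.
early, wear_out = (0.0, -0.5), (-4.0, 0.3)
modes = [early, wear_out]

t_min = bathtub_trough(early, wear_out)
print(f"intensity is lowest near t = {t_min:.2f}")
print(f"expected failures by t = 10: {cumulative_intensity(10.0, modes):.3f}")
```

Under a minimal repair model the cumulative intensity is the expected number of failures up to time t, which is the quantity the abstract says confidence intervals are constructed for.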


Gulf-Wide Decreases In The Size Of Large Coastal Sharks Documented By Generations Of Fishermen, Sean P. Powers, F. Joel Frodrie, Steven B. Scyphers, J. Marcus Drymon, Robert L. Shipp, Gregory W. Stunz 2013 University of South Alabama

University Faculty and Staff Publications

Large sharks are top predators in most coastal and marine ecosystems throughout the world, and evidence of their reduced prominence in marine ecosystems has been a serious concern for fisheries and ecosystem management. Unfortunately, quantitative data to document the extent, timing, and consequences of changes in shark populations are scarce, thwarting examination of long-term (decadal, century) trends, and reconstructions based on incomplete data sets have been the subject of debate. Absence of quantitative descriptors of past ecological conditions is a generic problem facing many fields of science but is particularly troublesome for fisheries scientists who must develop specific targets for …

