Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

2013


Articles 1 - 30 of 49

Full-Text Articles in Statistical Models

Adjusted Tornado Probabilities, Holly M. Widen, James B. Elsner, Cameron Amrine, Rizalino B. Cruz, Erik Fraza, Laura Michaels, Loury Migliorelli, Brendan Mulholland, Michael Patterson, Sarah Strazzo, Guang Xing Dec 2013

Publications

Tornado occurrence rates computed from the available reports are biased low relative to the unknown true rates. To correct for this low bias, the authors demonstrate a method to estimate the annual probability of being struck by a tornado that uses the average report density estimated as a function of distance from the nearest city/town center. The method is demonstrated on Kansas and then applied to 15 other tornado-prone states from Nebraska to Tennessee. States are ranked according to their adjusted tornado rate and comparisons are made with raw rates published elsewhere. The adjusted rates, expressed as return periods, are … states, including …


Estimation And Inference For Spatial And Spatio-Temporal Mixed Effects Models, Casey M. Jelsema Dec 2013

Dissertations

One of the most common goals of geostatistical analysis is spatial prediction, in other words, filling in the blank areas of the map. There are two popular methods for accomplishing spatial prediction: kriging and Bayesian hierarchical models. Both methods require the inverse of the spatial covariance matrix of the data. As the sample size, n, becomes large, both of these methods become impractical. Reduced rank spatial models (RRSMs) allow prediction on massive datasets without compromising the complexity of the spatial process. This dissertation focuses on RRSMs, particularly situations where the data follow non-Gaussian distributions.

The manner in …
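The covariance-inversion bottleneck described in this abstract is the usual motivation for reduced rank spatial models: if the covariance has the low-rank-plus-nugget form Σ = SKSᵀ + σ²I, the Sherman–Morrison–Woodbury identity yields the inverse at a cost driven by the rank r rather than by n. A minimal numpy sketch (the basis matrix S, rank, and variances here are hypothetical, not taken from the dissertation):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 500, 10                   # n observations, rank-r basis (r << n)
S = rng.normal(size=(n, r))      # hypothetical basis-function matrix
K = np.eye(r)                    # covariance of the r random effects
sigma2 = 0.5                     # nugget / measurement-error variance

# Full covariance: Sigma = S K S' + sigma2 * I  (n x n)
Sigma = S @ K @ S.T + sigma2 * np.eye(n)

# Woodbury identity: only an r x r system is solved instead of an n x n one
inner = np.linalg.inv(np.linalg.inv(K) + S.T @ S / sigma2)
Sigma_inv = np.eye(n) / sigma2 - (S @ inner @ S.T) / sigma2**2
```

The identity is exact, so `Sigma_inv` matches the direct O(n³) inverse while only ever inverting r × r matrices.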


Observed Versus Gcm-Generated Local Tropical Cyclone Frequency: Comparisons Using A Spatial Lattice, Sarah Strazzo, Daniel J. Halperin, James Elsner, Tim Larow, Ming Zhao Nov 2013

Publications

Of broad scientific and public interest is the reliability of global climate models (GCMs) to simulate future regional and local tropical cyclone (TC) occurrences. Atmospheric GCMs are now able to generate vortices resembling actual TCs, but questions remain about their fidelity to observed TCs. Here the authors demonstrate a spatial lattice approach for comparing actual with simulated TC occurrences regionally using observed TCs from the International Best Track Archive for Climate Stewardship (IBTrACS) dataset and GCM-generated TCs from the Geophysical Fluid Dynamics Laboratory (GFDL) High Resolution Atmospheric Model (HiRAM) and Florida State University (FSU) Center for Ocean–Atmospheric Prediction Studies (COAPS) …


Application Of The Rasch Model To Measure Five Dimensions Of Wellness In Community-Dwelling Older Adults, Kelley A. Strout Dr Nov 2013

Nursing Faculty Scholarship

Background and Purpose: Nurse researchers and practicing nurses need reliable and valid instruments to measure key clinical concepts. The purpose of this research was to develop an innovative method to measure dimensions of wellness among older adults. Method: A sample of 5,604 community-dwelling older adults was drawn from members of the COLLAGE consortium. The Wellness Assessment Tool (WEL) of the COLLAGE assessment system provided the data used to create the scores. Application of the Rasch analysis and Masters' partial credit method resulted in logit values for each item within the five dimensions of wellness as well as logit values for …
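The logit values mentioned in this abstract have a direct probabilistic reading in the Rasch model: the chance that a person at trait level θ endorses an item of difficulty b is the inverse logit of θ − b. A tiny illustrative sketch (the numbers are hypothetical, not from the WEL data):

```python
import numpy as np

def rasch_prob(theta: float, b: float) -> float:
    """Rasch model: probability that a person at trait level theta
    endorses an item of difficulty b (both on the logit scale)."""
    return float(1.0 / (1.0 + np.exp(-(theta - b))))

# A person one logit above an item's difficulty endorses it ~73% of the time
p = rasch_prob(1.0, 0.0)
```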


Hierarchical Vector Auto-Regressive Models And Their Applications To Multi-Subject Effective Connectivity, Cristina Gorrostieta, Mark Fiecas, Hernando Ombao, Erin Burke, Steven Cramer Oct 2013

Mark Fiecas

Vector auto-regressive (VAR) models typically form the basis for constructing directed graphical models for investigating connectivity in a brain network with brain regions of interest (ROIs) as nodes. The standard VAR model has limitations, however. The number of parameters increases quadratically with the number of ROIs and linearly with the order of the model, so the sheer number of parameters can pose serious estimation problems. Moreover, when applied to imaging data, the standard VAR model does not account for variability in the connectivity structure across all subjects. In this paper, …
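The quadratic-in-ROIs, linear-in-order growth is easy to make concrete: a VAR(p) model on K series has pK² autoregressive coefficients (intercepts aside). A quick sketch with illustrative ROI counts:

```python
def var_param_count(n_rois: int, order: int) -> int:
    """Autoregressive-coefficient count for a VAR(order) model on n_rois
    series (intercepts excluded): quadratic in ROIs, linear in the order."""
    return order * n_rois ** 2

# e.g. 20 ROIs with a lag-2 model already require 800 coefficients
n_params = var_param_count(20, 2)
```

Doubling the number of ROIs quadruples the parameter count, while doubling the model order only doubles it.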


Beta Binomial Regression, Joseph M. Hilbe Oct 2013

Joseph M Hilbe

A monograph on how to construct, interpret, and evaluate beta, beta-binomial, and zero-inflated beta-binomial regression models. Stata and R code are used for the examples.
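The monograph's Stata and R code is not reproduced here, but the core of a beta-binomial fit can be sketched in Python with scipy: maximize the beta-binomial log-likelihood over (a, b), using a log-parameterization to keep both shape parameters positive. The simulated data, trial count, and starting values below are illustrative only:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import betabinom

rng = np.random.default_rng(1)
n_trials = 20
a_true, b_true = 2.0, 5.0
y = betabinom.rvs(n_trials, a_true, b_true, size=2000, random_state=rng)

def neg_loglik(log_params):
    # log-parameterization keeps a, b > 0 during optimization
    a, b = np.exp(log_params)
    return -betabinom.logpmf(y, n_trials, a, b).sum()

fit = minimize(neg_loglik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(fit.x)
```

With 2,000 simulated observations the maximum-likelihood estimates land close to the generating values.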


Data Analysis Using Regression Modeling: Visual Display And Setup Of Simple And Complex Statistical Models, Emil N. Coman, Maria A. Coman, Eugen Iordache, Russell Barbour, Lisa Dierker Sep 2013

Yale Day of Data

We present visual modeling solutions for testing simple and more advanced statistical hypotheses in any research field. All models can be directly specified in analytical software like Mplus or R.

Data analysis in any substantive field can be easily accomplished by translating statistical tests into the intuitive language of regression-based path diagrams with observed and unobserved variables. All models we present can be directly specified and estimated in analytical software.

Students can particularly benefit from being taught the simple regression modeling setup of the path analytical method, as it empowers them to apply the techniques to any data to test …


Net Reclassification Index: A Misleading Measure Of Prediction Improvement, Margaret Sullivan Pepe, Holly Janes, Kathleen F. Kerr, Bruce M. Psaty Sep 2013

UW Biostatistics Working Paper Series

The evaluation of biomarkers to improve risk prediction is a common theme in modern research. Since its introduction in 2008, the net reclassification index (NRI) (Pencina et al. 2008, Pencina et al. 2011) has gained widespread use as a measure of prediction performance with over 1,200 citations as of June 30, 2013. The NRI is considered by some to be more sensitive to clinically important changes in risk than the traditional change in the AUC (Delta AUC) statistic (Hlatky et al. 2009). Recent statistical research has raised questions, however, about the validity of conclusions based on the NRI. (Hilden and …
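For readers unfamiliar with the statistic under discussion, the NRI sums two reclassification components, computed separately for events and non-events: upward risk movements count as correct for events, downward movements as correct for non-events. A sketch with hypothetical counts (not data from the paper):

```python
def net_reclassification_index(up_events, down_events, n_events,
                               up_nonevents, down_nonevents, n_nonevents):
    """Categorical NRI in the style of Pencina et al. (2008): net proportion
    of events moving up in risk plus net proportion of non-events moving down."""
    nri_events = (up_events - down_events) / n_events
    nri_nonevents = (down_nonevents - up_nonevents) / n_nonevents
    return nri_events + nri_nonevents

# Toy counts: 30 of 100 events reclassified up and 10 down;
# 15 of 400 non-events reclassified down and 25 up.
nri = net_reclassification_index(30, 10, 100, 25, 15, 400)  # → 0.175
```

A positive NRI here reflects net "correct" movement, which, as the working paper argues, need not mean the new marker genuinely improves prediction.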


Pricing And Hedging Index Options With A Dominant Constituent Stock, Helen Cheyne Aug 2013

Electronic Thesis and Dissertation Repository

In this paper, we examine the pricing and hedging of an index option where one constituent stock plays an overly dominant role in the index. Under a Geometric Brownian Motion assumption, we compare the distribution of the relative value of the index when the dominant stock is modeled separately from the rest of the index and when it is not. The former is equivalent to the relative index value being distributed as the sum of two lognormal random variables, and the latter as a single lognormal random variable. Since these are not equal in distribution, we compare the two models. The …
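The distributional distinction drawn in the abstract is easy to reproduce by simulation: under GBM each component is lognormal at a fixed horizon, so the two-component model makes the index a weighted sum of two lognormals, while the one-component model fits a single lognormal to the same log-moments. The weights and volatilities below are hypothetical, not calibrated to any market:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Index = dominant stock + rest of index, each lognormal at the horizon
dominant = np.exp(rng.normal(0.0, 0.4, n))   # high-volatility constituent
rest = np.exp(rng.normal(0.0, 0.1, n))
index_two = 0.6 * dominant + 0.4 * rest      # sum of two lognormals

# Single-lognormal alternative matched to the same sample log-moments
log_idx = np.log(index_two)
index_one = np.exp(rng.normal(log_idx.mean(), log_idx.std(), n))
```

The sum of two lognormals is not itself lognormal, so even after matching log-moments the two simulated models differ in shape, which is the paper's starting point for comparing prices and hedges.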


Modelling Credit Value Adjustment Using Defaultable Options Approach, Sidita Zhabjaku Aug 2013

Electronic Thesis and Dissertation Repository

This thesis calculates Credit Value Adjustment on defaultable options. The prices of defaultable European options are computed through analytical methods, quadrature approximation, and Monte Carlo simulation under the assumption of a constant rate of default. Subsequently, we propose to inversely relate the company's instantaneous rate of default to its underlying stock price, resulting in a non-constant rate of default. This allows for a new approach to estimating the default of a company, different from previous work where default is calculated from historical data. The rationale behind this idea relies on the fact that the price of the stock plunges before the …


Sensitivity Of Limiting Hurricane Intensity To Sst In The Atlantic From Observations And Gcms, James Elsner, Sarah Strazzo, Thomas H. Jagger, Timothy Larow, Ming Zhao Aug 2013

Publications

No abstract provided.


Frequency, Intensity, And Sensitivity To Sea Surface Temperature Of North Atlantic Tropical Cyclones In Best-Track And Simulated Data, Sarah Strazzo, James B. Elsner, Jill C. Trepanier, Kerry A. Emanuel Aug 2013

Publications

Synthetic hurricane track data generated from a downscaling approach are compared to best-track (observed) data to analyze differences in regional frequency, intensity, and sensitivity of limiting intensity to sea surface temperature (SST). Overall, the spatial distributions of observed and simulated hurricane counts match well, although there are relatively fewer synthetic storms in the eastern quarter of the basin. Additionally, regions of intense synthetic hurricanes tend to coincide with regions of intense observed hurricanes. The sensitivity of limiting hurricane intensity to SST computed from synthetic data is slightly lower than the sensitivity computed from observed data (5.5 ± 1.31 m s−1 (standard error, SE) …


Recent Data-Editing Practices in Other Countries: Evaluating a Multivariate Outlier Detection Method Based on a Mixture Normal Distribution Model (Masayoshi Takahashi; Selective Editing), Masayoshi Takahashi Aug 2013

Masayoshi Takahashi

No abstract provided.


Attributing Effects To Interactions, Tyler J. Vanderweele, Eric J. Tchetgen Tchetgen Jul 2013

Harvard University Biostatistics Working Paper Series

A framework is presented which allows an investigator to estimate the portion of the effect of one exposure that is attributable to an interaction with a second exposure. We show that when the two exposures are independent, the total effect of one exposure can be decomposed into a conditional effect of that exposure and a component due to interaction. The decomposition applies on difference or ratio scales. We discuss how the components can be estimated using standard regression models, and how these components can be used to evaluate the proportion of the total effect of the primary exposure attributable to …
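The difference-scale decomposition described in this abstract can be checked with a few lines of arithmetic: writing p_ge for the risk under exposure levels (G=g, E=e), with E independent of G, the total effect of G splits exactly into the conditional effect of G at E=0 plus P(E=1) times the additive interaction contrast. The risk values below are hypothetical, used only to verify the identity:

```python
# Risks p_ge under exposures G=g, E=e (hypothetical), with E independent of G
p00, p10, p01, p11 = 0.05, 0.10, 0.08, 0.25
pi_e = 0.3                                   # P(E = 1)

# Total effect of G on the risk, marginalizing over the independent exposure E
te = (p10 * (1 - pi_e) + p11 * pi_e) - (p00 * (1 - pi_e) + p01 * pi_e)

# Decomposition: conditional effect of G at E=0, plus an interaction component
conditional = p10 - p00
interaction = pi_e * (p11 - p10 - p01 + p00)

# Share of the total effect attributable to interaction with E
prop_interaction = interaction / te
```

With these numbers the total effect is 0.086, of which 0.036 (about 42%) comes through the interaction term.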


Caimans - Semantic Platform For Advance Content Mining (Sketch Wp), Salvo Reina Jul 2013

Salvo Reina

A middleware software platform was created for the automatic classification of textual contents. The worksheet of requirements and the original flow sketches are published.


Stochastic Dea With A Perfect Object And Its Application To Analysis Of Environmental Efficiency, Alexander Vaninsky Jul 2013

Publications and Research

The paper introduces stochastic DEA with a Perfect Object (SDEA PO). The Perfect Object (PO) is a virtual Decision Making Unit (DMU) that has the smallest inputs and greatest outputs. Including the PO in a collection of actual objects yields an explicit formula for the efficiency index. Given the distributions of DEA inputs and outputs, this formula allows us to derive the probability distribution of the efficiency score, to find its mathematical expectation, and to deliver common (group-related) and partial (object-related) efficiency components. We apply this approach to a prospective analysis of environmental efficiency of the major national and regional …


Inferences About Parameters Of Trivariate Normal Distribution With Missing Data, Xing Wang Jul 2013

FIU Electronic Theses and Dissertations

The multivariate normal distribution is commonly encountered in many fields, and missing values are a frequent issue in practice. The purpose of this research was to estimate the parameters of a three-dimensional permutation-symmetric normal distribution with complete data and with all possible patterns of incomplete data. In this study, MLEs with missing data were derived, and the properties of the MLEs as well as their sampling distributions were obtained. A Monte Carlo simulation study was used to evaluate the performance of the considered estimators both when ρ was known and when it was unknown. All results indicated that, compared to estimators in the …
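The permutation-symmetric structure in this abstract is the exchangeable covariance Σ = σ²[(1 − ρ)I + ρJ], so the whole trivariate covariance is governed by just two parameters. A small numpy sketch of the structure and its validity range (the parameter values are hypothetical):

```python
import numpy as np

def perm_symmetric_cov(sigma2: float, rho: float, d: int = 3) -> np.ndarray:
    """Covariance of a d-variate permutation-symmetric (exchangeable) normal:
    common variance sigma2 and common correlation rho."""
    return sigma2 * ((1 - rho) * np.eye(d) + rho * np.ones((d, d)))

cov = perm_symmetric_cov(2.0, 0.5)
# Eigenvalues are sigma2*(1 + (d-1)*rho) once and sigma2*(1 - rho) (d-1) times,
# so the matrix is positive definite exactly when -1/(d-1) < rho < 1.
eig = np.linalg.eigvalsh(cov)
```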


Statistical Inference For Data Adaptive Target Parameters, Mark J. Van Der Laan, Alan E. Hubbard, Sara Kherad Pajouh Jun 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target, we partition the sample into V equal-size sub-samples, and use this partitioning to define V splits into an estimation sample (one of the V sub-samples) and a complementary parameter-generating sample that is used to generate a target parameter. For each of the V parameter-generating samples, we apply an algorithm that maps the sample into a target parameter mapping, which represents the statistical target parameter generated by that parameter-generating …


A Robust Estimate For The Bifurcating Autoregressive Model With Application To Cell Lineage Data, Tamer M. E. Elbayoumi Jun 2013

Dissertations

The bifurcating autoregressive (BAR) model is commonly used to model binary tree data. One application of this model relates to cell lineage data in biology. The purpose of studying the cell lineage process is to learn whether the observed correlations between related cells are due to similarities in environmental effects, inherited effects, or a combination of the two. Because outliers in this kind of data are quite common, a robust estimation procedure is necessary. A weighted L1 (WL1) estimator of the parameters of the BAR model is considered. When the weights are constant, the estimate …


Iterative Statistical Verification Of Probabilistic Plans, Colin M. Potts May 2013

Lawrence University Honors Projects

Artificial intelligence seeks to create intelligent agents. An agent can be anything: an autopilot, a self-driving car, a robot, a person, or even an anti-virus system. While the current state of the art may not achieve intelligence (a rather dubious thing to quantify), it certainly achieves a sense of autonomy. A key aspect of an autonomous system is its ability to maintain and guarantee safety, defined as avoiding some set of undesired outcomes. The piece of software responsible for this is called a planner, which is essentially an automated problem solver. An advantage computer planners have over humans is their ability to consider and …


Targeted Maximum Likelihood Estimation For Dynamic And Static Longitudinal Marginal Structural Working Models, Maya L. Petersen, Joshua Schwab, Susan Gruber, Nello Blaser, Michael Schomaker, Mark J. Van Der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

This paper describes a targeted maximum likelihood estimator (TMLE) for the parameters of longitudinal static and dynamic marginal structural models. We consider a longitudinal data structure consisting of baseline covariates, time-dependent intervention nodes, intermediate time-dependent covariates, and a possibly time dependent outcome. The intervention nodes at each time point can include a binary treatment as well as a right-censoring indicator. Given a class of dynamic or static interventions, a marginal structural model is used to model the mean of the intervention specific counterfactual outcome as a function of the intervention, time point, and possibly a subset of baseline covariates. Because …


Modeling A Sensor To Improve Its Efficacy, Nabin K. Malakar, Daniil Gladkov, Kevin H. Knuth May 2013

Physics Faculty Scholarship

Robots rely on sensors to provide them with information about their surroundings. However, high-quality sensors can be extremely expensive and cost-prohibitive. Thus many robotic systems must make do with lower-quality sensors. Here we demonstrate via a case study how modeling a sensor can improve its efficacy when employed within a Bayesian inferential framework. As a test bed we employ a robotic arm that is designed to autonomously take its own measurements using an inexpensive LEGO light sensor to estimate the position and radius of a white circle on a black field. The light sensor integrates the light arriving from a …


Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. Van Der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …


Automating Large-Scale Simulation Calibration To Real-World Sensor Data, Richard Everett Edwards May 2013

Doctoral Dissertations

Many key decisions and design policies are made using sophisticated computer simulations. However, these sophisticated computer simulations have several major problems. The two main issues are 1) gaps between the simulation model and the actual structure, and 2) limitations of the modeling engine's capabilities. This dissertation's goal is to address these simulation deficiencies by presenting a general automated process for tuning simulation inputs such that simulation output matches real world measured data. The automated process involves the following key components -- 1) Identify a model that accurately estimates the real world simulation calibration target from measured sensor data; 2) Identify …


Integrative Biomarker Identification And Classification Using High Throughput Assays, Pan Tong May 2013

Dissertations & Theses (Open Access)

It is well accepted that tumorigenesis is a multi-step procedure involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis, and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high-throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA, and SNP microarrays …


Modeling The Relationship Between Coal Mining And Respiratory Health In West Virginia, Jessica Welch May 2013

Chancellor’s Honors Program Projects

No abstract provided.


Does The Sat Predict Academic Achievement And Academic Choices At Macalester College?, Jing Wen May 2013

Mathematics, Statistics, and Computer Science Honors Projects

This paper examines the predictive power of the Scholastic Aptitude Test (SAT) for Macalester students' college success and academic choices. We use linear regression to study whether the SAT can predict students' first-year or four-year grades. Using Kullback-Leibler divergence and classification trees, we also examine the SAT's predictive ability for other aspects of students' academic experience, such as major selection or academic division of study. After controlling for major and course level, we find that the SAT does not explain a large proportion of the variability in Macalester students' college success. However, the SAT does provide some useful information …
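The "proportion of variability explained" criterion in this abstract is the familiar R² of a linear regression. A self-contained sketch on synthetic data (the score distribution, coefficient, and noise level are invented for illustration, not Macalester data):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
sat = rng.normal(1300, 150, n)                     # hypothetical SAT scores
gpa = 2.0 + 0.0008 * sat + rng.normal(0, 0.4, n)   # weak linear relationship

# OLS fit of GPA on SAT and the share of variance explained (R^2)
X = np.column_stack([np.ones(n), sat])
beta, *_ = np.linalg.lstsq(X, gpa, rcond=None)
resid = gpa - X @ beta
r2 = 1 - resid.var() / gpa.var()
```

Even with a genuinely nonzero slope, the R² here is small, illustrating how a predictor can be statistically relevant yet explain little of the outcome's variability.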


Seasonal Decomposition For Geographical Time Series Using Nonparametric Regression, Hyukjun Gweon Apr 2013

Electronic Thesis and Dissertation Repository

A time series often contains various systematic effects such as trends and seasonality. These different components can be determined and separated by decomposition methods. In this thesis, we discuss a time series decomposition process using nonparametric regression. A method based on both loess and harmonic regression is suggested, and an optimal model selection method is discussed. We then compare the process with seasonal-trend decomposition by loess, STL (Cleveland, 1979). While STL works well when proper parameters are used, the method we introduce is also competitive: it makes parameter choice more automatic and less complex. The decomposition process often requires that …
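The harmonic-regression half of the approach can be sketched in its simplest form with ordinary least squares: regress the series on trend terms and one pair of seasonal harmonics, and take the remainder as what is left over. (The thesis pairs this with loess; a linear trend stands in here for brevity, and the monthly series is synthetic.)

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(120)                                   # 10 years of monthly data
period = 12
series = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / period) + rng.normal(0, 0.3, 120)

# Design matrix: intercept + linear trend + one seasonal harmonic (sin, cos)
X = np.column_stack([
    np.ones_like(t, dtype=float), t,
    np.sin(2 * np.pi * t / period),
    np.cos(2 * np.pi * t / period),
])
coef, *_ = np.linalg.lstsq(X, series, rcond=None)
trend = X[:, :2] @ coef[:2]
seasonal = X[:, 2:] @ coef[2:]
remainder = series - trend - seasonal
```

The fitted coefficients recover the generating trend slope and seasonal amplitude, and the remainder is left at roughly the noise scale.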


A New Diagnostic Test For Regression, Yun Shi Apr 2013

Electronic Thesis and Dissertation Repository

A new diagnostic test for regression and generalized linear models is discussed. The test is based on testing whether residuals that are close together in the linear space of one of the covariates are correlated. This is a generalization of the famous problem of spurious correlation in time series regression. A full model-building approach for the case of regression was developed in Mahdi (2011, Ph.D. Thesis, Western University, "Diagnostic Checking, Time Series and Regression") using an iterative generalized least squares algorithm. Simulation experiments were reported that demonstrate the validity and utility of this approach, but no actual applications were …
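One way to operationalize the idea in this abstract, under assumptions of our own since the abstract gives no details: sort the residuals by a covariate and measure their lag-1 autocorrelation. Under a correctly specified model the ordered residuals behave like noise, while misspecification leaves smooth, correlated structure:

```python
import numpy as np

def ordered_residual_autocorr(resid, covariate):
    """Lag-1 autocorrelation of residuals after ordering them by a covariate:
    a simple stand-in for 'nearby residuals are correlated'."""
    r = np.asarray(resid)[np.argsort(covariate)]
    r = r - r.mean()
    return float((r[:-1] * r[1:]).sum() / (r ** 2).sum())

rng = np.random.default_rng(5)
x = rng.uniform(-2, 2, 400)
y = x ** 2 + rng.normal(0, 0.1, 400)       # true relationship is quadratic

# Misspecified straight-line fit: residuals retain smooth structure in x
resid_lin = y - np.polyval(np.polyfit(x, y, 1), x)
rho_lin = ordered_residual_autocorr(resid_lin, x)

# Correctly specified quadratic fit: ordered residuals look like noise
resid_quad = y - np.polyval(np.polyfit(x, y, 2), x)
rho_quad = ordered_residual_autocorr(resid_quad, x)
```

Here `rho_lin` is large under the misspecified fit while `rho_quad` stays near zero, which is the contrast such a diagnostic exploits.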