Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

1141 Full-Text Articles, 1379 Authors, 272825 Downloads, 87 Institutions

All Articles in Statistical Models

Models As Weapons: Review Of Weapons Of Math Destruction: How Big Data Increases Inequality And Threatens Democracy By Cathy O’Neil (2016), Samuel L. Tunstall 2018 Michigan State University

Numeracy

Cathy O’Neil. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (New York, NY: Crown) 272 pp. ISBN 978-0553418811.

Accessible to a wide readership, Cathy O’Neil’s Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy provides a lucid yet alarming account of the extensive reach of mathematical models in influencing all of our lives. With a particular eye towards social justice, O’Neil not only warns modelers to be cognizant of the effects of their work on real people—especially vulnerable groups who have less power to fight back—but also ...


Modeling The Decline In English Passivization, Liwen Hou, David Smith 2018 Northeastern University

Proceedings of the Society for Computation in Linguistics

Evidence from the Hansard corpus shows that the passive voice in British English has declined in relative frequency over the last two centuries. We investigate which factors are predictive of whether transitive verb phrases are passivized. We show the increasing importance of the person-hierarchy effects observed by Bresnan et al. (2001), with increasing strength of the constraint against passivizing clauses with local agents, as well as the rising prevalence of such agents. Moreover, our ablation experiments on the Wall Street Journal and Hansard corpora provide support for the unmarked information structure of ‘given’ before ‘new’ noted by Halliday (1967).


Bayesian Model For Detection Of Outliers In Linear Regression With Application To Longitudinal Data, Zahraa Al-Sharea 2017 University of Arkansas, Fayetteville

Theses and Dissertations

Outlier detection is one of the most important challenges in many present-day applications. Outliers can occur due to uncertainty in data-generating mechanisms or due to an error in data recording/processing. Outliers can drastically change the study's results and make predictions less reliable. Detecting outliers in longitudinal studies is quite challenging because this kind of study works with observations that change over time. Therefore, the same subject can produce an outlier at one point in time yet produce regular observations at all other time points. A Bayesian hierarchical model assigns parameters that can quantify whether each observation is ...


Making Models With Bayes, Pilar Olid 2017 California State University, San Bernardino

Electronic Theses, Projects, and Dissertations

Bayesian statistics is an important approach to modern statistical analyses. It allows us to use our prior knowledge of the unknown parameters to construct a model for our data set. The foundation of Bayesian analysis is Bayes' Rule, which in its proportional form indicates that the posterior is proportional to the prior times the likelihood. We will demonstrate how we can apply Bayesian statistical techniques to fit a linear regression model and a hierarchical linear regression model to a data set. We will show how to apply different distributions to Bayesian analyses and how the use of a prior affects ...


Variational Bayes Estimation Of Time Series Copulas For Multivariate Ordinal And Mixed Data, Ruben Loaiza-Maya, Michael S. Smith 2017 Melbourne Business School

Michael Stanley Smith

We propose a new variational Bayes method for estimating high-dimensional copulas with discrete, or discrete and continuous, margins. The method is based on a variational approximation to a tractable augmented posterior, and is substantially faster than previous likelihood-based approaches. We use it to estimate drawable vine copulas for univariate and multivariate Markov ordinal and mixed time series. These have dimension $rT$, where $T$ is the number of observations and $r$ is the number of series, and are difficult to estimate using previous methods. The vine pair-copulas are carefully selected to allow for heteroskedasticity, which is a common feature of ordinal ...


Statistical Modelling, Optimal Strategies And Decisions In Two-Period Economies, Jiang Wu 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Motivated by some real problems, our thesis puts forward two general two-period pricing models and explores optimal buying and selling strategies in two states of the two-period decision, when buyer/seller's decisions in the two periods are uncertain: commodity valuations may or may not be independent, may or may not follow the same distribution, be heavily or just lightly influenced by exogenous economic conditions, and so on. For both the example of buying laptops and the example of selling houses, the connections between each example and the two-envelope paradox encourage us to explore optimal strategies based on the works ...


Data-Adaptive Kernel Support Vector Machine, Xin Liu 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

In this thesis, we propose the data-adaptive kernel Support Vector Machine (SVM), a new method with a data-driven scaling kernel function based on real data sets. This two-stage approach of kernel function scaling can enhance the accuracy of a support vector machine, especially when the data are imbalanced. Following the standard SVM procedure in the first stage, the proposed method locally adapts the kernel function to data locations based on the skewness of the class outcomes. In the second stage, the decision rule is constructed with the data-adaptive kernel function and is used as the classifier. This process enlarges ...


Using Multivariate Statistical Techniques To Aid In A Sports Index Construction, Tiffany Kelly 2017 ESPN Stats & Information Group

Mathematics Colloquium Series

Within a quantitative career, you are, or soon will be, challenged to create an overall value to explain a situational status: for example, socio-economic status, well-being, or, in this specific example, happiness among sports fans. This talk discusses my previous work, developed out of student research performed at NSU, in its application to my first project for ESPN Sports Analytics, the College Football Fan Happiness Index (http://es.pn/2vmParA). I will dive into the multivariate statistical techniques of principal component analysis and hierarchical clustering to create this happiness index from a slew of variables.


Latent Storm Factors And Their Indicators, Joy D'Andrea 2017 Illinois State University

Annual Symposium on Biomathematics and Ecology: Education and Research

No abstract provided.


Handguns And Hotspots: Spatio-Temporal Models For Gun Violence In Chicago, IL, Shelby Scott 2017 University of Tennessee, Knoxville

Annual Symposium on Biomathematics and Ecology: Education and Research

No abstract provided.


On The Estimation Of Penetrance In The Presence Of Competing Risks With Family Data, Daniel Prawira 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

In family studies, we are interested in estimating the penetrance function of the event of interest in the presence of competing risks. Failure to account for competing risks may lead to bias in the estimation of the penetrance function. In this thesis, three statistical challenges are addressed: clustering, missing data, and competing risks. We proposed the cause-specific model with shared frailty and ascertainment correction to account for clustering and competing risks, along with the ascertainment of families into the study. Multiple imputation is used to account for missing data. The simulation study showed good performance of our proposed model in estimating the ...


Annuity Product Valuation And Risk Measurement Under Correlated Financial And Longevity Risks, Soohong Park 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Longevity risk is a non-diversifiable risk and is regarded as a pressing socio-economic challenge of the century. Its accurate assessment and quantification are therefore critical to enable pension-fund companies to provide sustainable old-age security and maintain a resilient global insurance market. Fluctuations and a decreasing trend in mortality rates, which give rise to longevity risk, as well as the uncertainty in interest-rate dynamics, constitute the two fundamental determinants in pricing and risk management of longevity-dependent products. We also note that historical data reveal some evidence of strong correlation between mortality and interest rates, which must be taken into account when modelling their ...


Examination And Comparison Of The Performance Of Common Non-Parametric And Robust Regression Models, Gregory F. Malek 2017 Stephen F Austin State University

Electronic Theses and Dissertations

Gregory Frank Malek, Stephen F. Austin State University, Masters in Statistics Program, Nacogdoches, Texas, U.S.A. (g_m_2002@live.com)

This work investigated common alternatives to the least-squares regression method in the presence of non-normally distributed errors. An initial literature review identified a variety of alternative methods, including Theil Regression, Wilcoxon Regression, Iteratively Re-Weighted Least Squares, Bounded-Influence Regression, and Bootstrapping methods. These methods were evaluated using a simple simulated example data set, as well as various real data sets, including math proficiency data, Belgian telephone ...


Thermodynamics Of Coherent Structures Near Phase Transitions, Julia M. Meyer, Ivan Christov 2017 Purdue University

The Summer Undergraduate Research Fellowship (SURF) Symposium

Phase transitions within large-scale systems may be modeled by nonlinear stochastic partial differential equations in which system dynamics are captured by appropriate potentials. Coherent structures in these systems evolve randomly through time; thus, statistical behavior of these fields is of greater interest than particular system realizations. The ability to simulate and predict phase transition behavior has many applications, from material behaviors (e.g., crystallographic phase transformations and coherent movement of granular materials) to traffic congestion. Past research focused on deriving solutions to the system probability density function (PDF), which is the ground-state wave function squared. Until recently, the extent to ...


Using Mountain Snowpack To Predict Summer Water Availability In Semiarid Mountain Watersheds, Rebecca Dawn Garst 2017 Boise State University

Boise State University Theses and Dissertations

In the mountainous landscapes of the western United States, water resources are dominated by snowpack. As temperatures rise in spring and summer, the melting snow produces an increase in river flow levels. Reservoirs are used during this increase to retain surplus water, which is released to supplement growing season water supply once the peak flows decrease to below water demands. Once there is no longer a surplus natural flow of water, the water accounting changes (a point referred to as the day of allocation, or DOA), and water previously retained within the reservoir is used to supplement the lower flow levels. The amount of ...


A Characterization Of A Value Added Model And A New Multi-Stage Model For Estimating Teacher Effects Within Small School Systems, Julie M. Garai 2017 University of Nebraska-Lincoln

Dissertations and Theses in Statistics

At both the national and state level there is increasing pressure to develop metrics to determine if school systems are meeting educational objectives. All states mandate some form of assessment by standardized tests. One method currently used to model student test scores is Value Added Modeling (VAM), which models student scores as a product of classroom and school environments. One VAM approach is the Tennessee Value Added Assessment System (TVAAS) which models student gains from year to year. Teacher effects are included in this layered model, which estimates the teacher’s added value to a student score through best linear ...


Application Of Support Vector Machine Modeling And Graph Theory Metrics For Disease Classification, Jessica M. Rudd 2017 Kennesaw State University

Grey Literature from PhD Candidates

Disease classification is a crucial element of biomedical research. Recent studies have demonstrated that machine learning techniques, such as Support Vector Machine (SVM) modeling, produce similar or improved predictive capabilities in comparison to the traditional method of Logistic Regression. In addition, it has been found that social network metrics can provide useful predictive information for disease modeling. In this study, we combine simulated social network metrics with SVM to predict diabetes in a sample of data from the Behavioral Risk Factor Surveillance System. In this dataset, Logistic Regression outperformed SVM with ROC index of 81.8 and 81.7 for ...


Bayesian Model Averaging With Change Points To Assess The Impact Of Vaccination And Public Health Interventions., Esra Kürüm, Joshua L Warren, Cynthia Schuck-Paim, Roger Lustig, Joseph A Lewnard, Rodrigo Fuentes, Christian A W Bruhn, Robert J Taylor, Lone Simonsen, Daniel M Weinberger 2017 George Washington University

Global Health Faculty Publications

Background: Pneumococcal conjugate vaccines (PCVs) prevent invasive pneumococcal disease and pneumonia. However, some low- and middle-income countries have yet to introduce PCV into their immunization programs due, in part, to lack of certainty about the potential impact. Assessing PCV benefits is challenging because specific data on pneumococcal disease are often lacking, and it can be difficult to separate the effects of factors other than the vaccine that could also affect pneumococcal disease rates.

Methods: We assess PCV impact by combining Bayesian model averaging with change-point models to estimate the timing and magnitude of vaccine-associated changes, while controlling for seasonality and other ...


Classification With Large Sparse Datasets: Convergence Analysis And Scalable Algorithms, Xiang Li 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Large and sparse datasets, such as user ratings over a large collection of items, are common in the big data era. Many applications need to classify the users or items based on the high-dimensional and sparse data vectors, e.g., to predict the profitability of a product or the age group of a user. Linear classifiers are popular choices for classifying such datasets because of their efficiency. In order to classify the large sparse data more effectively, the following important questions need to be answered.

1. Sparse data and convergence behavior. How different properties of a dataset, such as ...


Burden Of Atopic Dermatitis In The United States: Analysis Of Healthcare Claims Data In The Commercial, Medicare, And Medi-Cal Databases, Sulena Shrestha, Raymond Miao, Li Wang, Jingdong Chao, Huseyin Yuce, Wenhui Wei 2017 STATinMED Research/SIMR, Inc.

Publications and Research

Comparative data on the burden of atopic dermatitis (AD) in adults relative to the general population are limited. We performed a large-scale evaluation of the burden of disease among US adults with AD relative to matched non-AD controls, encompassing comorbidities, healthcare resource utilization (HCRU), and costs, using healthcare claims data. The impact of AD disease severity on these outcomes was also evaluated.

