Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

973 Full-Text Articles 1,476 Authors 272,825 Downloads 103 Institutions

All Articles in Statistical Models

973 full-text articles. Page 1 of 30.

Snakebite Dynamics Of Colombia: Effects Of Precipitation Seasonality Of Incidence, Carlos Cruz 2018 Illinois State University

Annual Symposium on Biomathematics and Ecology: Education and Research

No abstract provided.


Identifying Treatment Effects In The Presence Of Confounded Types, Desire Kedagni 2018 Iowa State University

Economics Working Papers

In this paper, I consider identification of treatment effects when
the treatment is endogenous. The use of instrumental variables is a popular
solution to deal with endogeneity, but this may give misleading answers when
the instrument is invalid. I show that when the instrument is invalid due to
correlation with the first-stage unobserved heterogeneity, a second (also
possibly invalid) instrument makes it possible to partially identify not only the local
average treatment effect but also the entire potential outcomes distributions
for compliers. I exploit the fact that the distribution of the observed
outcome in each group defined by the treatment and ...


Overcoming Small Data Limitations In Heart Disease Prediction By Using Surrogate Data, Alfeo Sabay, Laurie Harris, Vivek Bejugama, Karen Jaceldo-Siegl 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a heart disease prediction use case showing how synthetic data can be used to address privacy concerns and overcome constraints inherent in small medical research data sets. While advanced machine learning algorithms, such as neural network models, can be implemented to improve prediction accuracy, these require very large data sets, which are often not available in medical or clinical research. We examine the use of surrogate data sets composed of synthetic observations for modeling heart disease prediction. We generate surrogate data, based on the characteristics of original observations, and compare prediction accuracy results achieved from ...


Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all subsets regression was used to determine the best predictors of financial burden. Age ...


Yelp’s Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that ...


Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, Jethin Abraham, Daniel Higdon, John Nelson, Juan Ibarra 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a method for predicting changes in Bitcoin and Ethereum prices utilizing Twitter data and Google Trends data. Bitcoin and Ethereum, the two largest cryptocurrencies in terms of market capitalization, represent over $160 billion in combined value. However, both Bitcoin and Ethereum have experienced significant price swings on both daily and long-term valuations. Twitter is increasingly used as a news source influencing purchase decisions by informing users of the currency and its increasing popularity. As a result, quickly understanding the impact of tweets on price direction can provide a purchasing and selling advantage to ...


Coastal Wetland Dynamics Under Sea-Level Rise And Wetland Restoration In The Northern Gulf Of Mexico Using Bayesian Multilevel Models And A Web Tool, Tyler Hardy 2018 The University of Southern Mississippi

Master's Theses

There is currently no modeling framework that predicts how relative sea-level rise (SLR), combined with restoration activities, affects landscapes of coastal wetlands, with uncertainties accounted for, across the entire northern Gulf of Mexico (NGOM). I developed such a modeling framework – Bayesian multi-level models to study the spatial pattern of wetland loss in the NGOM, driven by relative SLR, vegetation productivity, tidal range, coastal slope, and wave height – all interacting with river-borne sediment availability, indicated by hydrological regimes. These interactions have not been comprehensively investigated before. I further modified this model to assess the efficacy of restoration projects from ...


Testing Hypotheses Of Covariance Structure In Multivariate Data, Miguel Fonseca, Arkadiusz Koziol, Roman Zmyslony 2018 NOVA University of Lisbon

Electronic Journal of Linear Algebra

This paper gives a new approach for testing hypotheses on the structure of covariance matrices in double multivariate data. It is proved that the ratio of the positive and negative parts of best unbiased estimators (BUE) provides an F-test for independence of block variables in double multivariate models.


Application Of Jordan Algebra For Testing Hypotheses About Structure Of Mean Vector In Model With Block Compound Symmetric Covariance Structure, Roman Zmyślony, Ivan Zezula, Arkadiusz Kozioł 2018 Faculty of Mathematics, Computer Science and Econometrics, University of Zielona Góra

Electronic Journal of Linear Algebra

In this article, the authors derive a test for the structure of the mean vector in a model with block compound symmetric covariance structure for two-level multivariate observations. One possible structure is the so-called structured mean vector, whose components remain constant over sites or over time points, so that the mean vector is of the form $\boldsymbol{1}_{u}\otimes\boldsymbol{\mu}$ with $\boldsymbol{\mu}=(\mu_1,\mu_2,\ldots,\mu_m)'\in\mathbb{R}^m$. This hypothesis is tested against the alternative of an unstructured mean vector, which can change over sites or over time points.
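
As a small illustration of the structured form $\boldsymbol{1}_{u}\otimes\boldsymbol{\mu}$ described above, the following Python sketch stacks the same $m$-component vector $u$ times. The helper name `structured_mean` and the numeric values are hypothetical, chosen only for illustration, and are not taken from the article.

```python
def structured_mean(u, mu):
    """Return the Kronecker product 1_u (x) mu as a flat list.

    Because 1_u is a column of u ones, the product is simply
    the vector mu repeated u times (length u * m).
    """
    return [value for _ in range(u) for value in mu]


# Hypothetical example: u = 3 sites (or time points), m = 2 components.
mean_vector = structured_mean(3, [1.5, -0.5])
print(mean_vector)
```

Under the unstructured alternative, each of the $u$ blocks would be allowed its own $m$-vector instead of sharing one.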


Applications Of Game Theory, Tableau, Analytics, And R To Fashion Design, Aisha Asiri 2018 Atlanta University Center

Electronic Theses & Dissertations Collection for Atlanta University & Clark Atlanta University

This thesis presents various models for the fashion industry to predict the profits for some products. To determine the expected performance of each product in 2016, we used tools of game theory to help us identify the expected value. We went further and performed a simple linear regression and used scatter plots to help us further predict the performance of the products of Prada. We used tools of game theory, analytics, and statistics to help us predict the performance of some of Prada's products. We also used the Tableau platform to visualize an overview of the products' performances. All ...


Deep Machine Learning For Mechanical Performance And Failure Prediction, Elijah Reber, Nickolas D. Winovich, Guang Lin 2018 Penn State University

The Summer Undergraduate Research Fellowship (SURF) Symposium

Deep learning has provided opportunities for advancement in many fields. One such opportunity is being able to accurately predict real world events. Ensuring proper motor function and being able to predict energy output is a valuable asset for owners of wind turbines. In this paper, we look at how effective a deep neural network is at predicting the failure or energy output of a wind turbine. A data set was obtained that contained sensor data from 17 wind turbines over 13 months, measuring numerous variables, such as spindle speed and blade position and whether or not the wind turbine experienced ...


EFVS Effects On Pilot Performance, Michael Campbell, Nsikak Udo-Imeh, Steven J. Landry 2018 Purdue University

The Summer Undergraduate Research Fellowship (SURF) Symposium

Flight tests have been conducted at Purdue University using a computer-based flying simulator in an attempt to determine and measure the effects of Enhanced Flight Vision Systems (EFVS) on the performance of pilots during landing. Knowledge of these effects could help guide future design and implementation of EFVS in modern commercial aircraft, and further increase pilots’ ability to control the aircraft in low-visibility conditions. The problem that has faced researchers in the past has revolved around the difficulty in interpreting the data which is generated by these tests. The difficulty in making a generalized conclusion based on the large amount ...


Distribution Of A Sum Of Random Variables When The Sample Size Is A Poisson Distribution, Mark Pfister 2018 East Tennessee State University

Electronic Theses and Dissertations

A probability distribution is a statistical function that describes the probability of possible outcomes in an experiment or occurrence. There are many different probability distributions that give the probability of an event happening, given some sample size n. An important question in statistics is to determine the distribution of the sum of independent random variables when the sample size n is fixed. For example, it is known that the sum of n independent Bernoulli random variables with success probability p has a Binomial distribution with parameters n and p. However, this is not true when the sample size is not ...
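
To make the contrast with a fixed sample size concrete, here is a minimal simulation sketch in Python. It assumes the sample size N is Poisson with mean `lam` and sums N independent Bernoulli(p) draws. By the standard Poisson thinning result, this random sum is Poisson with mean `lam * p`, so its sample mean and variance should both be close to that value. The function names, the parameter values, and the choice of the Bernoulli case are assumptions for illustration; the truncated abstract does not say which cases the thesis treats.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def random_sum_of_bernoullis(lam, p, rng):
    """Sum N Bernoulli(p) draws, where N ~ Poisson(lam) is itself random."""
    n = sample_poisson(lam, rng)
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate(lam, p, trials, seed=0):
    """Return the sample mean and sample variance of the random sum."""
    rng = random.Random(seed)
    draws = [random_sum_of_bernoullis(lam, p, rng) for _ in range(trials)]
    mean = sum(draws) / trials
    variance = sum((d - mean) ** 2 for d in draws) / (trials - 1)
    return mean, variance


# Hypothetical parameters: lam * p = 1.2, so both values should be near 1.2.
mean, variance = simulate(lam=4.0, p=0.3, trials=20000)
print(mean, variance)
```

A Binomial(n, p) sum with fixed n would instead have variance n * p * (1 - p), strictly smaller than its mean, which is one simple way to see that the two distributions differ.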


Wald Confidence Intervals For A Single Poisson Parameter And Binomial Misclassification Parameter When The Data Is Subject To Misclassification, Nishantha Janith Chandrasena Poddiwala Hewage 2018 Stephen F Austin State University

Electronic Theses and Dissertations

This thesis is based on a Poisson model that uses both error-free data and error-prone data subject to misclassification in the form of false-negative and false-positive counts. We present maximum likelihood estimators (MLEs), Fisher's Information, and Wald statistics for the Poisson rate parameter and the two misclassification parameters. Next, we invert the Wald statistics to get asymptotic confidence intervals for the Poisson rate parameter and the false-negative rate parameter. The coverage and width properties for various sample size and parameter configurations are studied via a simulation study. Finally, we apply the MLEs and confidence intervals to one real data set and another ...
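
For orientation, the sketch below shows how inverting a Wald statistic yields a confidence interval in the simplest textbook setting: a single Poisson rate estimated from error-free counts only. It is not the thesis's misclassification model, which has additional parameters and a different information matrix. The function name, arguments, and example counts are hypothetical.

```python
import math
from statistics import NormalDist


def wald_interval_poisson(total_count, n_units, level=0.95):
    """Wald interval for a Poisson rate from error-free counts.

    The MLE is total_count / n_units, and the Fisher information
    n_units / rate gives the standard error sqrt(rate_hat / n_units).
    """
    rate_hat = total_count / n_units
    std_error = math.sqrt(rate_hat / n_units)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    return rate_hat - z * std_error, rate_hat + z * std_error


# Hypothetical data: 50 events observed over 100 units.
lower, upper = wald_interval_poisson(total_count=50, n_units=100)
print(lower, upper)
```

Because the interval is symmetric around the MLE, the lower limit can fall below zero for small counts, which is one reason coverage and width are usually checked by simulation.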


Bayesian Sparse Propensity Score Estimation For Unit Nonresponse, Hejian Sang, Gyuhyeong Goh, Jae Kwang Kim 2018 Iowa State University

Statistics Preprints

Nonresponse weighting adjustment using propensity score is a popular method for handling unit nonresponse. However, including all available auxiliary variables into the propensity model can lead to inefficient and inconsistent estimation, especially with high-dimensional covariates. In this paper, a new Bayesian method using the Spike-and-Slab prior is proposed for sparse propensity score estimation. The proposed method is not based on any model assumption on the outcome variable and is computationally efficient. Instead of doing model selection and parameter estimation separately as in many frequentist methods, the proposed method simultaneously selects the sparse response probability model and provides consistent parameter ...


Fuel Flow Reduction Impact Analysis Of Drag Reducing Film Applied To Aircraft Wings, Damon Resnick, Chris Donlan, Nimish Sakalle, Cody Pinkerman 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present an analysis of flight data in order to determine whether the application of the Edge Aerodynamix Conformal Vortex Generator (CVG), applied to the wings of aircraft, reduces fuel flow during cruising conditions of flight. The CVG is a special treatment and film applied to the wings of an aircraft to protect the wings and reduce the non-laminar flow of air around the wings during flight. It is thought that, by reducing the non-laminar flow or vortices around and directly behind the wings, an aircraft will move more smoothly through the air and provide a ...


Predictions Generated From A Simulation Engine For Gene Expression Micro-Arrays For Use In Research Laboratories, Gopinath R. Mavankal, John Blevins, Dominique Edwards, Monnie McGee, Andrew Hardin 2018 Southern Methodist University

SMU Data Science Review

In this paper, we introduce the technical components, the biology, and the data science involved in the use of microarray technology in biological and clinical research. We discuss how the laborious experimental protocols used in laboratories to obtain this data could benefit from using simulations of the data. We discuss the approach used in the simulation engine from [7]. We use this simulation engine to generate a prediction tool in Power BI, a Microsoft business intelligence tool for analytics and data visualization [22]. This tool could be used in any laboratory using micro-arrays to improve experimental design by comparing how predicted ...


Predicting Game Day Outcomes In National Football League Games, Josh Klein, Anna Frowein, Chris Irwin 2018 Southern Methodist University

SMU Data Science Review

In this paper, we present a model for predicting the game day outcomes of National Football League games. Three of the most popular sources for game day predictions are analyzed for comparison. Player data and outcomes from previous games are used, but we also incorporate several weather factors into our models. Over 1,700 games were incorporated, and three separate models were created using simple regression, principal component analysis, and a recursive model. We also discuss the ethicality of using data science techniques by individuals with the knowledge in order to gain an advantage over a population lacking this specialized ...


Mathematical Models, Patty Wagner, Marnie Phipps 2018 University of North Georgia

Mathematics Grants Collections

This Grants Collection for Mathematical Models was created under a Round Nine ALG Textbook Transformation Grant.

Affordable Learning Georgia Grants Collections are intended to provide faculty with the frameworks to quickly implement or revise the same materials as a Textbook Transformation Grants team, along with the aims and lessons learned from project teams during the implementation process.

Documents are in .pdf format, with a separate .docx (Word) version available for download. Each collection contains the following materials:

  • Linked Syllabus
  • Initial Proposal
  • Final Report


Association Tests For Genetic Effect And Its Interaction With Environmental Factors, Zhengyang Zhou 2018 Southern Methodist University

Statistical Science Theses and Dissertations

My research is in the area of statistical genetics, and it contains three projects: (1) Differentiating the Cochran-Armitage (CA) trend test and Pearson’s chi-square test: location and dispersion; (2) Decomposing Pearson’s chi-square test: a linear regression and its departure from linearity; (3) Testing nonlinear gene-environment (GxE) interaction through varying coefficient and linear mixed models.

(1) In genetic case-control association studies, a standard practice is to perform the CA trend test with 1 degree-of-freedom (df) under the assumption of an additive model. However, when the true genetic model is recessive or near recessive, it is outperformed by Pearson’s ...
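
The two statistics compared in project (1) can be sketched in Python as follows. This is a minimal illustration using the standard closed forms for a 2 x k case-control table, with additive scores (0, 1, 2) assumed for the trend test. The function names and the genotype counts are hypothetical and do not come from the dissertation.

```python
def pearson_chi_square(cases, controls):
    """Pearson chi-square statistic for a 2 x k table (k - 1 df)."""
    total_cases, total_controls = sum(cases), sum(controls)
    total = total_cases + total_controls
    stat = 0.0
    for r, s in zip(cases, controls):
        column = r + s
        for observed, row_total in ((r, total_cases), (s, total_controls)):
            expected = row_total * column / total
            stat += (observed - expected) ** 2 / expected
    return stat


def cochran_armitage_chi_square(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend statistic (1 df) for the given column scores."""
    columns = [r + s for r, s in zip(cases, controls)]
    total_cases, total = sum(cases), sum(columns)
    sum_rt = sum(r * t for r, t in zip(cases, scores))
    sum_nt = sum(n * t for n, t in zip(columns, scores))
    sum_nt2 = sum(n * t * t for n, t in zip(columns, scores))
    numerator = total * (total * sum_rt - total_cases * sum_nt) ** 2
    denominator = (
        total_cases * (total - total_cases) * (total * sum_nt2 - sum_nt ** 2)
    )
    return numerator / denominator


# Hypothetical genotype counts for 0, 1, and 2 copies of the risk allele.
cases = [30, 50, 20]
controls = [50, 40, 10]
print(pearson_chi_square(cases, controls))
print(cochran_armitage_chi_square(cases, controls))
```

With these formulas the trend statistic never exceeds the Pearson statistic on the same table; the difference is the part of the association that a linear trend in the scores does not capture.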

