Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons


Articles 1 - 9 of 9

Full-Text Articles in Statistical Models

Yelp’s Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels Aug 2018


SMU Data Science Review

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that reviews …
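The idea of reading logistic regression coefficients as relative feature importance can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the feature names (`review_length`, `friend_count`, `noise`), the synthetic data, and the plain gradient-ascent fit are hypothetical and are not the authors' features, data, or fitting procedure. Features are standardized first so that coefficient magnitudes are comparable.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def standardize(rows):
    # Rescale each column to mean 0 and standard deviation 1 so that
    # coefficient magnitudes can be compared across features.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def fit_logistic(X, y, lr=0.5, epochs=500):
    # Batch gradient ascent on the mean log-likelihood.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = yi - sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b += lr * grad_b / n
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
    return w, b

# Hypothetical synthetic data: names and true effects are invented here.
rng = random.Random(0)
feature_names = ["review_length", "friend_count", "noise"]
true_w = [2.0, -1.0, 0.0]
X = standardize([[rng.gauss(0, 1) for _ in feature_names] for _ in range(600)])
y = [1 if rng.random() < sigmoid(sum(t * v for t, v in zip(true_w, row))) else 0
     for row in X]

w, b = fit_logistic(X, y)

# Rank features by absolute coefficient; exp(coef) is the odds ratio
# associated with a one-standard-deviation increase in the feature.
ranking = sorted(zip(feature_names, w), key=lambda item: abs(item[1]), reverse=True)
for name, coef in ranking:
    print(f"{name}: coefficient {coef:+.2f}, odds ratio {math.exp(coef):.2f}")

predictions = [1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) >= 0.5 else 0
               for row in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

Ranking by absolute coefficient is only meaningful when features share a scale, which is why the sketch standardizes before fitting.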


Analysis Of 2016-17 Major League Soccer Season Data Using Poisson Regression With R, Ian D. Campbell May 2018


Undergraduate Theses and Capstone Projects

To the outside observer, soccer is chaotic, with no given pattern or scheme to follow: a random conglomeration of passes and shots that goes on for 90 minutes. Yet what if there were a pattern to the chaos, or a way to describe the events that occur in the game quantitatively? Sports statistics is a critical part of baseball and a variety of today’s other sports, but we see very little statistical and data analysis done on soccer. The research that does exist has looked into the effect of possession time on the outcome of a game, the difference …


Longitudinal Tracking Of Physiological State With Electromyographic Signals, Robert Warren Stallard May 2018


Electronic Theses and Dissertations

Electrophysiological measurements have been used in recent history to classify instantaneous physiological configurations, e.g., hand gestures. This work investigates the feasibility of working with changes in physiological configurations over time (i.e., longitudinally) using a variety of algorithms from the machine learning domain. We demonstrate a high degree of classification accuracy for a binary classification problem derived from electromyography measurements before and after a 35-day bedrest. The problem difficulty is increased with a more dynamic experiment testing for changes in astronaut sensorimotor performance by taking electromyography and force plate measurements before, during, and after a jump from a small platform. A …


On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar Mar 2018


FIU Electronic Theses and Dissertations

Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo …
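A minimal sketch of how a ridge penalty behaves in Poisson regression with correlated predictors follows. It assumes a generic penalized-likelihood formulation (the Poisson log-likelihood minus (k/2)·‖β‖²), two predictors, no intercept, and synthetic data; the specific estimators, the choice of the ridge parameter `k`, and the simulation design compared in the thesis are not given in the abstract, so none of the values below should be read as the author's.

```python
import math
import random

def sample_poisson(rng, mu):
    # Knuth's multiplication method; adequate for the small means used here.
    limit, count, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def fit_poisson_ridge(X, y, k=0.0, iters=25):
    # Newton's method for two coefficients, maximizing
    #   sum_i [y_i * eta_i - exp(eta_i)] - (k / 2) * (b1^2 + b2^2),
    # where eta_i = b1 * x1_i + b2 * x2_i. Setting k = 0 gives the ML fit.
    b1 = b2 = 0.0
    for _ in range(iters):
        g1, g2 = -k * b1, -k * b2          # gradient of the penalty
        h11, h12, h22 = k, 0.0, k          # negative Hessian (X'WX + kI)
        for (x1, x2), yi in zip(X, y):
            mu = math.exp(b1 * x1 + b2 * x2)
            g1 += x1 * (yi - mu)
            g2 += x2 * (yi - mu)
            h11 += mu * x1 * x1
            h12 += mu * x1 * x2
            h22 += mu * x2 * x2
        det = h11 * h22 - h12 * h12
        b1 += (h22 * g1 - h12 * g2) / det  # solve the 2x2 Newton system
        b2 += (h11 * g2 - h12 * g1) / det
    return b1, b2

# Synthetic data with two highly correlated predictors (all values assumed).
rng = random.Random(1)
rho, true_beta = 0.95, (0.5, 0.5)
X = []
for _ in range(200):
    x1 = rng.gauss(0, 1)
    x2 = rho * x1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    X.append((x1, x2))
y = [sample_poisson(rng, math.exp(true_beta[0] * x1 + true_beta[1] * x2))
     for x1, x2 in X]

beta_ml = fit_poisson_ridge(X, y, k=0.0)
beta_ridge = fit_poisson_ridge(X, y, k=5.0)

def squared_error(beta):
    # Squared distance from the true coefficients for this one sample.
    return sum((b - t) ** 2 for b, t in zip(beta, true_beta))

print("ML estimate:   ", beta_ml, "squared error:", squared_error(beta_ml))
print("ridge estimate:", beta_ridge, "squared error:", squared_error(beta_ridge))
```

A single draw like this cannot show which estimator has the lower MSE; that comparison requires averaging the squared error over many simulated data sets, as in a Monte Carlo study.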


Essentials Of Structural Equation Modeling, Mustafa Emre Civelek Mar 2018


Zea E-Books Collection

Structural Equation Modeling is a statistical method increasingly used in scientific studies in the social sciences. It is currently a preferred analysis method, especially in doctoral dissertations and academic research. However, since many universities do not include this method in their undergraduate and graduate curricula, students and scholars try to solve the problems they encounter by using various books and internet resources.

This book aims to guide researchers who want to use this method without relying on mathematical expressions. It teaches the steps of a research program using structural equation modeling …


Building A Better Risk Prevention Model, Steven Hornyak Mar 2018


National Youth Advocacy and Resilience Conference

This presentation chronicles the work of Houston County Schools in developing a risk prevention model built on more than ten years of longitudinal student data. In its second year of implementation, Houston At-Risk Profiles (HARP) has proven effective in identifying the students most in need of support and linking them to interventions and supports that lead to improved outcomes and significantly reduce the risk of failure.


Modelling The Common Risk Among Equities Using A New Time Series Model, Jingjia Chu Feb 2018


Electronic Thesis and Dissertation Repository

A new additive structure for the multivariate GARCH model is proposed, in which the dynamic changes in the conditional correlation between stocks are aggregated by a common risk term. The observable sequence is divided into two parts, a common risk term and an individual risk term, both following a GARCH-type structure. The conditional variance of each stock is the sum of these two conditional variance terms. The conditional volatilities of all the stocks can shoot up together, because a sudden peak in the common volatility signals a shock to the whole system.

We provide sufficient conditions for strict stationarity …
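The additive structure described in this abstract can be written out as a short sketch. The notation is assumed here rather than taken from the thesis: $x_{i,t}$ is the observed series for stock $i$, $\varepsilon_{0,t}$ the common risk term, $\varepsilon_{i,t}$ the individual risk term, and the terms are assumed to be conditionally uncorrelated with zero conditional mean. The GARCH(1,1) recursion in the last line is only one example of a "GARCH type structure"; the exact recursion used by the author is not given in the abstract.

```latex
\begin{aligned}
x_{i,t} &= \varepsilon_{0,t} + \varepsilon_{i,t}, \qquad i = 1, \dots, m, \\
\operatorname{Var}(\varepsilon_{0,t} \mid \mathcal{F}_{t-1}) &= h_{0,t}, \qquad
\operatorname{Var}(\varepsilon_{i,t} \mid \mathcal{F}_{t-1}) = h_{i,t}, \\
\operatorname{Var}(x_{i,t} \mid \mathcal{F}_{t-1}) &= h_{0,t} + h_{i,t}, \\
\operatorname{Cov}(x_{i,t}, x_{j,t} \mid \mathcal{F}_{t-1}) &= h_{0,t}, \qquad i \neq j, \\
h_{k,t} &= \omega_k + \alpha_k\, \varepsilon_{k,t-1}^{2} + \beta_k\, h_{k,t-1}, \qquad k = 0, 1, \dots, m.
\end{aligned}
```

Under these assumptions the conditional correlation between stocks $i$ and $j$ is $h_{0,t} / \sqrt{(h_{0,t} + h_{i,t})(h_{0,t} + h_{j,t})}$, so a spike in the common variance $h_{0,t}$ raises every stock's variance and pushes all pairwise correlations toward one at the same time.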


Effect Of Socioeconomic And Demographic Factors On Kentucky Crashes, Aaron Berry Cambron Jan 2018


Theses and Dissertations--Civil Engineering

The goal of this research was to examine how well socioeconomic and demographic data on drivers can predict crash occurrence in Kentucky. Identifying unique background characteristics of at-fault drivers that contribute to crash rates and crash severity may lead to improved and more specific interventions to reduce the negative impacts of motor vehicle crashes. The driver-residence ZIP code was used as a spatial unit to connect five years of Kentucky crash data with socioeconomic factors from the U.S. Census, such as income, employment, education, age, and others, along with terrain and vehicle age. At-fault driver crash counts, normalized over …


Implicit Copulas From Bayesian Regularized Regression Smoothers, Nadja Klein, Michael S. Smith Dec 2017


Michael Stanley Smith

We show how to extract the implicit copula of a response vector from a Bayesian regularized regression smoother with Gaussian disturbances. The copula can be used to compare smoothers that employ different shrinkage priors and function bases. We illustrate with three popular choices of shrinkage priors (a pairwise prior, the horseshoe prior, and a g prior augmented with a point mass as employed for Bayesian variable selection) and both univariate and multivariate function bases. The implicit copulas are high-dimensional and unavailable in closed form. However, we show how to evaluate them by first constructing a Gaussian copula conditional on the regularization parameters, …