Open Access. Powered by Scholars. Published by Universities.®

Multivariate Analysis Commons

Applied Statistics

2018

Articles 1 - 15 of 15

Full-Text Articles in Multivariate Analysis

The Periglacial Landscape Of Mars: Insight Into The 'Decameter-Scale Rimmed Depressions' In Utopia Planitia, Arya Bina Aug 2018

Electronic Thesis and Dissertation Repository

Currently, Mars appears to be in a ‘frozen’ and ‘dry’ state, with the clear majority of the planet’s surface maintaining year-round sub-zero temperatures. However, the discovery of features consistent with landforms found in periglacial environments on Earth suggests a climate history for Mars that may have involved freeze and thaw cycles. Such landforms include hummocky, polygonised, scalloped, and pitted terrains, as well as ice-rich deposits and gullies, along the mid- to high-latitude bands, typically no lower than 20° N/S. The detection of near-surface and surface ice via the Phoenix lander, excavation of ice via recent impact cratering activity as …


Yelp’s Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels Aug 2018

SMU Data Science Review

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that reviews …


Pretrial Release And Failure-To-Appear In McLean County, IL, Jonathan Monsma Jul 2018

Stevenson Center for Community and Economic Development—Student Research

Actuarial risk assessment tools have increasingly been employed in jurisdictions across the U.S. to assist courts in deciding whether someone charged with a crime should be detained or released prior to their trial. These powerful tools should be continually monitored and researched by independent third parties to ensure that they are being administered properly and used in the most effective way, so as to provide socially optimal results. McLean County, Illinois began using the Public Safety Assessment-Court™ (PSA-Court or simply PSA) risk assessment tool in 2016. This study culls data from the McLean County Jail …


Regression Analysis For Ordinal Outcomes In Matched Study Design: Applications To Alzheimer's Disease Studies, Elizabeth Austin Jul 2018

Masters Theses

Alzheimer's Disease (AD) affected nearly 5.4 million Americans as of 2016 and is the most common form of dementia. The disease is characterized by the presence of neurofibrillary tangles and amyloid plaques [1]. The amount of plaques is measured post-mortem by Braak stage. It is known that AD is positively associated with hypercholesterolemia [16]. As statins are the most widely used cholesterol-lowering drugs, there may be associations between statin use and AD. We hypothesize that those who use statins, specifically lipophilic statins, are more likely to have a low Braak stage in post-mortem analysis.

In order to address this hypothesis, …


Analysis Of 2016-17 Major League Soccer Season Data Using Poisson Regression With R, Ian D. Campbell May 2018

Undergraduate Theses and Capstone Projects

To the outside observer, soccer is chaotic with no given pattern or scheme to follow, a random conglomeration of passes and shots that go on for 90 minutes. Yet what if there were a pattern to the chaos, or a way to describe the events that occur in the game quantifiably? Sports statistics is a critical part of baseball and a variety of today’s other sports, but we see very little statistics and data analysis done on soccer. Of the research that exists, there have been looks into the effect of possession time on the outcome of a game, the difference …


Chemical And Statistical Analysis Of Karst Groundwater Basin Signatures - Springfield, MO, Benjamin E. Lockwood May 2018

MSU Graduate Theses

Springfield, MO is located on the Springfield Plateau physiographic province. The Springfield Plateau contains a number of Mississippian-aged units and is mainly capped by the Burlington-Keokuk Formation. The Burlington-Keokuk is a highly fossiliferous limestone with nodular and interbedded chert. Beneath the Burlington-Keokuk lie the Elsey, Reeds Spring, and Pierson Formations, respectively, which comprise the Springfield Plateau aquifer hydrostratigraphic unit. Within the Springfield Plateau aquifer, a well-developed karst system includes springs, sinkholes, and caves. The Springfield Plateau aquifer is the predominant source for springs and seeps in the Springfield area. The purpose of this study was to understand the differences …


Analysis Challenges For High Dimensional Data, Bangxin Zhao Apr 2018

Electronic Thesis and Dissertation Repository

In this thesis, we propose new methodologies targeting the areas of high-dimensional variable screening, influence measures, and post-selection inference. We propose a new estimator for the correlation between the response and high-dimensional predictor variables, and based on this estimator we develop a new screening technique, termed Dynamic Tilted Current Correlation Screening (DTCCS), for high-dimensional variable screening. DTCCS is capable of picking up the relevant predictor variables within a finite number of steps. The DTCCS method takes the popularly used sure independent screening (SIS) method and the high-dimensional ordinary least squares projection (HOLP) approach as its special cases.

Two methods …


Using Random Forests To Describe Equity In Higher Education: A Critical Quantitative Analysis Of Utah’s Postsecondary Pipelines, Tyler Mcdaniel Apr 2018

Butler Journal of Undergraduate Research

The following work examines the Random Forest (RF) algorithm as a tool for predicting student outcomes and interrogating the equity of postsecondary education pipelines. The RF model, created using longitudinal data of 41,303 students from Utah's 2008 high school graduation cohort, is compared to logistic and linear models, which are commonly used to predict college access and success. Substantively, this work finds high school GPA to be the best predictor of postsecondary GPA, whereas commonly used ACT and AP test scores are not nearly as important. Each model identified several demographic disparities in higher education access, most significantly the effects …


On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar Mar 2018

FIU Electronic Theses and Dissertations

Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo …
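As a rough illustration of the three performance criteria named in the abstract above, the following Python sketch computes each one on scalar inputs. It is not the thesis's own code: the function names are hypothetical, the estimator is treated as a single number rather than a coefficient vector, and the Monte Carlo design itself is not reproduced.

```python
def mean_squared_error(estimates, true_value):
    """Average squared distance between simulated estimates and the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def mean_absolute_percentage_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|, expressed as a percentage.

    Assumes every actual value is non-zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def pct_ml_worse(ml_mse, ridge_mse):
    """Percentage of replications in which the ML estimator's MSE
    exceeds the ridge estimator's MSE."""
    worse = sum(1 for m, r in zip(ml_mse, ridge_mse) if m > r)
    return 100.0 * worse / len(ml_mse)
```

Under these assumptions, a ridge estimator would be judged preferable when its MSE and MAPE are lower and when `pct_ml_worse` is high across replications.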


Multivariate Spectral Analysis Of CRISM Data To Characterize The Composition Of Mawrth Vallis, Melissa Luna Mar 2018

Melissa Luna

No abstract provided.


Advances In Semi-Nonparametric Density Estimation And Shrinkage Regression, Hossein Zareamoghaddam Mar 2018

Electronic Thesis and Dissertation Repository

This thesis advocates the use of shrinkage and penalty techniques for estimating the parameters of a regression model that comprises both parametric and nonparametric components and develops semi-nonparametric density estimation methodologies that are applicable in a regression context.

First, a moment-based approach whereby a univariate or bivariate density function is approximated by means of a suitable initial density function that is adjusted by a linear combination of orthogonal polynomials is introduced. Such adjustments are shown to be mathematically equivalent to making use of standard polynomials in one or two variables. Once extended to apply to density estimation, in which case …


Essentials Of Structural Equation Modeling, Mustafa Emre Civelek Mar 2018

Zea E-Books Collection

Structural Equation Modeling is a statistical method increasingly used in scientific studies in the social sciences. It is currently a preferred analysis method, especially in doctoral dissertations and academic research. However, since many universities do not include this method in the curriculum of undergraduate and graduate courses, students and scholars try to solve the problems they encounter by using various books and internet resources.

This book aims to guide the researcher who wants to use this method in a way that is free from mathematical expressions. It teaches the steps of a research program using structural equation modeling …


Building A Better Risk Prevention Model, Steven Hornyak Mar 2018

National Youth Advocacy and Resilience Conference

This presentation chronicles the work of Houston County Schools in developing a risk prevention model built on more than ten years of longitudinal student data. In its second year of implementation, Houston At-Risk Profiles (HARP) has proven effective in identifying those students most in need of support and linking them to interventions and supports that lead to improved outcomes and significantly reduce the risk of failure.


A Preliminary Study Of Smithport Plain Bottle Morphology In The Southern Caddo Area, Robert Z. Selden Jr. Jan 2018

CRHR: Archaeology

This study expands upon a previous analysis of the Clarence H. Webb collection, which resulted in the identification of two discrete shapes used in the manufacture of the base and body of Smithport Plain bottles. The sample includes the Smithport Plain bottles from the Webb collection, and four new bottles: two previously repatriated specimens in the Pohler Collection, and two from the Mitchell site (41BW4) to test whether those specimens align morphologically with the Belcher Mound or Smithport Landing specimens. Results indicate significant allometry and a significant difference in Smithport Plain body and base shapes for bottles produced at the …


Psychometric Properties Of A Working Memory Span Task, Juan M. Alzate Vanegas Jan 2018

Honors Undergraduate Theses

The intent of this thesis is to examine the psychometric properties of a complex span task (CST) developed to measure working memory capacity (WMC) using measurements obtained from a sample of 68 undergraduate students at the University of Central Florida. The Grocery List Task (GLT) promises several design improvements over traditional CSTs in a prior study about individual differences in WMC and distraction effects on driving performance, and it offers potential benefits for studying WMC as well as the serial-position effect. Currently, the working memory system is composed of domain-general memorial storage processes and information-processing, which involves the use of …