Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

2,938 Full-Text Articles 3,858 Authors 1,062,673 Downloads 127 Institutions

All Articles in Applied Statistics

The Psychology Of Baseball: How The Mental Game Impacts The Physical Game, Kiera Dalmass 2018 University of Connecticut

Honors Scholar Theses

The purpose of this study was to determine whether sports psychology can be effective. Baseball was chosen as the sport for the study because nearly every factor of the game can be analyzed, with the exception of the mental readiness or state of the player when he steps onto the field. It therefore offers an ideal setting for providing clinical and statistical support to the field of sports psychology. Despite the numerous pieces of literature that praise and support sports psychology, there hasn’t been clinical research to support it. Additionally, multiple ...


Cognitive Virtual Admissions Counselor, Kumar Raja Guvindan Raju, Cory Adams, Raghuram Srinivas 2018 Southern Methodist University

SMU Data Science Review

Abstract. In this paper, we present a cognitive virtual admissions counselor for the Master of Science in Data Science program at Southern Methodist University. The virtual admissions counselor is a system capable of providing potential students accurate information at the time that they want to know it. After the evaluation of multiple technologies, Amazon’s LEX was selected to serve as the core technology for the virtual counselor chatbot. Student surveys were leveraged to collect and generate training data to deploy the natural language capability. The cognitive virtual admissions counselor platform is currently capable of providing an end-to-end conversational dialog ...


Discretizing Continuous Attributes Of A Bayesian Network With A Birth And Death Process Based On Minimum Description Length, Nicole Woytarowicz 2018 University of Colorado, Boulder

Applied Mathematics Graduate Theses & Dissertations

A Bayesian network is a graphical model that can be used to represent potentially complex relationships between a large number of random variables. Given data consisting of realizations of the nodes (variables) of a network with unknown structure, much work has been done in recent years to recover the directed edges that describe the joint behavior of the nodes. The underlying assumption for such recovery algorithms is that the data are either from discrete or Gaussian distributions. In the event that neither holds, structure recovery algorithms are usually run after a discretization of the data. Unfortunately, if the discretization is ...


Statistical Applications In Healthcare Systems, Maryam Mojalal 2018 The University of Western Ontario

Electronic Thesis and Dissertation Repository

This thesis consists of three contributing manuscripts related to waiting times with possible applications in health care. The first manuscript is inspired by a practical problem related to decision making in an emergency department (ED). As short-run predictions of ED censuses are particularly important for efficient allocation and management of ED resources, we model ED changes and present estimates of short-term (hourly) ED censuses at each time point. We present a Markov-chain-based algorithm to make census predictions in the near future.

Considering the variation in arrival pattern and service requirements, we apply and compare three models which best describe ...


Plots, Puppies & Deadly Disease, Lizzy Younce 2018 Carroll College

Carroll College Student Undergraduate Research Festival

The Serengeti Health Initiative began in 2003 as a collaboration of the Lincoln Park Zoo and various universities around the world in which a team of veterinarians and researchers have been running a campaign to eliminate rabies from the Serengeti Region of Tanzania. To track the impact of the program, survey data has been collected in sixteen villages over thirteen years of the campaign. In this talk I will explain the dynamics of dog populations within the Serengeti from the unusual perspective of evolving shape versus traditional differential equation-based models. Fluctuations in the survey populations and specific villages surveyed over ...


Using Random Forests To Describe Equity In Higher Education: A Critical Quantitative Analysis Of Utah’s Postsecondary Pipelines, Tyler McDaniel 2018 University of Utah

Butler Journal of Undergraduate Research

The following work examines the Random Forest (RF) algorithm as a tool for predicting student outcomes and interrogating the equity of postsecondary education pipelines. The RF model, created using longitudinal data of 41,303 students from Utah's 2008 high school graduation cohort, is compared to logistic and linear models, which are commonly used to predict college access and success. Substantially, this work finds High School GPA to be the best predictor of postsecondary GPA, whereas commonly used ACT and AP test scores are not nearly as important. Each model identified several demographic disparities in higher education access, most significantly ...


Evaluating The Efficacy Of Conditional Analysis Of Variance Under Heterogeneity And Non-Normality, Yan Wang, Thanh Pham, Diep Nguyen, Eun Sook Kim, Yi-Hsin Chen, Jeffrey Kromrey, Zhiyao Yi, Yue Yin 2018 University of Massachusetts

Journal of Modern Applied Statistical Methods

A simulation study was conducted to examine the efficacy of conditional analysis of variance (ANOVA) methods where the initial homogeneity of variance screening leads to the choice between the ANOVA F test and robust ANOVA methods. Type I error control and statistical power were investigated under various conditions.


Visualizing Statistical Data On United States Agriculture, Xingyao Xiao 2018 University of Minnesota, Morris

Undergraduate Research Symposium 2018

Food is essential to life. The United States Department of Agriculture (USDA) plays an indispensable role in ensuring people have access to healthy food. However, the data on the USDA website is not easy to access. Users must download many files to get data about animals, crops, the weather, and so on. To make the data more accessible to the general public, I built a web-based application to help make the data easy to access, explore, and compare. My application integrates multiple datasets from the USDA website to provide graphic visualizations that enable users to get the exact, specific data intervals ...


The Validity Of Online Patient Ratings Of Physicians, Jennifer L. Priestley, Yiyun Zhou, Robert McGrath 2018 Kennesaw State University

Faculty Publications

Background: Information from ratings sites is increasingly informing patient decisions related to health care and the selection of physicians.

Objective: The current study sought to determine the validity of online patient ratings of physicians through comparison with physician peer review.

Methods: We extracted 223,715 reviews of 41,104 physicians from 10 of the largest cities in the United States, including 1142 physicians listed as “America’s Top Doctors” through physician peer review. Differences in mean online patient ratings were tested for physicians who were listed and those who were not.

Results: Overall, no differences were found between the online ...


The Devil You Don’t Know: A Spatial Analysis Of Crime At Newark’s Prudential Center On Hockey Game Days, Justin Kurland, Eric Piza 2018 Institute for Security and Crime Science - University of Waikato

Journal of Sport Safety and Security

Inspired by empirical research on spatial crime patterns in and around sports venues in the United Kingdom, this paper sought to measure the criminogenic extent of 216 hockey games that took place at the Prudential Center in Newark, NJ between 2007 and 2016. Do games generate patterns of crime in the areas beyond the arena, and if so, for what type of crime and how far? Police-recorded data for Newark are examined using a variety of exploratory methods and non-parametric permutation tests to visualize differences in crime patterns between game and non-game days across all of Newark and the downtown area. Change ...
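
As a rough illustration of the kind of non-parametric permutation test the abstract mentions, the sketch below compares mean daily counts between two groups of days. It is not the authors' procedure: the function name `permutation_test` is hypothetical, and the counts are invented toy numbers, not Newark data.

```python
import random

def permutation_test(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the day labels: under the null, game and non-game days are exchangeable.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)

# Invented toy counts of incidents per day (assumption, for illustration only).
game_days = [14, 11, 16, 13, 15, 12, 17, 14]
non_game_days = [9, 10, 8, 11, 7, 10, 9, 8, 10, 9]

observed, p_value = permutation_test(game_days, non_game_days)
print(f"difference in means: {observed:.2f}, permutation p-value: {p_value:.4f}")
```

Shuffling the pooled counts stands in for reassigning game-day labels at random, so the test makes no distributional assumption beyond exchangeability under the null.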


A Comparison Of Unsupervised Methods For DNA Microarray Leukemia Data, Denise Harness 2018 East Tennessee State University

Appalachian Student Research Forum

Advancements in DNA microarray data sequencing have created the need for sophisticated machine learning algorithms and feature selection methods. Probabilistic graphical models, in particular, have been used to identify whether microarrays or genes cluster together in groups of individuals having a similar diagnosis. These clusters of genes are informative, but can be misleading when every gene is used in the calculation. First, feature reduction techniques are explored; however, the size and nature of the data prevent traditional techniques from working efficiently. Our method is to use the partial correlations between the features to create a precision matrix and predict which ...


Network Structure Sampling In Bayesian Networks Via Perfect Sampling From Linear Extensions, Evan Sidrow 2018 University of Colorado, Boulder

Applied Mathematics Graduate Theses & Dissertations

Bayesian networks are widely considered as powerful tools for modeling risk assessment, uncertainty, and decision making. They have been extensively employed to develop decision support systems in a variety of domains including medical diagnosis, risk assessment and management, human cognition, industrial process and procurement, pavement and bridge management, and system reliability. Bayesian networks are convenient graphical expressions for high dimensional probability distributions which are used to represent complex relationships between a large number of random variables. A Bayesian network is a directed acyclic graph consisting of nodes which represent random variables and arrows which correspond to probabilistic dependencies between them ...


Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia 2018 Southern Methodist University

Statistical Science Theses and Dissertations

This research contains two topics: (1) PBNPA: a permutation-based non-parametric analysis of CRISPR screen data; (2) RCRnorm: an integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data from FFPE samples.

Clustered regularly-interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single specific algorithm has gained popularity. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms. We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level ...


Score Test And Likelihood Ratio Test For Zero-Inflated Binomial Distribution And Geometric Distribution, Xiaogang Dai 2018 Western Kentucky University

Masters Theses & Specialist Projects

The main purpose of this thesis is to compare the performance of the score test and the likelihood ratio test by computing type I errors and type II errors when the tests are applied to the geometric distribution and the zero-inflated binomial distribution. We first derive test statistics of the score test and the likelihood ratio test for both distributions. We then use the software package R to perform a simulation to study the behavior of the two tests. We write R code to calculate the two types of error for each distribution. We generate many samples to approximate ...
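
The simulation idea described above (generate many samples under the null and count rejections) can be sketched as follows. This is a simplified stand-in and not the thesis's R code: it assumes a likelihood ratio test of a simple null H0: p = p0 for a geometric distribution on 0, 1, 2, ..., which may differ from the hypotheses the thesis studies. All function names and parameter values are hypothetical.

```python
import math
import random

CHI2_1_CRIT_95 = 3.841  # approximate 0.95 quantile of the chi-square distribution with 1 df

def geometric_loglik(p, n, total):
    """Log-likelihood of n draws summing to `total`, with pmf p * (1 - p)**k."""
    ll = n * math.log(p)
    if total > 0:  # skip the 0 * log(0) term when every draw is zero
        ll += total * math.log(1.0 - p)
    return ll

def draw_geometric(p, rng):
    """Inverse-transform draw of the number of failures before the first success."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return int(math.log(u) / math.log(1.0 - p))

def lrt_type_one_error(p0, n, reps, seed=0):
    """Estimate the rejection rate of the LRT when H0: p = p0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        total = sum(draw_geometric(p0, rng) for _ in range(n))
        p_hat = n / (n + total)  # maximum likelihood estimate
        stat = 2.0 * (geometric_loglik(p_hat, n, total) - geometric_loglik(p0, n, total))
        if stat > CHI2_1_CRIT_95:
            rejections += 1
    return rejections / reps

rate = lrt_type_one_error(p0=0.3, n=50, reps=2000)
print(f"estimated type I error at nominal 5%: {rate:.3f}")
```

A type II error estimate would follow the same loop with samples drawn from an alternative value of p instead of p0, counting non-rejections.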


College Students’ Personality Traits In Relation To Career Readiness, Shelby R. Overacker, Carly E. Kalis, Francesca Coppola 2018 Gettysburg College

Student Publications

This study examined sixty-one Gettysburg College juniors and seniors (31 males, 30 females) to measure how the Big Five personality traits, and whether a student has Type D characteristics, determine if a student is career ready. We collected data through an in-person survey, with questions about personality traits, ambition, career readiness, and demographics. Regression was used to statistically analyze our first hypothesis. The results found that there is a significant positive association between conscientiousness and career readiness, but there is no significant association between extraversion and career readiness. For the second hypothesis, a mediation model was used. We found that ...


Perceptions Of Transactional And Transformational Leaders According To Gender, Quinn I. Igram, Andrew N. Garstka, Lindsay D. Harris 2018 Gettysburg College

Student Publications

The lack of females occupying leadership positions in the modern workplace prompted this study. In order to better understand the perceptions that exist regarding successful leadership, this study was conducted with the intention of understanding individual leadership style through the Multifactor Leadership Questionnaire, which measures transactional and transformational leadership styles (Bass and Avolio, 1993). Sixty-four male and female participants (36 students and 28 individuals in the workforce, ages 18-61, with an average age of 31) answered 21 questions to assess their leadership style and 1 to measure who they perceived as a successful ...


Pricing Asian Options: Volatility Forecasting As A Source Of Downside Risk, Adam T. Diehl 2018 James Madison University

Undergraduate Economic Review

Asian options are a class of derivative securities whose payoffs average movements in the underlying asset as a means of hedging exposure to unexpected market behavior. We find that despite their volatility-smoothing properties, the price of an Asian option is sensitive to the choice of volatility model employed to price it from market data. We estimate the errors induced by two common schemes of forecasting volatility and their potential impact upon trading.
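
To make the sensitivity to the volatility input concrete, here is a minimal Monte Carlo sketch of an arithmetic-average Asian call priced under geometric Brownian motion with a constant volatility. It does not reproduce the paper's forecasting schemes or data: the model choice, the function name `asian_call_mc`, and every parameter value are assumptions made for illustration.

```python
import math
import random

def asian_call_mc(s0, strike, rate, vol, maturity, steps, paths, seed=0):
    """Monte Carlo price of an arithmetic-average Asian call under GBM."""
    rng = random.Random(seed)
    dt = maturity / steps
    drift = (rate - 0.5 * vol ** 2) * dt
    shock = vol * math.sqrt(dt)
    payoff_sum = 0.0
    for _ in range(paths):
        price = s0
        running = 0.0
        for _ in range(steps):
            price *= math.exp(drift + shock * rng.gauss(0.0, 1.0))
            running += price
        # Payoff depends on the average of the monitored prices, not the final price.
        payoff_sum += max(running / steps - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / paths

# Hypothetical inputs: the same option priced with two different volatility forecasts.
low_vol_price = asian_call_mc(100.0, 100.0, 0.02, 0.15, 1.0, 12, 5000)
high_vol_price = asian_call_mc(100.0, 100.0, 0.02, 0.25, 1.0, 12, 5000)
print(f"vol 15%: {low_vol_price:.2f}   vol 25%: {high_vol_price:.2f}")
```

Both calls use the same seed, so the two prices share the same random draws and their gap reflects the volatility input rather than simulation noise.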


On Some Ridge Regression Estimators For Logistic Regression Models, Ulyana P. Williams 2018 Florida International University

FIU Electronic Theses and Dissertations

The purpose of this research is to investigate the performance of some ridge regression estimators for the logistic regression model in the presence of moderate to high correlation among the explanatory variables. As a performance criterion, we use the mean square error (MSE), the mean absolute percentage error (MAPE), the magnitude of bias, and the percentage of times the ridge regression estimator produces a higher MSE than the maximum likelihood estimator. A Monte Carlo simulation study has been executed to compare the performance of the ridge regression estimators under different experimental conditions. The degree of correlation, sample size, number of ...


On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar 2018 Florida International University

FIU Electronic Theses and Dissertations

Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo ...


Default Priors For The Intercept Parameter In Logistic Regressions, Philip S. Boonstra, Ryan P. Barbaro, Ananda Sen 2018 The University of Michigan

The University of Michigan Department of Biostatistics Working Paper Series

In logistic regression, separation refers to the situation in which a linear combination of predictors perfectly discriminates the binary outcome. Because finite-valued maximum likelihood parameter estimates do not exist under separation, Bayesian regressions with informative shrinkage of the regression coefficients offer a suitable alternative. Little attention has been given to whether and how to shrink the intercept parameter. Based upon classical studies of separation, we argue that efficiency in estimating regression coefficients may vary with the intercept prior. We adapt alternative prior distributions for the intercept that downweight implausibly extreme regions of the parameter space rendering less sensitivity to separation ...
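
The claim that no finite maximum likelihood estimate exists under separation can be checked numerically. The sketch below is a simplification, not the paper's model: it assumes a single predictor and no intercept, and the six data points are invented. The outcome is 1 exactly when the predictor is positive, so the log-likelihood keeps rising toward zero as the slope grows.

```python
import math

def log_likelihood(beta, xs, ys):
    """Bernoulli log-likelihood of a no-intercept logistic model with slope beta."""
    total = 0.0
    for x, y in zip(xs, ys):
        # signed > 0 means the model leans toward the observed label.
        signed = beta * x if y == 1 else -beta * x
        # log(sigmoid(signed)), written to avoid overflow for either sign.
        if signed >= 0:
            total += -math.log1p(math.exp(-signed))
        else:
            total += signed - math.log1p(math.exp(signed))
    return total

# Invented, perfectly separated toy data (assumption): y = 1 exactly when x > 0.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

values = [log_likelihood(b, xs, ys) for b in (1.0, 5.0, 25.0)]
for b, v in zip((1.0, 5.0, 25.0), values):
    print(f"beta = {b:5.1f}   log-likelihood = {v:.6f}")
```

Because every increase in the slope improves the fit, an optimizer has no finite maximum to converge to, which is the motivation for shrinkage priors given in the abstract.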

