Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Applied Statistics

Theses/Dissertations

2017

Articles 1 - 21 of 21

Full-Text Articles in Statistical Models

Statistical Analysis Of Momentum In Basketball, Mackenzi Stump Dec 2017

Honors Projects

The “hot hand” in sports has been debated for as long as sports have been around. The debate involves whether streaks and slumps in sports are true phenomena or simply perceptions in the mind of the human viewer. This statistical analysis of momentum in basketball analyzes the distribution of time between scoring events for the BGSU Women’s Basketball team from 2011-2017. We discuss how the distribution of time between scoring events changes with normal game factors such as location of the game, game outcome, and several other factors. If scoring events during a game were always randomly distributed, or …


Making Models With Bayes, Pilar Olid Dec 2017

Electronic Theses, Projects, and Dissertations

Bayesian statistics is an important approach to modern statistical analyses. It allows us to use our prior knowledge of the unknown parameters to construct a model for our data set. The foundation of Bayesian analysis is Bayes' Rule, which in its proportional form indicates that the posterior is proportional to the prior times the likelihood. We will demonstrate how we can apply Bayesian statistical techniques to fit a linear regression model and a hierarchical linear regression model to a data set. We will show how to apply different distributions to Bayesian analyses and how the use of a prior affects …
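The proportional form of Bayes' Rule mentioned in this abstract can be illustrated with a small grid approximation. This is only a sketch under stated assumptions, not the thesis's method: the binomial likelihood, the 101-point grid, the flat prior, and the names `grid_posterior` and `map_estimate` are all hypothetical choices made for illustration.

```python
def grid_posterior(successes, trials, prior, grid):
    """Posterior weights on `grid` for a binomial success probability.

    Hypothetical helper: applies posterior proportional to prior * likelihood
    at each grid point, then normalizes so the weights sum to 1.
    """
    # Binomial likelihood up to a constant: theta^k * (1 - theta)^(n - k)
    likelihood = [t ** successes * (1.0 - t) ** (trials - successes) for t in grid]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Assumed toy data: 7 successes in 10 trials, flat prior on [0, 1].
grid = [i / 100 for i in range(101)]
flat_prior = [1.0] * len(grid)
posterior = grid_posterior(7, 10, flat_prior, grid)

# Grid point with the highest posterior weight.
map_estimate = grid[posterior.index(max(posterior))]
```

Passing a different `prior` list (for example, one concentrated near 0.5) shows how the choice of prior shifts the posterior, which is the effect the abstract refers to.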


Statistical Modelling, Optimal Strategies And Decisions In Two-Period Economies, Jiang Wu Nov 2017

Electronic Thesis and Dissertation Repository

Motivated by some real problems, our thesis puts forward two general two-period pricing models and explores optimal buying and selling strategies in two states of the two-period decision, when buyer/seller's decisions in the two periods are uncertain: commodity valuations may or may not be independent, may or may not follow the same distribution, be heavily or just lightly influenced by exogenous economic conditions, and so on. For both the example of buying laptops and the example of selling houses, the connections between each example and the two-envelope paradox encourage us to explore optimal strategies based on the works of McDonnell …


Data-Adaptive Kernel Support Vector Machine, Xin Liu Nov 2017

Electronic Thesis and Dissertation Repository

In this thesis, we propose the data-adaptive kernel Support Vector Machine (SVM), a new method with a data-driven scaling kernel function based on real data sets. This two-stage approach of kernel function scaling can enhance the accuracy of a support vector machine, especially when the data are imbalanced. Following the standard SVM procedure in the first stage, the proposed method locally adapts the kernel function to data locations based on the skewness of the class outcomes. In the second stage, the decision rule is constructed with the data-adaptive kernel function and is used as the classifier. This process enlarges …


Examination And Comparison Of The Performance Of Common Non-Parametric And Robust Regression Models, Gregory F. Malek Aug 2017

Electronic Theses and Dissertations

Gregory Frank Malek, Stephen F. Austin State University, Masters in Statistics Program, Nacogdoches, Texas, U.S.A. (g_m_2002@live.com)

This work investigated common alternatives to the least-squares regression method in the presence of non-normally distributed errors. An initial literature review identified a variety of alternative methods, including Theil Regression, Wilcoxon Regression, Iteratively Re-Weighted Least Squares, Bounded-Influence Regression, and Bootstrapping methods. These methods were evaluated using a simple simulated example data set, as well as various real data sets, including math proficiency data, Belgian telephone call data, and faculty …
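As a rough illustration of why a method such as Theil regression is attractive when errors are not normally distributed, the sketch below computes the slope as the median of all pairwise slopes. It assumes the common Theil-Sen formulation; the intercept rule (median of `y - slope * x`), the function name `theil_sen`, and the toy data are illustrative assumptions and are not taken from the thesis.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Median-of-pairwise-slopes line fit (a sketch of Theil-Sen regression)."""
    # Slope between every pair of points with distinct x values.
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = median(slopes)
    # One common convention for the intercept: median of y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Assumed toy data: y = 2x, except the last point, which is a gross outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 60]
slope, intercept = theil_sen(x, y)
```

On this toy data the single outlier changes only five of the fifteen pairwise slopes, so the median slope stays on the line followed by the other five points, whereas a least-squares fit would be pulled toward the outlier.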


Using Mountain Snowpack To Predict Summer Water Availability In Semiarid Mountain Watersheds, Rebecca Dawn Garst Aug 2017

Boise State University Theses and Dissertations

In the mountainous landscapes of the western United States, water resources are dominated by snowpack. As temperatures rise in spring and summer, the melting snow produces an increase in river flow levels. Reservoirs are used during this increase to retain surplus water, which is released to supplement growing season water supply once the peak flows decrease to below water demands. Once there is no longer a surplus natural flow of water, the water accounting changes, a point referred to as the day of allocation (DOA), and water previously retained within the reservoir is used to supplement the lower flow levels. The amount …


Gridiron-Gurus Final Report: Fantasy Football Performance Prediction, Kyle Tanemura, Michael Li, Erica Dorn, Ryan Mckinney Jun 2017

Computer Science and Software Engineering

Gridiron Gurus is a desktop application that allows for the creation of custom AI profiles that users can consult for advice and compete against in a fantasy football setting. Our AI profiles are capable of performing statistical prediction of players on both a season-long and week-to-week basis, giving them the ability to both draft and manage a fantasy football team throughout a season.


Comparison Of Survival Curves Between Cox Proportional Hazards, Random Forests, And Conditional Inference Forests In Survival Analysis, Brandon Weathers May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Survival analysis methods are a mainstay of the biomedical fields but are finding increasing use in other disciplines including finance and engineering. A widely used tool in survival analysis is the Cox proportional hazards regression model. For this model, all the predicted survivor curves have the same basic shape, which may not be a good approximation to reality. In contrast, the Random Survival Forests method does not make the proportional hazards assumption and has the flexibility to model survivor curves that are of quite different shapes for different groups of subjects. We applied both techniques to a number of publicly available …
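The remark that all predicted survivor curves share the same basic shape follows from the standard form of the Cox model. The notation below (baseline hazard $h_0$, baseline survivor function $S_0$, covariate vector $x$, coefficient vector $\beta$) is the usual textbook notation and is assumed here; it does not appear in the abstract.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(x^{\top}\beta\right)
\quad\Longrightarrow\quad
S(t \mid x) = S_0(t)^{\exp\left(x^{\top}\beta\right)}
```

Every subject's survivor curve is therefore the same baseline curve $S_0(t)$ raised to a subject-specific positive power, so curves for different subjects cannot cross.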


A Comparison Of Statistical Methods Relating Pairwise Distance To A Binary Subject-Level Covariate, Rachael Stone May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

A community ecologist provided a motivating data set involving a certain animal species with two behavior groups, along with a pairwise genetic distance matrix among individuals. Many community ecologists have analyzed similar data sets with a method known as the Hopkins method, testing for an association between the subject-level covariate (behavior group) and the pairwise distance. This community ecologist wanted to know whether their results would be meaningful if they used the Hopkins method. Their question inspired this thesis work, where a different data set was used for confidentiality reasons. Multiple methods (Hopkins method, ADONIS, ANOSIM, and Distance Regression) were used …


A Bayesian Variable Selection Method With Applications To Spatial Data, Xiahan Tang May 2017

Graduate Theses and Dissertations

This thesis first describes the general idea behind Bayesian inference, various sampling methods based on Bayes' theorem, and many examples. Then a Bayesian approach to model selection, called Stochastic Search Variable Selection (SSVS), is discussed. It was originally proposed by George and McCulloch (1993). In a normal regression model where the number of covariates is large, only a small subset tends to be significant most of the time. This Bayesian procedure specifies a mixture prior for each of the unknown regression coefficients; the mixture prior was originally proposed by Geweke (1996). This mixture prior will be updated as data becomes …


Performance Of Imputation Algorithms On Artificially Produced Missing At Random Data, Tobias O. Oketch May 2017

Electronic Theses and Dissertations

Missing data is one of the challenges we face today in building valid statistical models. It reduces the representativeness of the data samples. Hence, population estimates and model parameters estimated from such data are likely to be biased.

However, the missing data problem is an area under study, and better alternative statistical procedures have been presented to mitigate its shortcomings. In this paper, we review causes of missing data and various methods of handling missing data. Our main focus is evaluating various multiple imputation (MI) methods from the Multivariate Imputation by Chained Equations (MICE) package in the statistical software …


Modelling Cash Crop Growth In Tn, Spencer Weston May 2017

Chancellor’s Honors Program Projects

No abstract provided.


Network Exploration Of Correlated Multivariate Protein Data For Alzheimer's Disease Association, Matthew J. Lane Apr 2017

Theses

Alzheimer's disease (AD) is difficult to diagnose using genetic testing or other traditional methods. Unlike diseases with simple genetic risk components, there exists no single marker determining whether someone will develop AD. Furthermore, AD is highly heterogeneous, and different subgroups of individuals develop the disease due to differing factors. Traditional diagnostic methods using perceivable cognitive deficiencies are often too little, too late, because the brain has already suffered damage from decades of disease progression. In order to observe AD at early stages prior to the observation of cognitive deficiencies, biomarkers with greater accuracy are required. By using …


Inference In Networking Systems With Designed Measurements, Chang Liu Mar 2017

Doctoral Dissertations

Networking systems, consisting of network infrastructures and end-hosts, have been essential in supporting our daily communication, delivering huge amounts of content and large numbers of services, and providing large-scale distributed computing. To monitor and optimize the performance of such networking systems, or to provide flexible functionalities for the applications running on top of them, it is important to know the internal metrics of the networking systems such as link loss rates or path delays. The internal metrics are often not directly available due to the scale and complexity of the networking systems. This motivates the techniques of inference …


Further Advances For The Sequential Multiple Assignment Randomized Trial (Smart), Tianjiao Dai Feb 2017

Dissertations & Theses (Open Access)

Tianjiao Dai, M.S.

Advisory Professor: Sanjay Shete, Ph.D.

Sequential multiple assignment randomized trial (SMART) designs have been developed in recent years for studying adaptive interventions. In my Ph.D. study, I mainly investigate how to further improve SMART designs and optimize the interventions for each individual in the trial. My dissertation has focused on two topics of SMART designs.

1) Developing a novel SMART design that can reduce the cost and side effects associated with the interventions and proposing the corresponding analytic methods. I have developed a time-varying SMART design in …


Neural Network Predictions Of A Simulation-Based Statistical And Graph Theoretic Study Of The Board Game Risk, Jacob Munson Jan 2017

Murray State Theses and Dissertations

We translate the RISK board into a graph which undergoes updates as the game advances. The dissection of the game into a network model in discrete time is a novel approach to examining RISK. A review of the existing statistical findings of skirmishes in RISK is provided. The graphical changes are accompanied by an examination of the statistical properties of RISK. The game is modeled as a discrete time dynamic network graph, with the various features of the game modeled as properties of the network at a given time. As the network is computationally intensive to implement, results are produced …


An Exploratory Statistical Method For Finding Interactions In A Large Dataset With An Application Toward Periodontal Diseases, Joshua Lambert Jan 2017

Theses and Dissertations--Epidemiology and Biostatistics

It is estimated that periodontal diseases affect up to 90% of the adult population. Given the complexity of the host environment, many factors contribute to expression of the disease. Age, Gender, Socioeconomic Status, Smoking Status, and Race/Ethnicity are all known risk factors, as well as a handful of known comorbidities. Certain vitamins and minerals have been shown to be protective for the disease, while some toxins and chemicals have been associated with an increased prevalence. The role of toxins, chemicals, vitamins, and minerals in relation to disease is believed to be complex and potentially modified by known risk factors. A …


Quantifying The Effect Of The Shift In Major League Baseball, Christopher John Hawke Jr. Jan 2017

Senior Projects Spring 2017

Baseball is a very strategic and abstract game, but the baseball world is strangely obsessed with statistics. Modern mainstream statisticians often study offensive data, such as batting average or on-base percentage, in order to evaluate player performance. However, this project observes the game from the opposite perspective: the defensive side of the game. In hopes of analyzing the game from a more concrete perspective, countless mathematicians - most famously, Bill James - have developed numerous statistical models based on real-life data of Major League Baseball (MLB) players. Large numbers of metrics go into these models, but what this project …


Nonparametric Compound Estimation, Derivative Estimation, And Change Point Detection, Sisheng Liu Jan 2017

Theses and Dissertations--Statistics

Firstly, we reviewed some popular nonparametric regression methods from the past several decades. Then we extended the compound estimation (Charnigo and Srinivasan [2011]) to adapt to random design points and heteroskedasticity and proposed a modified Cp criterion for tuning parameter selection. Moreover, we developed a DCp criterion for the tuning parameter selection problem in general nonparametric derivative estimation. This extends the GCp criterion in Charnigo, Hall and Srinivasan [2011] to random design points and heteroskedasticity. Next, we proposed a change point detection method via compound estimation for both the fixed design and random design cases; adaptation to heteroskedasticity was considered for the method. …


A Markov Decision Process Approach To Adaptive Contact Strategies, Artur Grygorian Jan 2017

Electronic Theses and Dissertations

In the field of survey methodology, optimizing contact strategies helps organizations increase response rates using their allocated budget. Markov Decision Processes (MDP) are widely used to model decision-making strategies in situations where the outcomes have a random component. In this research, we use MDPs and adaptive sampling techniques to construct a strategy that, based on target audience characteristics, suggests the best contact policy. The data we use comes from the First Destination Survey conducted by the Office of Career Services at Georgia Southern University. The constructed model is quite flexible and can be used by other organizations to optimize their …


Quasi-Random Action Selection In Markov Decision Processes, Samuel D. Walker Jan 2017

Electronic Theses and Dissertations

In Markov decision processes an operator exploits known data regarding the environment it inhabits. The information exploited is learned from random exploration of the state-action space. This paper proposes to optimize exploration through the implementation of quasi-random sequences in both discrete and continuous state-action spaces. For the discrete case a permutation is applied to the indices of the action space to avoid repetitive behavior. In the continuous case sequences of low discrepancy, such as Halton sequences, are utilized to disperse the actions more uniformly.
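For the continuous case described in this abstract, a Halton sequence can be generated with the radical-inverse construction. The sketch below is illustrative only and is not the paper's implementation: the bases (2, 3), the function names, and the `scale_action` helper that maps a unit-interval value onto an action range are assumptions introduced here.

```python
def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, bases=(2, 3)):
    """First `n_points` Halton points in the unit cube, one coprime base per axis."""
    return [
        tuple(radical_inverse(i, b) for b in bases)
        for i in range(1, n_points + 1)
    ]


def scale_action(u, low, high):
    """Hypothetical helper: map u in [0, 1) onto an action interval [low, high)."""
    return low + u * (high - low)


points = halton(4)
# Assumed example: a one-dimensional action range of [-1, 1].
actions = [scale_action(p[0], -1.0, 1.0) for p in points]
```

In base 2 the first coordinates come out as 0.5, 0.25, 0.75, 0.125, so each new point falls into a gap left by the earlier ones. That gap-filling behavior is what makes the sequence low-discrepancy and spreads actions more evenly than independent random draws.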