Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Prediction

Articles 1 - 30 of 66

Full-Text Articles in Statistics and Probability

Population Modeling With Machine Learning Can Enhance Measures Of Mental Health - Open-Data Replication, Ty Easley, Ruiqi Chen, Kayla Hannon, Rosie Dutt, Janine Bijsterbosch Jun 2023

Statistical and Data Sciences: Faculty Publications

Efforts to predict trait phenotypes based on functional MRI data from large cohorts have been hampered by low prediction accuracy and/or small effect sizes. Although these findings are highly replicable, the small effect sizes are somewhat surprising given the presumed brain basis of phenotypic traits such as neuroticism and fluid intelligence. We aim to replicate previous work and additionally test multiple data manipulations that may improve prediction accuracy by addressing data pollution challenges. Specifically, we added additional fMRI features, averaged the target phenotype across multiple measurements to obtain more accurate estimates of the underlying trait, balanced the target phenotype's distribution …


Data-Optimized Spatial Field Predictions For Robotic Adaptive Sampling: A Gaussian Process Approach, Zachary Nathan May 2023

Computer Science Senior Theses

We introduce a framework that combines Gaussian Process models, robotic sensor measurements, and sampling data to predict spatial fields. In this context, a spatial field refers to the distribution of a variable throughout a specific area, such as temperature or pH variations over the surface of a lake. Whereas existing methods tend to analyze only the particular field(s) of interest, our approach optimizes predictions through the effective use of all available data. We validated our framework on several datasets, showing that errors can decline by up to two-thirds through the inclusion of additional colocated measurements. In support of adaptive sampling, …


A Course In Data Science: R And Prediction Modeling, Adam Kapelner May 2022

Open Educational Resources

This is a self-contained course in data science and machine learning using R. It covers the philosophy of modeling with data, prediction via linear models, machine learning including support vector machines and random forests, probability estimation and asymmetric costs using logistic regression and probit regression, underfitting vs. overfitting, model validation, handling missingness and much more. There is formal instruction in data manipulation using dplyr and data.table, visualization using ggplot2, and statistical computing.


Attempting To Predict The Unpredictable: March Madness, Coleton Kanzmeier May 2022

Theses/Capstones/Creative Projects

Each year, millions upon millions of individuals fill out at least one if not hundreds of March Madness brackets. People test their luck every year, whether for fun, with friends or family, or even to win some money. Some people rely on their basketball knowledge whereas others know it is called March Madness for a reason and take a shot in the dark. Others have even tried using statistics to give them an edge. I intend to follow a similar approach, using statistics to my advantage. The end goal is to predict this year’s (2022) March Madness bracket. To achieve …


Penalized Estimation Of Autocorrelation, Xiyan Tan May 2022

All Dissertations

This dissertation explored the idea of a penalized method for estimating the autocorrelation (ACF) and partial autocorrelation (PACF), in order to solve the problem that the sample (partial) autocorrelation underestimates the magnitude of the (partial) autocorrelation in stationary time series. Although finite sample bias corrections can be found under specific assumed models, no general formulae are available. We introduce a novel penalized M-estimator for (partial) autocorrelation, with the penalty pushing the estimator toward a target selected from the data. This both encapsulates and differs from previous attempts at penalized estimation for autocorrelation, which shrink the estimator toward the target value of zero. …


Intraday Algorithmic Trading Using Momentum And Long Short-Term Memory Network Strategies, Andrew R. Whitinger II May 2022

Undergraduate Honors Theses

Intraday stock trading is an infamously difficult and risky strategy. Momentum and reversal strategies and long short-term memory (LSTM) neural networks have been shown to be effective for selecting stocks to buy and sell over time periods of multiple days. To explore whether these strategies can be effective for intraday trading, their implementations were simulated using intraday price data for stocks in the S&P 500 index, collected at 1-second intervals between February 11, 2021 and March 9, 2021 inclusive. The study tested 160 variations of momentum and reversal strategies for profitability in long, short, and market-neutral portfolios, totaling 480 portfolios. …


Flexible Modelling Of Time-Dependent Covariate Effects With Correlated Competing Risks: Application To Hereditary Breast And Ovarian Cancer Families, Seungwoo Lee Apr 2022

Electronic Thesis and Dissertation Repository

This thesis aims to develop a flexible approach for modelling time-dependent covariate effects on event risk using B-splines in the presence of correlated competing risks. The performance of the proposed model was evaluated via simulation in terms of the bias and precision of the estimation of the parameters and penetrance functions. In addition, we extended the concordance index to account for time-dependent effects and competing events simultaneously and demonstrated its inference procedures. We applied our proposed methods to data arising from the BRCA1 mutation families from the breast cancer family registry to evaluate the time-dependent effects of mammographic screening and …


Statistical Analysis Of 2017-18 Premier League Match Statistics Using A Regression Analysis In R, Bergen Campbell May 2021

Undergraduate Theses and Capstone Projects

This thesis analyzes the correlation between a team’s statistics and the success of their performances, and develops a predictive model that can be used to forecast final season results for that team. Data from the 2017-2018 Premier League season is to be gathered and broken down within R to highlight what factors and variables are largely contributing to the success or downfall of a team. A multiple linear regression model and stepwise selection process are then used to include any factors that are significant in predicting match results.

The predictions about the 17-18 season results based on the model …


Regression Modeling And Prediction By Individual Observations Versus Frequency, Stan Lipovetsky Feb 2020

Journal of Modern Applied Statistical Methods

A regression model built from a dataset could sometimes demonstrate a low quality of fit and poor predictions of individual observations. However, using the frequencies of possible combinations of the predictors and the outcome, the same models with the same parameters may yield a high quality of fit and precise predictions for the frequencies of the outcome occurrence. Linear and logistic regressions are used to make an explicit exposition of the results of regression modeling and prediction.


Generalized Matrix Decomposition Regression: Estimation And Inference For Two-Way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, Jing Ma Dec 2019

UW Biostatistics Working Paper Series

Analysis of two-way structured data, i.e., data with structures among both variables and samples, is becoming increasingly common in ecology, biology and neuroscience. Classical dimension-reduction tools, such as the singular value decomposition (SVD), may perform poorly for two-way structured data. The generalized matrix decomposition (GMD, Allen et al., 2014) extends the SVD to two-way structured data and thus constructs singular vectors that account for both structures. While the GMD is a useful dimension-reduction tool for exploratory analysis of two-way structured data, it is unsupervised and cannot be used to assess the association between such data and an outcome of interest. …


Machine Learning In Support Of Electric Distribution Asset Failure Prediction, Robert D. Flamenbaum, Thomas Pompo, Christopher Havenstein, Jade Thiemsuwan Aug 2019

SMU Data Science Review

In this paper, we present novel approaches to predicting asset failure in the electric distribution system. Failures in overhead power lines and their associated equipment, in particular, pose significant financial and environmental threats to electric utilities. Electric device failure furthermore poses a burden on customers and can pose serious risk to life and livelihood. Working with asset data acquired from an electric utility in Southern California, and incorporating environmental and geospatial data from around the region, we applied a Random Forest methodology to predict which overhead distribution lines are most vulnerable to failure. Our results provide evidence …


A Bayesian Framework For Estimating Seismic Wave Arrival Time, Hua Zhong May 2019

Graduate Theses and Dissertations

Because earthquakes have a large impact on human society, statistical methods for better studying earthquakes are required. One characteristic of earthquakes is the arrival time of seismic waves at a seismic signal sensor. Once we can estimate the earthquake arrival time accurately, the earthquake location can be triangulated, and assistance can be sent to that area correctly. This study presents a Bayesian framework to predict the arrival time of seismic waves with associated uncertainty. We use a change point framework to model the different conditions before and after the seismic wave arrives. To evaluate the performance of the model, we …


Genomic Prediction Using Canopy Coverage Image And Genotypic Information In Soybean Via A Hybrid Model, Reka Howard, Diego Jarquin Jan 2019

Department of Statistics: Faculty Publications

Prediction techniques are important in plant breeding as they provide a tool for selection that is more efficient and economical than traditional phenotypic and pedigree-based selection. The conventional genomic prediction models include molecular marker information to predict the phenotype. With the development of new phenomics techniques, we have the opportunity to collect image data on the plants and extend the traditional genomic prediction models to incorporate a diverse set of information collected on the plants. In our research, we developed a hybrid matrix model that incorporates molecular marker and canopy coverage information as a weighted linear combination to predict …


A Tacticians Guide To Conflict, Vol. 1: Advancing Explanations & Predictions Of Intrastate Conflict, Khaled Eid Jan 2019

CGU Theses & Dissertations

Intrastate conflict is an ever-evolving problem: causes, explanations, and predictions are increasingly murky as traditional methods of analysis focus on structural issues as precursors of conflict. Oftentimes these theories do not consider the underlying meso and micro dynamics that can provide vital insights into the phenomena. Tactical decision-makers are left using models that rely on highly aggregated, country-level data to create proper courses of action (COAs) to address or predict conflict. The shortcoming is that conflicts morph quite rapidly and structural variables can struggle to capture such dynamic changes. To address this, some tacticians are using big data …


Microarray Data Analysis And Classification Of Cancers, Grant Gates Jan 2019

Williams Honors College, Honors Research Projects

When it comes to cancer, there is no standardized approach for identifying new cancer classes nor is there a standardized approach for assigning cancer tumors to existing classes. These two ideas are known as class discovery and class prediction. For a cancer patient to receive proper treatment, it is important that the type of cancer be accurately identified. For my Senior Honors Project, I would like to use this opportunity to research a topic in bioinformatics. Bioinformatics incorporates a few different subjects into one including biology, computer science and statistics. An intricate method for class discovery and class prediction is …


Development And Internal Validation Of An Aneurysm Rupture Probability Model Based On Patient Characteristics And Aneurysm Location, Morphology, And Hemodynamics, Felicitas J. Detmer, Bong Jae Chung, Fernando Mut, Martin Slawski, Farid Hamzei-Sichani, Christopher Putman, Carlos Jiménez, Juan R. Cebral Nov 2018

Department of Applied Mathematics and Statistics Faculty Scholarship and Creative Works

Purpose: Unruptured cerebral aneurysms pose a dilemma for physicians who need to weigh the risk of a devastating subarachnoid hemorrhage against the risk of surgery or endovascular treatment and their complications when deciding on a treatment strategy. A prediction model could potentially support such treatment decisions. The aim of this study was to develop and internally validate a model for aneurysm rupture based on hemodynamic and geometric parameters, aneurysm location, and patient gender and age. Methods: Cross-sectional data from 1061 patients were used for image-based computational fluid dynamics and shape characterization of 1631 aneurysms for training an aneurysm rupture probability …


Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data, Caroline Kusiak Oct 2018

Masters Theses

Dengue fever affects over 390 million people annually worldwide and is of particular concern in Southeast Asia where it is one of the leading causes of hospitalization. Modeling trends in dengue occurrence can provide valuable information to public health officials; however, many challenges arise depending on the data available. In Thailand, reporting of dengue cases is often delayed by more than 6 weeks, and a small fraction of cases may not be reported until over 11 months after they occurred. This study shows that incorporating data on Google Search trends can improve disease predictions in settings with severely …


Identifying Key Factors Associated With High Risk Asthma Patients To Reduce The Cost Of Health Resources Utilization, Amani Ahmad Oct 2018

LSU Master's Theses

Asthma is associated with frequent use of primary health services and places a burden on the United States economy. Identifying key factors associated with increased cost of asthma is an essential step to improve practices of asthma management.

The aim of this study was to identify factors associated with overutilization of primary health services and increased cost via claims data and to explore the effectiveness of a case management program in reducing overall asthma-related cost.

Claims data analysis for Medicaid insured asthma patients in Louisiana was conducted. Asthma patients were identified using their ICD-9 and ICD-10 codes, forward variable …


Overcoming Small Data Limitations In Heart Disease Prediction By Using Surrogate Data, Alfeo Sabay, Laurie Harris, Vivek Bejugama, Karen Jaceldo-Siegl Aug 2018

SMU Data Science Review

In this paper, we present a heart disease prediction use case showing how synthetic data can be used to address privacy concerns and overcome constraints inherent in small medical research data sets. While advanced machine learning algorithms, such as neural network models, can be implemented to improve prediction accuracy, these require very large data sets which are often not available in medical or clinical research. We examine the use of surrogate data sets comprised of synthetic observations for modeling heart disease prediction. We generate surrogate data, based on the characteristics of original observations, and compare prediction accuracy results achieved from …


Development Of A Statistical Model For Discrimination Of Rupture Status In Posterior Communicating Artery Aneurysms, Felicitas J. Detmer, Bong Jae Chung, Fernando Mut, Michael Pritz, Martin Slawski, Farid Hamzei-Sichani, David Kallmes, Christopher Putman, Carlos Jimenez, Juan R. Cebral Aug 2018

Department of Applied Mathematics and Statistics Faculty Scholarship and Creative Works

Background: Intracranial aneurysms at the posterior communicating artery (PCOM) are known to have high rupture rates compared to other locations. We developed and internally validated a statistical model discriminating between ruptured and unruptured PCOM aneurysms based on hemodynamic and geometric parameters, angio-architectures, and patient age with the objective of its future use for aneurysm risk assessment. Methods: A total of 289 PCOM aneurysms in 272 patients modeled with image-based computational fluid dynamics (CFD) were used to construct statistical models using logistic group lasso regression. These models were evaluated with respect to discrimination power and goodness of fit using tenfold nested …


The Accuracy, Fairness, And Limits Of Predicting Recidivism, Julie Dressel, Hany Farid Jan 2018

Dartmouth Scholarship

Algorithms for predicting recidivism are commonly used to assess a criminal defendant’s likelihood of committing a crime. These predictions are used in pretrial, parole, and sentencing decisions. Proponents of these systems argue that big data and advanced machine learning make these analyses more accurate and less biased than humans. We show, however, that the widely used commercial risk assessment software COMPAS is no more accurate or fair than predictions made by people with little or no criminal justice expertise. We further show that a simple linear predictor provided with only two features is nearly equivalent to COMPAS with its 137 …


A Comparison Of Five Statistical Methods For Predicting Stream Temperature Across Stream Networks, Maike F. Holthuijzen Aug 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The health of freshwater aquatic systems, particularly stream networks, is mainly influenced by water temperature, which controls biological processes and influences species distributions and aquatic biodiversity. Thermal regimes of rivers are likely to change in the future, due to climate change and other anthropogenic impacts, and our ability to predict stream temperatures will be critical in understanding distribution shifts of aquatic biota. Spatial statistical network models take into account spatial relationships but have drawbacks, including high computation times and data pre-processing requirements. Machine learning techniques and generalized additive models (GAM) are promising alternatives to the SSN model. Two machine learning …


Developing Biomarker Combinations In Multicenter Studies Via Direct Maximization And Penalization, Allison Meisner, Chirag R. Parikh, Kathleen F. Kerr Jul 2017

UW Biostatistics Working Paper Series

When biomarker studies involve patients at multiple centers and the goal is to develop biomarker combinations for diagnosis, prognosis, or screening, we consider evaluating the predictive capacity of a given combination with the center-adjusted AUC (aAUC), a summary of conditional performance. Rather than using a general method to construct the biomarker combination, such as logistic regression, we propose estimating the combination by directly maximizing the aAUC. Furthermore, it may be desirable to have a biomarker combination with similar predictive capacity across centers. To that end, we allow for penalization of the variability in center-specific performance. We demonstrate good asymptotic properties …


Gridiron-Gurus Final Report: Fantasy Football Performance Prediction, Kyle Tanemura, Michael Li, Erica Dorn, Ryan Mckinney Jun 2017

Computer Science and Software Engineering

Gridiron Gurus is a desktop application that allows for the creation of custom AI profiles to advise users and to compete against in a Fantasy Football setting. Our AI profiles are capable of performing statistical prediction of players on both a season-long and week-to-week basis, giving them the ability to both draft and manage a fantasy football team throughout a season.


Predicting Future Years Of Life, Health, And Functional Ability: A Healthy Life Calculator For Older Adults, Paula Diehr, Michael Diehr, Alice M. Arnold, Laura Yee, Michelle C. Odden, Calvin H. Hirsch, Stephen Thielke, Bruce Psaty, W Craig Johnson, Jorge Kizer, Anne B. Newman Jan 2017

UW Biostatistics Working Paper Series

Introduction

Planning for the future would be easier if we knew how long we will live and, more importantly, how many years we will be healthy and able to enjoy it. There are few well-documented aids for predicting our future health. We attempted to meet this need for persons 65 years of age and older.

Methods

Data came from the Cardiovascular Health Study, a large longitudinal study of older adults that began in 1990. Years of life (YOL) were defined by measuring time to death. Years of healthy life (YHL) were defined by an annual question about self-rated health, and …


Longitudinal Measurement And Hierarchical Classification Framework For The Prediction Of Alzheimer's Disease, Meiyan Huang, Wei Yang, Qianjin Feng, Wufan Chen, Michael Weiner, Paul Aisen, Ronald Petersen, Clifford R. Jack Jr., William Jagust, John Trojanowki, Arthur W. Toga, Laurel Beckett, Robert C. Green, Andrew Saykin, John Morris, Leslie M. Shaw, Jeffrey Kaye, Joseph Quinn, Lisa Silbert, Betty Lind, Raina Carter, Sara Dolen, Lon S. Schneider, Sonia Pawluczyk, Mauricio Beccera, Liberty Teodoro, Bryan Spann, James Brewer, Helen Vanderswag, Adam Fleisher, Charles D. Smith, Greg A. Jicha, Peter A. Hardy, Partha Sinha, Elizabeth Oates, Gary Conrad Jan 2017

Neurology Faculty Publications

Accurate prediction of Alzheimer’s disease (AD) is important for the early diagnosis and treatment of this condition. Mild cognitive impairment (MCI) is an early stage of AD. Therefore, patients with MCI who are at high risk of fully developing AD should be identified to accurately predict AD. However, the relationship between brain images and AD is difficult to construct because of the complex characteristics of neuroimaging data. To address this problem, we present a longitudinal measurement of MCI brain images and a hierarchical classification method for AD prediction. Longitudinal images obtained from individuals with MCI were investigated to acquire important …


A Bayes Interpretation Of Stacking For M-Complete And M-Open Settings, Tri Le, Bertrand S. Clarke Jan 2017

Department of Statistics: Faculty Publications

In M-open problems where no true model can be conceptualized, it is common to back off from modeling and merely seek good prediction. Even in M-complete problems, taking a predictive approach can be very useful. Stacking is a model averaging procedure that gives a composite predictor by combining individual predictors from a list of models using weights that optimize a cross validation criterion. We show that the stacking weights also asymptotically minimize a posterior expected loss. Hence we formally provide a Bayesian justification for cross-validation. Often the weights are constrained to be positive and sum to one. For greater generality, …


Predictive Modeling Of Adolescent Cannabis Use From Multimodal Data, Philip Spechler Jan 2017

Graduate College Dissertations and Theses

Predicting teenage drug use is key to understanding the etiology of substance abuse. However, classic predictive modeling procedures are prone to overfitting and fail to generalize to independent observations. To mitigate these concerns, cross-validated logistic regression with elastic-net regularization was used to predict cannabis use by age 16 from a large sample of fourteen-year-olds (N=1,319). High-dimensional data (p = 2,413) including parent and child psychometric data, child structural and functional MRI data, and genetic data (candidate single-nucleotide polymorphisms, "SNPs") collected at age 14 were used to predict the initiation of cannabis use (minimum six occasions) by age 16. …


Optimal Strategy For Gambling Pools, Aaron C. Brown Jun 2016

International Conference on Gambling & Risk Taking

In gambling pools, entrants submit predictions and the prizes are awarded to the prediction or predictions closest to actual outcomes. Some well-known examples are football pools (both the global and American game versions), toto, NCAA March Madness bracket pools and horse racing tournaments. For small pools with complete information about outcome probabilities, exact game theory optimal solutions are straightforward to compute. If there is also complete information about the number and strategy of other players, optimal exploitive strategies are even easier to derive. These problems have been treated in the literature.

This paper argues that the complete information approaches are …


Well I'll Be Damned - Insights Into Predictive Value Of Pedigree Information In Horse Racing, Timothy Baker Mr, Ming-Chien Sung, Johnnie Johnson Professor, Tiejun Ma Jun 2016

International Conference on Gambling & Risk Taking

Fundamental form characteristics, like how fast a horse ran at its last start, are widely used to help predict the outcome of horse racing events. The exception is races where horses haven’t previously competed, such as Maiden races, where there is little or no publicly available past performance information. In these types of events, bettors need only consider a simplified suite of factors; however, this is offset by a higher level of uncertainty. This paper examines the inherent information content embedded within a horse’s ancestry and the extent to which this information is discounted in the United Kingdom bookmaker …