# Probability Commons™

Articles 1 - 30 of 315

## Full-Text Articles in Probability

#### Characterizing The Permanence And Stationary Distribution For A Family Of Malaria Stochastic Models, Divine Wanduku

##### Biology and Medicine Through Mathematics Conference

No abstract provided.

May 2019

#### Characterizing The Tails Of Degree Distributions In Real-World Networks, Anna Broido

##### Applied Mathematics Graduate Theses & Dissertations

This is a thesis about how to characterize the statistical structure of the tails of degree distributions of real-world networks. The primary contribution is a statistical test of the prevalence of scale-free structure in real-world networks. A central claim in modern network science is that real-world networks are typically "scale free," meaning that the fraction of nodes with degree $k$ follows a power law, decaying like $k^{-\alpha}$, often with $2 < \alpha < 3$. However, empirical evidence for this belief derives from a relatively small number of real-world networks. In the first section, we test the universality of scale-free structure by applying state-of-the-art statistical tools to a large corpus of nearly 1000 network data sets drawn from social, biological, technological, and informational sources. We fit the power-law model to each degree distribution, test its statistical plausibility, and compare it via a likelihood ratio test to alternative, non-scale-free models, e.g., the log-normal. Across domains, we find that scale-free networks are rare, with only 4% exhibiting the strongest-possible evidence of scale-free structure and 52% exhibiting the weakest-possible evidence. Furthermore, evidence of scale-free structure is not uniformly distributed across sources: social networks are at best weakly scale free, while a handful of technological and biological networks can be called strongly scale free. These results undermine the universality of scale-free networks and reveal that real-world networks exhibit a rich structural diversity that will likely require new ideas and mechanisms to explain.

A core methodological component of addressing the ubiquity of scale-free structure in real-world networks is an ability to fit a power law to the degree distribution. In the second section, we numerically evaluate and compare, using both synthetic data with known structure and real-world data with unknown structure, two statistically principled methods for estimating the tail parameters for power-law distributions, showing that in practice, a method based on extreme value theory and a sophisticated bootstrap and the more commonly used method based on an empirical minimization approach exhibit similar accuracy.
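As a rough illustration of what fitting a power-law tail can involve, the sketch below draws synthetic data from a continuous power law and recovers the exponent with the standard maximum-likelihood formula $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$. It is not the thesis's procedure: the continuous approximation, the fixed and known `x_min`, and the function names `sample_power_law` and `fit_alpha` are all assumptions made for this example. Degree data are discrete and `x_min` normally has to be estimated as well.

```python
import math
import random


def sample_power_law(alpha, x_min, n, rng):
    """Inverse-transform samples from a density proportional to x**(-alpha), x >= x_min."""
    # 1 - rng.random() lies in (0, 1], so the negative power is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(xs, x_min):
    """Continuous maximum-likelihood estimate of the exponent for the tail x >= x_min."""
    tail = [x for x in xs if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = sample_power_law(alpha=2.5, x_min=1.0, n=50_000, rng=rng)
alpha_hat = fit_alpha(data, x_min=1.0)
print(f"estimated exponent: {alpha_hat:.3f}")
```

With synthetic data of this size the estimate should land close to the true exponent of 2.5. On real degree sequences the choice of `x_min` and the discreteness of the data matter considerably more.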

Mar 2019

#### Optimal Conditional Expectation At The Video Poker Game Jacks Or Better, Stewart N. Ethier, John J. Kim, Jiyeon Lee

##### UNLV Gaming Research & Review Journal

There are 134,459 distinct initial hands at the video poker game Jacks or Better, taking suit exchangeability into account. A computer program can determine the optimal strategy (i.e., which cards to hold) for each such hand, but a complete list of these strategies would require a book-length manuscript. Instead, a hand-rank table, which fits on a single page and reproduces the optimal strategy perfectly, was found for Jacks or Better as early as the mid 1990s. Is there a systematic way to derive such a hand-rank table? We show that there is indeed, and it involves finding the ...

Mar 2019

#### Surprise Vs. Probability As A Metric For Proof, Edward K. Cheng, Matthew Ginther

##### Edward Cheng

In this Symposium issue celebrating his career, Professor Michael Risinger in Leveraging Surprise proposes using "the fundamental emotion of surprise" as a way of measuring belief for purposes of legal proof. More specifically, Professor Risinger argues that we should not conceive of the burden of proof in terms of probabilities such as 51%, 95%, or even "beyond a reasonable doubt." Rather, the legal system should reference the threshold using "words of estimative surprise," asking jurors how surprised they would be if the fact in question were not true. Toward this goal (and being averse to cardinality), he suggests categories such ...

Feb 2019

#### One-Dimensional Excited Random Walk With Unboundedly Many Excitations Per Site, Omar Chakhtoun

##### All Dissertations, Theses, and Capstone Projects

We study a discrete-time excited random walk on the integer lattice requiring a tail decay estimate on the number of excitations per site and extend the existing framework, methods, and results to a wider class of excited random walks.

We give criteria for recurrence versus transience and for ballisticity versus zero linear speed, completely classify limit laws in the transient regime, and establish functional limit laws in the recurrent regime.

Jan 2019

#### Infinite Sums, Products, And Urn Models, Yiyan Ni

##### Major Papers

This paper considers an urn and its evolution in discrete time steps. The urn initially has two differently colored balls (blue and red). We discuss different cases where k blue balls (k = 1, 2, 3, ...) will be added (or removed) at every step if a blue ball is withdrawn, based on the goal of eventually withdrawing a red ball, P(R eventually). We compute the probability of eventually withdrawing a red ball with two different methods: one using infinite sums and the other using infinite products. One advantage of this is that we can obtain P(R eventually) in a complex ...

#### Analysis Of Ranked Gene Tree Probability Distributions Under The Coalescent Process For Detecting Anomaly Zones, Anastasiia Kim

##### Shared Knowledge Conference

In phylogenetic studies, gene trees are used to reconstruct species trees. Under the multispecies coalescent model, gene tree topologies may differ from that of the species tree. An incorrect gene tree topology (one that does not match the species tree) that is more probable than the correct one is termed an anomalous gene tree (AGT). Species trees that can generate such AGTs are said to be in the anomaly zone (AZ). In this region, the method of choosing the most common gene tree as the estimate of the species tree will be inconsistent and will converge to an incorrect species tree when ...

Nov 2018

#### Statistical Investigation Of Road And Railway Hazardous Materials Transportation Safety, Amirfarrokh Iranitalab

##### Civil Engineering Theses, Dissertations, and Student Research

Transportation of hazardous materials (hazmat) in the United States (U.S.) constituted 22.8% of the total tonnage transported in 2012 with an estimated value of more than 2.3 billion dollars. As such, hazmat transportation is a significant economic activity in the U.S. However, hazmat transportation exposes people and the environment to the infrequent but potentially severe consequences of incidents resulting in hazmat release. Trucks and trains carried 63.7% of the hazmat in the U.S. in 2012 and are the major foci of this dissertation. The main research objectives were 1) identification and quantification of the effects ...

Oct 2018

#### Probabilities Involving Standard Trirectangular Tetrahedral Dice Rolls, Rulon Olmstead, Doneliezer Baize

The goal is to calculate probabilities involving rolls of irregularly shaped dice. Here we attempt to model the probabilities of rolling standard tri-rectangular tetrahedral dice on a hard surface, such as a table top. The vertices and edges of a tetrahedron were projected onto the surface of a sphere centered at the center of mass of the tetrahedron. By calculating the surface areas bounded by the resultant geodesics, baseline probabilities were obtained. Using a 3D printer, dice were constructed of uniform density and the results of rolling them were recorded. After calculating the corresponding confidence intervals, the ...

Oct 2018

#### Season-Ahead Forecasting Of Water Storage And Irrigation Requirements – An Application To The Southwest Monsoon In India, Arun Ravindranath, Naresh Devineni, Upmanu Lall, Paulina Concha Larrauri

##### Publications and Research

Water risk management is a ubiquitous challenge faced by stakeholders in the water or agricultural sector. We present a methodological framework for forecasting water storage requirements and present an application of this methodology to risk assessment in India. The application focused on forecasting crop water stress for potatoes grown during the monsoon season in the Satara district of Maharashtra. Pre-season large-scale climate predictors used to forecast water stress were selected based on an exhaustive search method that evaluates for highest ranked probability skill score and lowest root-mean-squared error in a leave-one-out cross-validation mode. Adaptive forecasts were made in the years ...

Aug 2018

#### Yelp’s Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels

##### SMU Data Science Review

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as recommended or non-recommended affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that ...

Aug 2018

#### Applications Of Game Theory, Tableau, Analytics, And R To Fashion Design, Aisha Asiri

##### Electronic Theses & Dissertations Collection for Atlanta University & Clark Atlanta University

This thesis presents various models to the fashion industry to predict the profits for some products. To determine the expected performance of each product in 2016, we used tools of game theory to help us identify the expected value. We went further and performed a simple linear regression and used scatter plots to help us predict further the performance of the products of Prada. We used tools of game theory, analytics, and statistics to help us predict the performance of some of Prada's products. We also used the Tableau platform to visualize an overview of the products' performances. All ...

Aug 2018

#### The Expected Number Of Patterns In A Random Generated Permutation On [N] = {1,2,...,N}, Evelyn Fokuoh

##### Electronic Theses and Dissertations

Previous work by Flaxman (2004) and Biers-Ariel et al. (2018) focused on the number of distinct words embedded in a string of words of length n. In this thesis, we extend this work to permutations, focusing on the maximum number of distinct permutations contained in a permutation on [n] = {1,2,...,n} and on the expected number of distinct permutations contained in a random permutation on [n]. We further consider the problem where repetitions of subsequences are a result of the occurrence of (Type A and/or Type B) replications. Our method of enumerating the Type A replications ...

Aug 2018

#### Distribution Of A Sum Of Random Variables When The Sample Size Is A Poisson Distribution, Mark Pfister

##### Electronic Theses and Dissertations

A probability distribution is a statistical function that describes the probability of possible outcomes in an experiment or occurrence. There are many different probability distributions that give the probability of an event happening, given some sample size n. An important question in statistics is to determine the distribution of the sum of independent random variables when the sample size n is fixed. For example, it is known that the sum of n independent Bernoulli random variables with success probability p has a Binomial distribution with parameters n and p. However, this is not true when the sample size is not ...
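To make the role of a random sample size concrete, here is a small simulation of one textbook case, offered independently of whatever results the truncated abstract goes on to state: when the number of Bernoulli(p) terms is itself Poisson with mean λ and independent of the terms, the sum is Poisson with mean λp (Poisson thinning), so its mean and variance should both be close to λp. The parameter values and the helper names `poisson` and `random_sum` are assumptions chosen for illustration.

```python
import math
import random
import statistics


def poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def random_sum(lam, p, rng):
    """Sum of N independent Bernoulli(p) variables, where N ~ Poisson(lam)."""
    n_terms = poisson(lam, rng)
    return sum(1 for _ in range(n_terms) if rng.random() < p)


rng = random.Random(3)  # fixed seed for reproducibility
lam, p = 4.0, 0.3       # illustrative values, not taken from the thesis
sums = [random_sum(lam, p, rng) for _ in range(100_000)]
sample_mean = statistics.fmean(sums)
sample_var = statistics.pvariance(sums)
print(f"mean {sample_mean:.3f}, variance {sample_var:.3f}, lam * p = {lam * p:.3f}")
```

A Binomial sum with fixed n would have variance np(1 - p), strictly below its mean. The near-equality of mean and variance here is what distinguishes the Poisson-sample-size case.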

Jul 2018

#### Excess Versions Of The Minkowski And Hölder Inequalities, Iosif Pinelis

##### Iosif Pinelis

No abstract provided.

Jul 2018

#### Pretrial Release And Failure-To-Appear In McLean County, IL, Jonathan Monsma

##### Stevenson Center for Community and Economic Development to Stevenson Center for Community and Economic Development—Student Research

Actuarial risk assessment tools increasingly have been employed in jurisdictions across the U.S. to assist courts in the decision of whether someone charged with a crime should be detained or released prior to their trial. These tools should be continually monitored and researched by independent third parties to ensure that they are being administered properly and used in the most proficient way so as to provide socially optimal results. McLean County, Illinois began using the Public Safety Assessment-Court™ (PSA-Court or simply PSA) risk assessment tool in 2016. This study culls data from the McLean County ...

#### The Statistical Exploration In The $G$-Expectation Framework: The Pseudo Simulation And Estimation Of Variance Uncertainty, Yifan Li

##### Electronic Thesis and Dissertation Repository

The $G$-expectation framework, motivated by problems with *uncertainty*, is a new generalization of the classical probability framework. Similar to the Choquet expectation, the $G$-expectation can be represented as the supremum of a class of linear expectations. In the past two decades, it has developed into a complete stochastic structure connected with a large family of nonlinear PDEs. Nonetheless, to apply it to real-world problems with uncertainty, it is fundamentally necessary to build up the associated statistical methodology.

This thesis explores the *computation, simulation, and estimation* of the $G$-normal distribution (a typical distribution with variance uncertainty ...

Jul 2018

#### On N/P-Asymptotic Distribution Of Vector Of Weighted Traces Of Powers Of Wishart Matrices, Jolanta Maria Pielaszkiewicz, Dietrich Von Rosen, Martin Singull

##### Electronic Journal of Linear Algebra

The joint distribution of standardized traces of $\frac{1}{n}XX'$ and of $\Big(\frac{1}{n}XX'\Big)^2$, where the matrix $X:p\times n$ follows a matrix normal distribution, is proved asymptotically to be multivariate normal under the condition $\frac{n}{p}\overset{n,p\rightarrow\infty}{\rightarrow}c>0$. The proof relies on calculations of asymptotic moments and cumulants obtained using a recursive formula derived in Pielaszkiewicz et al. (2015). The covariance matrix of the underlying vector is explicitly given as a function of $n$ and $p$.

Jul 2018

#### Mixed Logical And Probabilistic Reasoning In The Game Of Clue, Todd W. Neller, Ziqian Luo

##### Computer Science Faculty Publications

Neller and Ziqian Luo ’18 presented a means of mixed logical and probabilistic reasoning with knowledge in the popular deductive mystery game Clue. Using at-least constraints, we more efficiently represented and reasoned about cardinality constraints on Clue card deal knowledge, and then employed a WalkSAT-based solution sampling algorithm with a tabu search metaheuristic in order to estimate the probabilities of unknown card places.

May 2018

#### Deep Learning Analysis Of Limit Order Book, Xin Xu

##### Arts & Sciences Electronic Theses and Dissertations

In this paper, we build a deep neural network for modeling spatial structure in the limit order book and predict the future best ask or best bid price, based on the ideas of Sirignano (2016). We propose an intuitive data processing method to approximate the data that is not available to us, based only on level I data, which is more widely available. The model is based on the idea that there is local dependence between the best ask or best bid price and the sizes of related orders. First we use logistic regression to show that this approach is reasonable. To show the advantages ...

May 2018

#### Golden Arm: A Probabilistic Study Of Dice Control In Craps, Donald R. Smith, Robert Scott III

##### UNLV Gaming Research & Review Journal

This paper calculates how much control a craps shooter must possess on dice outcomes to eliminate the house advantage. A golden arm is someone who has dice control (or a rhythm roller or dice influencer). There are various strategies for dice control in craps. We discuss several possibilities of dice control that would result in several different mathematical models of control. We do not assert whether dice control is possible or not (there is a lack of published evidence). However, after studying casino-legal methods described by dice-control advocates, we can see only one realistic mathematical model that describes the resulting ...

May 2018

#### Evaluation Of Using The Bootstrap Procedure To Estimate The Population Variance, Nghia Trong Nguyen

##### Electronic Theses and Dissertations

The bootstrap procedure is widely used in nonparametric statistics to generate an empirical sampling distribution from a given sample data set for a statistic of interest. Generally, the results are good for location parameters such as population mean, median, and even for estimating a population correlation. However, the results for a population variance, which is a spread parameter, are not as good due to the resampling nature of the bootstrap method. Bootstrap samples are constructed using sampling with replacement; consequently, groups of observations with zero variance manifest in these samples. As a result, a bootstrap variance estimator will carry a ...
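The resampling effect described above can be seen in a short experiment. The sketch below is not the thesis's study design; the sample size, the normal data, and the helper name `bootstrap_variances` are assumptions for illustration. It relies on one standard fact: conditional on the observed sample of size n, the expected sample variance (divisor n - 1) of a bootstrap resample equals (n - 1)/n times the sample variance of the original data.

```python
import random
import statistics


def bootstrap_variances(sample, n_boot, rng):
    """Sample variance (divisor n - 1) of each of n_boot resamples drawn with replacement."""
    n = len(sample)
    return [statistics.variance(rng.choices(sample, k=n)) for _ in range(n_boot)]


rng = random.Random(1)  # fixed seed for reproducibility
sample = [rng.gauss(0.0, 1.0) for _ in range(10)]  # illustrative data, n = 10
boot = bootstrap_variances(sample, n_boot=20_000, rng=rng)

original_variance = statistics.variance(sample)
ratio = statistics.fmean(boot) / original_variance
print(f"mean bootstrap variance / sample variance = {ratio:.3f}")
print(f"(n - 1) / n = {(len(sample) - 1) / len(sample):.3f}")
```

For n = 10 the ratio should sit near 0.9, so the averaged bootstrap variances understate the sample variance by roughly 10% in this small-sample setting.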

Apr 2018

#### On Passing The Buck, Adam J. Hammett, Anna Joy Yang

##### The Research and Scholarship Symposium

Imagine there are n>1 people seated around a table, and person S starts with a fair coin they will flip to decide whom to hand the coin next -- if "heads" they pass right, and if "tails" they pass left. This process continues until all people at the table have "touched" the coin. Curiously, it turns out that all people seated at the table other than S have the same probability 1/(n-1) of being last to touch the coin. In fact, Lovasz and Winkler ("A note on the last new vertex visited by a random walk," J. Graph Theory ...
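The 1/(n-1) claim is easy to check empirically. The following sketch simulates the coin-passing process around a table and tallies who touches the coin last. The seat numbering (S is seat 0), the table size, the number of trials, and the function name `last_to_touch` are assumptions made for this example.

```python
import random
from collections import Counter


def last_to_touch(n, rng):
    """Simulate one round at a table of n seats; return the seat that touches the coin last."""
    position = 0          # seat 0 plays the role of person S
    touched = {0}
    while len(touched) < n:
        # Heads: pass right (+1); tails: pass left (-1); seats wrap around the table.
        position = (position + rng.choice((-1, 1))) % n
        touched.add(position)
    return position       # the loop ends exactly when a new seat is reached


rng = random.Random(2)    # fixed seed for reproducibility
n, trials = 6, 60_000
counts = Counter(last_to_touch(n, rng) for _ in range(trials))
freqs = {seat: counts[seat] / trials for seat in range(1, n)}
for seat, freq in freqs.items():
    print(f"seat {seat}: {freq:.4f} (claimed {1 / (n - 1):.4f})")
```

Each of the five non-starting seats should come out close to 0.2, including the two seats adjacent to S, which is the counterintuitive part of the result.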

Apr 2018

#### The Devil You Don’t Know: A Spatial Analysis Of Crime At Newark’s Prudential Center On Hockey Game Days, Justin Kurland, Eric Piza

##### Journal of Sport Safety and Security

Inspired by empirical research on spatial crime patterns in and around sports venues in the United Kingdom, this paper sought to measure the criminogenic extent of 216 hockey games that took place at the Prudential Center in Newark, NJ between 2007 and 2016. Do games generate patterns of crime in the areas beyond the arena, and if so, for what type of crime and how far? Police-recorded data for Newark are examined using a variety of exploratory methods and non-parametric permutation tests to visualize differences in crime patterns between game and non-game days across all of Newark and the downtown area. Change ...

#### Network Structure Sampling In Bayesian Networks Via Perfect Sampling From Linear Extensions, Evan Sidrow

##### Applied Mathematics Graduate Theses & Dissertations

Bayesian networks are widely considered as powerful tools for modeling risk assessment, uncertainty, and decision making. They have been extensively employed to develop decision support systems in a variety of domains including medical diagnosis, risk assessment and management, human cognition, industrial process and procurement, pavement and bridge management, and system reliability. Bayesian networks are convenient graphical expressions for high dimensional probability distributions which are used to represent complex relationships between a large number of random variables. A Bayesian network is a directed acyclic graph consisting of nodes which represent random variables and arrows which correspond to probabilistic dependencies between them ...

#### Score Test And Likelihood Ratio Test For Zero-Inflated Binomial Distribution And Geometric Distribution, Xiaogang Dai

##### Masters Theses & Specialist Projects

The main purpose of this thesis is to compare the performance of the score test and the likelihood ratio test by computing type I errors and type II errors when the tests are applied to the geometric distribution and the zero-inflated binomial distribution. We first derive test statistics of the score test and the likelihood ratio test for both distributions. We then use the software package R to perform a simulation to study the behavior of the two tests. We write R code to calculate the two types of error for each distribution. We create lots of samples to approximate ...

#### General Stochastic Integral And Itô Formula With Application To Stochastic Differential Equations And Mathematical Finance, Jiayu Zhai

##### LSU Doctoral Dissertations

A general stochastic integration theory for adapted and instantly independent stochastic processes arises when we consider anticipative stochastic differential equations. In Part I of this thesis, we conduct deeper research on the general stochastic integral introduced by W. Ayed and H.-H. Kuo in 2008. We provide a rigorous mathematical framework for the integral in Chapter 2, and prove that the integral is well-defined. Then a general Itô formula is given. In Chapter 3, we present an intrinsic property, the near-martingale property, of the general stochastic integral, and the Doob-Meyer decomposition for near-submartingales. We apply the new stochastic integration theory ...

Mar 2018

#### Advances In Semi-Nonparametric Density Estimation And Shrinkage Regression, Hossein Zareamoghaddam

##### Electronic Thesis and Dissertation Repository

This thesis advocates the use of shrinkage and penalty techniques for estimating the parameters of a regression model that comprises both parametric and nonparametric components and develops semi-nonparametric density estimation methodologies that are applicable in a regression context.

First, a moment-based approach whereby a univariate or bivariate density function is approximated by means of a suitable initial density function that is adjusted by a linear combination of orthogonal polynomials is introduced. Such adjustments are shown to be mathematically equivalent to making use of standard polynomials in one or two variables. Once extended to apply to density estimation, in which case ...

Jan 2018

#### Predicting The Next US President By Simulating The Electoral College, Boyan Kostadinov

##### Journal of Humanistic Mathematics

We develop a simulation model for predicting the outcome of the US Presidential election based on simulating the distribution of the Electoral College. The simulation model has two parts: (a) estimating the probabilities for a given candidate to win each state and DC, based on state polls, and (b) estimating the probability that a given candidate will win at least 270 electoral votes, and thus win the White House. All simulations are coded using the high-level, open-source programming language R. One of the goals of this paper is to promote computational thinking in any STEM field by illustrating how probabilistic ...
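Part (b) of such a model can be sketched in a few lines. The paper's own code is in R and uses poll-based state probabilities. The Python sketch below instead uses a hypothetical four-state toy map with made-up win probabilities, treats states as independent (an assumption of this example), and replaces the 270-vote threshold with a simple majority of the toy total. The names `toy_states` and `win_probability` are likewise illustrative.

```python
import random

# Hypothetical toy inputs: state -> (electoral votes, candidate's win probability).
toy_states = {"A": (10, 0.9), "B": (20, 0.5), "C": (15, 0.3), "D": (8, 0.6)}


def win_probability(states, n_sims, rng):
    """Monte Carlo estimate of the chance of winning a majority of electoral votes."""
    total_votes = sum(votes for votes, _ in states.values())
    needed = total_votes // 2 + 1  # simple majority; 270 of 538 on the real map
    wins = 0
    for _ in range(n_sims):
        # Each state is decided independently with its own win probability.
        won_votes = sum(votes for votes, p in states.values() if rng.random() < p)
        if won_votes >= needed:
            wins += 1
    return wins / n_sims


rng = random.Random(4)  # fixed seed for reproducibility
estimate = win_probability(toy_states, n_sims=100_000, rng=rng)
print(f"estimated probability of winning: {estimate:.3f}")
```

Swapping in 51 real entries with poll-derived probabilities would follow the same pattern, although correlated polling errors across states would break the independence assumption used here.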

Jan 2018

#### Sampling Techniques For Big Data Analysis In Finite Population Inference, Jae Kwang Kim, Zhonglei Wang

##### Statistics Preprints

In analyzing big data for finite population inference, it is critical to adjust for the selection bias in the big data. In this paper, we propose two methods of reducing the selection bias associated with the big data sample. The first method uses a version of inverse sampling by incorporating auxiliary information from external sources, and the second one borrows the idea of data integration by combining the big data sample with an independent probability sample. Two simulation studies show that the proposed methods are unbiased and have better coverage rates than their alternatives. In addition, the proposed ...