Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

10,922 Full-Text Articles 16,084 Authors 2,919,236 Downloads 222 Institutions

All Articles in Statistics and Probability

10,922 full-text articles. Page 1 of 316.

Optimal Conditional Expectation At The Video Poker Game Jacks Or Better, Stewart N. Ethier, John J. Kim, Jiyeon Lee 2019 University of Utah

UNLV Gaming Research & Review Journal

There are 134,459 distinct initial hands at the video poker game Jacks or Better, taking suit exchangeability into account. A computer program can determine the optimal strategy (i.e., which cards to hold) for each such hand, but a complete list of these strategies would require a book-length manuscript. Instead, a hand-rank table, which fits on a single page and reproduces the optimal strategy perfectly, was found for Jacks or Better as early as the mid 1990s. Is there a systematic way to derive such a hand-rank table? We show that there is indeed, and it involves finding the ...
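The figure of 134,459 can be checked independently with Burnside's lemma, which is not necessarily how the authors obtained it. The sketch below assumes a standard 52-card deck (13 ranks, 4 suits) and counts 5-card hands up to relabeling of suits; the function and constant names are illustrative.

```python
from math import comb, factorial

# Cycle types of the 24 permutations of four suits, mapped to how many
# permutations share each type.
SUIT_CYCLE_TYPES = {
    (1, 1, 1, 1): 1,  # identity
    (2, 1, 1): 6,     # swap two suits
    (2, 2): 3,        # two disjoint swaps
    (3, 1): 8,        # rotate three suits
    (4,): 6,          # rotate all four suits
}


def fixed_hands(cycle_type, ranks=13, hand_size=5):
    """Count hands left unchanged by a suit permutation of this cycle type.

    A suit cycle of length L splits its cards into `ranks` orbits of size L.
    A hand is unchanged exactly when it is a union of whole orbits, so we
    count subsets of orbits whose sizes add up to hand_size.
    """
    ways = [1] + [0] * hand_size
    for length in cycle_type:
        for _ in range(ranks):
            # Iterate downwards so each orbit is used at most once.
            for k in range(hand_size, length - 1, -1):
                ways[k] += ways[k - length]
    return ways[hand_size]


def distinct_initial_hands():
    """Burnside's lemma: average the fixed-hand counts over all 24 permutations."""
    total = sum(n * fixed_hands(ct) for ct, n in SUIT_CYCLE_TYPES.items())
    return total // factorial(4)


# Sanity check: the identity fixes every 5-card hand.
assert fixed_hands((1, 1, 1, 1)) == comb(52, 5)
print(distinct_initial_hands())
```

Under these assumptions the count agrees with the 134,459 quoted in the abstract.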


A Simulation Study Of Diagnostics For Bias In Non-Probability Samples, Philip S. Boonstra, Roderick JA Little, Brady T. West, Rebecca R. Andridge, Fernanda Alvarado-Leiton 2019 University Of Michigan

The University of Michigan Department of Biostatistics Working Paper Series

A non-probability sampling mechanism is likely to bias estimates of parameters with respect to a target population of interest. This bias poses a unique challenge when selection is 'non-ignorable', i.e. dependent upon the unobserved outcome of interest, since it is then undetectable and thus cannot be ameliorated. We extend a simulation study by Nishimura et al. [International Statistical Review, 84, 43--62 (2016)], adding a recently published statistic, the so-called 'standardized measure of unadjusted bias', which explicitly quantifies the extent of bias under the assumption that a specified amount of non-ignorable selection exists. Our findings suggest that this new sensitivity ...


Inferring A Consensus Problem List Using Penalized Multistage Models For Ordered Data, Philip S. Boonstra, John C. Krauss 2019 The University Of Michigan

The University of Michigan Department of Biostatistics Working Paper Series

A patient's medical problem list describes his or her current health status and aids in the coordination and transfer of care between providers, among other things. Because a problem list is generated once and then subsequently modified or updated, what is not usually observable is the provider-effect. That is, to what extent does a patient's problem in the electronic medical record actually reflect a consensus communication of that patient's current health status? To that end, we report on and analyze a unique interview-based design in which multiple medical providers independently generate problem lists for each of three ...


Surprise Vs. Probability As A Metric For Proof, Edward K. Cheng, Matthew Ginther 2019 U.S. Court of Federal Claims

Edward Cheng

In this Symposium issue celebrating his career, Professor Michael Risinger in Leveraging Surprise proposes using "the fundamental emotion of surprise" as a way of measuring belief for purposes of legal proof. More specifically, Professor Risinger argues that we should not conceive of the burden of proof in terms of probabilities such as 51%, 95%, or even "beyond a reasonable doubt." Rather, the legal system should reference the threshold using "words of estimative surprise," asking jurors how surprised they would be if the fact in question were not true. Toward this goal (and being averse to cardinality), he suggests categories such ...


The Validity Of Online Patient Ratings Of Physicians, Jennifer L. Priestley, Yiyun Zhou, Robert McGrath 2019 Kennesaw State University

Jennifer L. Priestley

Background: Information from ratings sites is increasingly informing patient decisions related to health care and the selection of physicians.

Objective: The current study sought to determine the validity of online patient ratings of physicians through comparison with physician peer review.

Methods: We extracted 223,715 reviews of 41,104 physicians from 10 of the largest cities in the United States, including 1142 physicians listed as “America’s Top Doctors” through physician peer review. Differences in mean online patient ratings were tested for physicians who were listed and those who were not.

Results: Overall, no differences were found between the online ...


Logistic Ensemble Models, Bob Vanderheyden, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

Predictive models that are developed in a regulated industry or a regulated application, such as determination of creditworthiness, must be interpretable and “rational” (e.g., improvements in basic credit behavior must result in improved creditworthiness scores). Machine learning technologies provide very good performance with minimal analyst intervention, so they are well suited to a high-volume analytic environment, but the majority are “black box” tools that provide very limited insight or interpretability into key drivers of model performance or predicted model output values. This paper presents a methodology that blends one of the most popular predictive statistical modeling methods ...


Influence Of The Event Rate On Discrimination Abilities Of Bankruptcy Prediction Models, Lili Zhang, Jennifer Priestley, Xuelei Ni 2019 Kennesaw State University

Jennifer L. Priestley

In bankruptcy prediction, the proportion of events is very low, so events are often oversampled to offset this imbalance. In this paper, we study the influence of the event rate on discrimination abilities of bankruptcy prediction models. First, the statistical association and significance of public records and firmographics indicators with bankruptcy were explored. Then the event rate was raised by oversampling from 0.12% to 10%, 20%, 30%, 40%, and 50%, respectively. Seven models were developed, including Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine, Bayesian Network, and Neural Network. Under different event rates, models were comprehensively evaluated and ...
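As a rough illustration of raising an event rate by oversampling, the sketch below resamples event rows with replacement until they make up roughly a target share of the data. It assumes binary labels (1 for an event, 0 otherwise) and simple random duplication; the abstract does not say which oversampling scheme the authors used, and the function name is hypothetical.

```python
import random


def oversample_events(labels, target_rate, seed=0):
    """Return row indices in which events make up about target_rate of rows.

    Assumes labels are 0/1 and that plain resampling with replacement is
    acceptable; this is an illustrative scheme, not the paper's procedure.
    """
    if not 0 < target_rate < 1:
        raise ValueError("target_rate must lie strictly between 0 and 1")
    events = [i for i, y in enumerate(labels) if y == 1]
    non_events = [i for i, y in enumerate(labels) if y == 0]
    if not events:
        raise ValueError("no events to oversample")
    # Solve n1 / (n1 + n0) = r for the number of event rows n1.
    needed = round(target_rate * len(non_events) / (1 - target_rate))
    rng = random.Random(seed)
    extra = [rng.choice(events) for _ in range(max(0, needed - len(events)))]
    return non_events + events + extra


# 12 events in 10,000 rows gives the 0.12% base rate quoted in the abstract.
labels = [1] * 12 + [0] * 9988
for rate in (0.10, 0.20, 0.30, 0.40, 0.50):
    idx = oversample_events(labels, rate)
    achieved = sum(labels[i] for i in idx) / len(idx)
    print(f"target {rate:.0%} -> achieved {achieved:.2%}")
```

Duplicating events changes the class prior, so predicted probabilities from a model trained on such data generally need recalibration before they are read as real-world rates.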


Application Of Isotonic Regression In Predicting Business Risk Scores, Linh T. Le, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

An isotonic regression model fits an isotonic function of the explanatory variables to estimate the expectation of the response variable. In other words, as the function increases, the estimated expectation of the response must be non-decreasing. With this characteristic, isotonic regression could be a suitable option to analyze and predict business risk scores. A current challenge of isotonic regression is the decrease in performance when the model is fitted to a large data set, e.g. one with more than four or five dimensions. This paper attempts to apply isotonic regression models to the prediction of business risk scores using a large data ...
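For a single explanatory variable, a non-decreasing least-squares fit can be computed with the classic pool-adjacent-violators algorithm. The sketch below covers only that one-dimensional case, with responses already sorted by the predictor; it is not the multi-dimensional method the paper addresses, and the function name is illustrative.

```python
def isotonic_fit(y, weights=None):
    """Pool-adjacent-violators fit: non-decreasing least-squares values for y.

    Assumes y is ordered by the (single) explanatory variable.
    """
    if weights is None:
        weights = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for value, w in zip(y, weights):
        blocks.append([float(value), float(w), 1])
        # Merge backwards while the last two blocks break monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, n in blocks:
        fitted.extend([mean] * n)
    return fitted


print(isotonic_fit([1, 3, 2, 4]))  # the violating pair (3, 2) is pooled to 2.5
```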


A Comparison Of Decision Tree With Logistic Regression Model For Prediction Of Worst Non-Financial Payment Status In Commercial Credit, Jessica M. Rudd MPH, GStat, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

Credit risk prediction is an important problem in the financial services domain. While machine learning techniques such as Support Vector Machines and Neural Networks have been used for improved predictive modeling, the outcomes of such models are not readily explainable and, therefore, difficult to apply within financial regulations. In contrast, Decision Trees are easy to explain and provide an easy-to-interpret visualization of model decisions. The aim of this paper is to predict worst non-financial payment status among businesses, and to evaluate decision tree model performance against a traditional Logistic Regression model for this task. The dataset for analysis is provided ...


Binary Classification On Past Due Of Service Accounts Using Logistic Regression And Decision Tree, Yan Wang, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

This paper aims to predict businesses’ past-due status in service accounts and to determine the variables that affect the likelihood of repayment. Two binary classification approaches, logistic regression and the decision tree, were applied and compared. Both approaches perform very well with respect to accuracy. However, the decision tree uses only 10 predictors and reaches an accuracy of 96.69% on the validation set, while logistic regression includes 14 predictors and reaches an accuracy of 94.58%. Due to the large concern of false negatives in the financial industry, the decision tree technique is a better option than ...


A Comparison Of Machine Learning Algorithms For Prediction Of Past Due Service In Commercial Credit, Liyuan Liu M.A, M.S., Jennifer Lewis Priestley Ph.D. 2019 Analytics and Data Science

Jennifer L. Priestley

Credit risk modeling has attracted a variety of research interest in previous literature, and recent studies have shown that machine learning methods achieve better performance than conventional statistical ones. This study applies a decision tree, a robust advanced credit risk model, to predict the commercial non-financial past-due problem with better critical power and accuracy. In addition, we compare the performance of logistic regression analysis, decision trees, and neural networks. The experimental results confirm that decision trees improve upon the other methods. We also find some interesting factors that affect commercial non-financial past-due payment.


A Comparison Of Machine Learning Techniques And Logistic Regression Method For The Prediction Of Past-Due Amount, Jie Hao, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

The aim of this paper is to predict a past-due amount using traditional and machine learning techniques: Logistic Analysis, k-Nearest Neighbor, and Random Forest. The dataset analyzed is provided by Equifax and contains 305 categories of financial information from more than 11,787,287 unique businesses from 2006 to 2014. The main challenge is how to handle big, noisy real-world datasets. Among the three techniques, the results show that the Logistic Regression method is the best in terms of predictive accuracy and type I errors.


An Analysis Of Accuracy Using Logistic Regression And Time Series, Edwin Baidoo, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

This paper analyzes the accuracy rates for logistic regression and time series models. It also examines a relatively new performance index that takes into consideration the business assumptions of credit markets. Although prior research has focused on evaluation metrics, such as AUC and Gini index, this new measure has a more intuitive interpretation for various managers and decision makers and can be applied to both Logistic and Time Series models.


Bayesian Approximation Techniques For Scale Parameter Of Laplace Distribution, Uzma Jan, S. P. Ahmad 2019 University of Kashmir, Srinagar

Journal of Modern Applied Statistical Methods

The Bayesian estimation of the scale parameter of a Laplace distribution is obtained using two approximation techniques, namely the Normal approximation and the Tierney and Kadane (T-K) approximation, under different informative priors.


Can One Test Fit All? Responses To The Article “Striving For Simple But Effective Advice For Comparing The Central Tendency Of Two Populations” (Ruxton & Neuhäuser, 2018), Diep Nguyen, Eun Sook Kim, Yi-Hsin Chen 2019 University of South Florida

Journal of Modern Applied Statistical Methods

Responses to suggestions made by Ruxton & Neuhäuser (2018) regarding Nguyen et al. (2016) are given.


Inferring Processes Of Coevolutionary Diversification In A Community Of Panamanian Strangler Figs And Associated Pollinating Wasps, Jordan D. Satler, Edward Allen Herre, K. Charlotte Jandér, Deren A. R. Eaton, Carlos A. Machado, Tracy A. Heath, John D. Nason 2019 Iowa State University

Tracy Heath

The fig and pollinator wasp obligate mutualism is diverse (~750 described species), ecologically important, and ancient (~80-90 Ma), providing model systems for generating and testing many questions in evolution and ecology. Once thought to be a prime example of strict one-to-one cospeciation, current thinking suggests that genera of pollinator wasps coevolve with corresponding subsections of figs, but the degree to which cospeciation or other processes contribute to the association at finer scales is unclear. Here we use genome-wide sequence data from a community of Panamanian strangler figs (Ficus subgenus Urostigma, section Americana) and associated fig wasp pollinators (Pegoscapus spp.) to ...


Mixtures Of Self-Modelling Regressions, Rhonda D. Szczesniak, Kert Viele, Robin L. Cooper 2019 Cincinnati Children's Hospital Medical Center

Robin L. Cooper

A shape invariant model for functions f_1,...,f_n specifies that each individual function f_i can be related to a common shape function g through the relation f_i(x) = a_i g(c_i x + d_i) + b_i. We consider a flexible mixture model that allows multiple shape functions g_1,...,g_K, where each f_i is a shape invariant transformation of one of those g_k. We derive an MCMC algorithm for fitting the model using Bayesian Adaptive Regression Splines (BARS), propose a strategy to improve its mixing properties and utilize existing model selection ...
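To make the shape-invariant relation concrete, the sketch below builds an individual curve from a shared shape function by a vertical scale a, a vertical shift b, a horizontal scale c, and a horizontal shift d. The choice of sin as the shape function and the parameter values are assumptions for illustration only; the paper estimates the shape functions rather than fixing them.

```python
import math


def shape_invariant(g, a, b, c, d):
    """Return the function f with f(x) = a * g(c * x + d) + b."""
    def f(x):
        return a * g(c * x + d) + b
    return f


# Hypothetical common shape function and per-curve parameters.
f1 = shape_invariant(math.sin, a=2.0, b=1.0, c=0.5, d=0.0)
f2 = shape_invariant(math.sin, a=0.5, b=-1.0, c=1.0, d=math.pi / 2)

for x in (0.0, math.pi / 2, math.pi):
    print(x, f1(x), f2(x))
```

Both curves share one shape and differ only in their four parameters, which is the structure the mixture model extends by allowing several shape functions.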


On The Conditional And Unconditional Type I Error Rates And Power Of Tests In Linear Models With Heteroscedastic Errors, Patrick J. Rosopa, Alice M. Brawley, Theresa P. Atkinson, Stephen A. Robertson 2019 Clemson University

Journal of Modern Applied Statistical Methods

Preliminary tests for homoscedasticity may be unnecessary in general linear models. Based on Monte Carlo simulations, results suggest that when testing for differences between independent slopes, the unconditional use of weighted least squares regression and HC4 regression performed the best across a wide range of conditions.


Φ-Divergence Loss-Based Artificial Neural Network, R. L. Salamwade, D. M. Sakate, S. K. Mathur 2019 Shivaji University, Kolhapur, India

Journal of Modern Applied Statistical Methods

Artificial Neural Networks (ANNs) can fit non-linear functions and recognize patterns better than several standard techniques. Performance of ANNs is measured by using loss functions. The phi-divergence estimator is a generalization of the maximum likelihood estimator and possesses all its properties. A neural network is proposed which is trained using phi-divergence loss.
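One common textbook form of a phi-divergence between discrete distributions p and q is the sum over i of q_i * phi(p_i / q_i), with phi convex and phi(1) = 0. The sketch below uses that form and shows that phi(t) = t log t - t + 1 recovers the Kullback-Leibler divergence, which is the link to maximum likelihood. The exact loss used in the paper may differ; all names here are illustrative, and every q_i is assumed positive.

```python
import math


def phi_divergence(p, q, phi):
    """Sum of q_i * phi(p_i / q_i); assumes every q_i is strictly positive."""
    return sum(qi * phi(pi / qi) for pi, qi in zip(p, q))


def kl_phi(t):
    """phi(t) = t log t - t + 1, with the limiting value 1 at t = 0."""
    return t * math.log(t) - t + 1 if t > 0 else 1.0


def kl_divergence(p, q):
    """Direct Kullback-Leibler divergence, for comparison."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]
q = [0.25, 0.75]
print(phi_divergence(p, q, kl_phi), kl_divergence(p, q))
```

Swapping in a different convex phi gives other members of the family without changing the surrounding code.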


A Robust Nonparametric Measure Of Effect Size Based On An Analog Of Cohen's D, Plus Inferences About The Median Of The Typical Difference, Rand Wilcox 2019 University of Southern California

Journal of Modern Applied Statistical Methods

The paper describes a nonparametric analog of Cohen's d, Q. It is established that a confidence interval for Q can be computed via a method for computing a confidence interval for the median of D = X1 - X2, which in turn is related to making inferences about P(X1 < X2).
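The two quantities named in the abstract can be estimated from two independent samples by looking at all cross-sample pairs. The sketch below, which takes D to be the difference X1 - X2, computes the sample proportion of pairs with X1 < X2 and the median of the pairwise differences. It does not compute the paper's Q or its confidence interval, and the function names are hypothetical.

```python
from statistics import median


def prob_less(x1, x2):
    """Proportion of cross-sample pairs (a, b) with a < b."""
    pairs = [(a, b) for a in x1 for b in x2]
    return sum(a < b for a, b in pairs) / len(pairs)


def median_difference(x1, x2):
    """Median of all pairwise differences a - b across the two samples."""
    return median(a - b for a in x1 for b in x2)


x1 = [1, 2, 3]
x2 = [2, 3, 4]
print(prob_less(x1, x2), median_difference(x1, x2))
```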

