Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

11,011 Full-Text Articles 16,237 Authors 3,366,482 Downloads 226 Institutions

All Articles in Statistics and Probability

11,011 full-text articles. Page 2 of 320.

Efficient Class Of Estimators For Finite Population Mean Using Auxiliary Information In Two-Occasion Successive Sampling, G. N. Singh, Mohd Khalid 2019 Indian Institute of Technology (ISM)

Journal of Modern Applied Statistical Methods

In the case of sampling on two occasions, a class of estimators is considered which uses information on the first occasion as well as the second occasion in order to estimate the population mean on the current (second) occasion. The usefulness of auxiliary information in enhancing the efficiency of this estimation is examined through the class of proposed estimators. Some properties of the class of estimators and a strategy of optimum replacement are discussed. The proposed class of estimators was empirically compared with the sample mean estimator in the case of no matching. The established optimum estimator, which is a ...


Fixing Metric Fixation: A Review Of The Tyranny Of Metrics, Donald Roth 2019 Dordt College

Faculty Work Comprehensive List

"We should heed the author’s warning that transparent metrics and scorecards are rarely going to be effective substitutes for institutional trust."

Posting about the book The Tyranny of Metrics from In All Things - an online journal for critical reflection on faith, culture, art, and every ordinary-yet-graced square inch of God’s creation.

https://inallthings.org/fixing-metric-fixation-a-review-of-the-tyranny-of-metrics/


Tobacco Smoking And Dementia In A Kentucky Cohort: A Competing Risk Analysis, Erin L. Abner, Peter T. Nelson, Gregory A. Jicha, Gregory E. Cooper, David W. Fardo, Frederick A. Schmitt, Richard J. Kryscio 2019 University of Kentucky

Epidemiology Faculty Publications

Tobacco smoking was examined as a risk for dementia and neuropathological burden in 531 initially cognitively normal older adults followed longitudinally at the University of Kentucky’s Alzheimer’s Disease Center. The cohort was followed for an average of 11.5 years; 111 (20.9%) participants were diagnosed with dementia, while 242 (45.6%) died without dementia. At baseline, 49 (9.2%) participants reported current smoking (median pack-years = 47.3) and 231 (43.5%) former smoking (median pack-years = 24.5). The hazard ratio (HR) for dementia for former smokers versus never smokers based on the Cox model was 1.64 ...


Review Of Developing Quantitative Literacy Skills In History And The Social Sciences: A Web-Based Common Core Approach By Kathleen W. Craver, Victor J. Ricchezza, H L. Vacher 2019 University of South Florida

Victor Ricchezza

Kathleen W. Craver. Developing Quantitative Literacy Skills in History and Social Sciences: A Web-Based Common Core Standards Approach (Lanham, MD: Rowman & Littlefield Publishing Group, Inc., 2014). 191 pp.
ISBN 978-1-4758-1050-9 (cloth); ISBN …-1051-6 (pbk); ISBN …-1052-3 (electronic).

This book could be a breakthrough for teachers in the trenches who are interested in or need to know about quantitative literacy (QL). It is a resource providing 85 topical pieces, averaging 1.5 pages, in which a featured Web site is presented, described, and accompanied by 2-4 critical-thinking questions purposefully drawing on data from the Web site. The featured Web sites range from primary documents (e.g., All about California and the ...


Jmasm 51: Bayesian Reliability Analysis Of Binomial Model – Application To Success/Failure Data, M. Tanwir Akhtar, Athar Ali Khan 2019 Saudi Electronic University, Jeddah, Saudi Arabia

Journal of Modern Applied Statistical Methods

Reliability data are generated in the form of success/failure outcomes. An attempt was made to model such data using the binomial distribution in the Bayesian paradigm. Both analytic and simulation techniques were used to fit the Bayesian model. The Laplace approximation was implemented to approximate the posterior densities of the model parameters. Parallel simulation tools were implemented with extensive use of R and JAGS, and the R and JAGS code are developed and provided. Real data sets are used for the purpose of illustration.
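The article's R and JAGS code is not reproduced in this excerpt, but the Laplace step it describes can be sketched in a few lines. Under a flat prior on a binomial success probability, the posterior mode is the sample proportion and the curvature of the log-likelihood at the mode gives a normal approximation; the function name and the flat-prior choice below are illustrative assumptions, not the authors' implementation.

```python
import math

def laplace_binomial_posterior(successes, failures):
    """Laplace (normal) approximation to the posterior of a binomial
    success probability under a flat prior. With log-likelihood
    l(p) = s*log(p) + f*log(1-p), the posterior mode is s/n and the
    approximate variance is -1/l''(mode) = mode*(1-mode)/n.
    Assumes successes > 0 and failures > 0 (non-degenerate mode)."""
    n = successes + failures
    p_hat = successes / n
    sd = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, sd

# Example: 9 successes, 1 failure -> posterior roughly N(0.9, 0.095^2)
mode, sd = laplace_binomial_posterior(9, 1)
```

With larger samples the normal approximation tightens, which is why the abstract can rely on it alongside full simulation in JAGS.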


Data And Metrics: Do We Need Them? What Can They Tell Us? What Can't They?, Nathan L. Tintle 2019 Dordt College

Faculty Work Comprehensive List

"In our increasingly data-centric world, how do we think about data? How should we think about data?"

Posting about using data to make informed decisions from In All Things - an online journal for critical reflection on faith, culture, art, and every ordinary-yet-graced square inch of God’s creation.

https://inallthings.org/data-and-metrics-do-we-need-them-what-can-they-tell-us-what-cant-they/


A Random Forests Approach To Assess Determinants Of Central Bank Independence, Maddalena Cavicchioli, Angeliki Papana, Ariadni Papana Dagiasis, Barbara Pistoresi 2019 University of Verona

Journal of Modern Applied Statistical Methods

A non-parametric, efficient statistical method, Random Forests, is implemented for the selection of the determinants of Central Bank Independence (CBI) from a large database of economic, political, and institutional variables for OECD countries. It permits ranking all the determinants by their importance with respect to CBI and does not impose a priori assumptions about potential nonlinear relationships in the data. Collinearity issues are resolved, because correlated variables can be considered simultaneously.
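Fitting an actual Random Forest requires a library, but the importance ranking the abstract relies on can be illustrated with permutation importance on toy data: shuffle one feature and measure how much predictive accuracy drops. The data, the stand-in model, and all names below are assumptions for illustration, not the paper's setup.

```python
import random

def permutation_importance(predict, X, y, col, n_repeats=20, seed=0):
    """Importance of feature `col`: mean drop in accuracy when that
    column is randomly shuffled, breaking its link with the outcome."""
    rng = random.Random(seed)
    base = sum(predict(row) == t for row, t in zip(X, y)) / len(y)
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        Xp = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        acc = sum(predict(row) == t for row, t in zip(Xp, y)) / len(y)
        drops.append(base - acc)
    return sum(drops) / n_repeats

# Toy data: the outcome depends only on feature 0; feature 1 is noise.
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]
model = lambda row: int(row[0] > 0.5)   # stand-in for a fitted forest
imp0 = permutation_importance(model, X, y, 0)
imp1 = permutation_importance(model, X, y, 1)
```

Shuffling the informative feature destroys accuracy while shuffling the noise feature changes nothing, which is the sense in which the method "ranks" determinants.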


Maximum Likelihood Estimation For The Generalized Pareto Distribution And Goodness-Of-Fit Test With Censored Data, Minh H. Pham, Chris Tsokos, Bong-Jin Choi 2019 University of South Florida

Journal of Modern Applied Statistical Methods

The generalized Pareto distribution (GPD) is a flexible parametric model commonly used in financial modeling. Maximum likelihood estimation (MLE) of the GPD was proposed by Grimshaw (1993). Maximum likelihood estimation of the GPD for censored data is developed, and a goodness-of-fit test is constructed to verify the MLE algorithm and to support the model-validation step. The algorithms were implemented in R; Grimshaw’s algorithm outperforms the functions available in the R package ‘gPdtest’. A simulation study showed that the MLE method for censored data and the goodness-of-fit test are both reliable.
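As a hedged sketch of what a censored-data GPD likelihood looks like (this is not Grimshaw's algorithm, and the shape parameter is restricted to ξ ≠ 0 for simplicity), one can write the negative log-likelihood, letting fully observed values contribute density terms and right-censored values contribute survival terms, and minimize it over a crude grid:

```python
import math, random

def gpd_nll(sigma, xi, exact, censored):
    """Negative log-likelihood of the generalized Pareto distribution
    (scale sigma > 0, shape xi != 0). Observed exceedances contribute
    -log f(x); right-censored points contribute -log S(c)."""
    if sigma <= 0:
        return math.inf
    nll = 0.0
    for x in exact:                      # density terms
        z = 1 + xi * x / sigma
        if z <= 0:
            return math.inf
        nll += math.log(sigma) + (1 / xi + 1) * math.log(z)
    for c in censored:                   # survival terms
        z = 1 + xi * c / sigma
        if z <= 0:
            return math.inf
        nll += (1 / xi) * math.log(z)
    return nll

# Simulate GPD(sigma=1, xi=0.5) by inverse transform and censor at 2.
rng = random.Random(0)
raw = [((1 - rng.random()) ** -0.5 - 1) / 0.5 for _ in range(200)]
exact = [x for x in raw if x < 2]
censored = [2.0] * sum(x >= 2 for x in raw)

# Crude grid search as a stand-in for a proper optimizer.
grid = [(0.2 * i, 0.1 * j) for i in range(1, 16) for j in range(1, 16)]
sigma_hat, xi_hat = min(grid, key=lambda p: gpd_nll(p[0], p[1], exact, censored))
```

A real implementation would use a constrained optimizer and handle the ξ → 0 (exponential) limit; the grid here only shows where the censored terms enter the likelihood.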


Optimal Conditional Expectation At The Video Poker Game Jacks Or Better, Stewart N. Ethier, John J. Kim, Jiyeon Lee 2019 University of Utah

UNLV Gaming Research & Review Journal

There are 134,459 distinct initial hands at the video poker game Jacks or Better, taking suit exchangeability into account. A computer program can determine the optimal strategy (i.e., which cards to hold) for each such hand, but a complete list of these strategies would require a book-length manuscript. Instead, a hand-rank table, which fits on a single page and reproduces the optimal strategy perfectly, was found for Jacks or Better as early as the mid 1990s. Is there a systematic way to derive such a hand-rank table? We show that there is indeed, and it involves finding the ...
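The count of 134,459 suit-equivalence classes can be checked independently with Burnside's lemma: average, over the 24 suit permutations, the number of 5-card subsets each permutation fixes. This sketch is an illustration of that count, not the authors' method for deriving strategy tables.

```python
# Burnside's lemma over the symmetric group S4 acting on the suits.
# A 5-card hand is fixed by a suit permutation iff it is a union of
# card-orbits; each of the 13 ranks contributes one orbit per cycle,
# of size equal to that cycle's length.

def fixed_hands(cycle_type):
    """Count 5-card subsets invariant under a suit permutation with
    the given cycle type (a partition of 4)."""
    orbits = []
    for length in cycle_type:
        orbits.extend([length] * 13)        # 13 ranks per cycle
    # 0/1 knapsack: number of orbit subsets whose sizes sum to 5
    ways = [1] + [0] * 5
    for size in orbits:
        for total in range(5, size - 1, -1):
            ways[total] += ways[total - size]
    return ways[5]

# Conjugacy classes of S4: cycle type -> number of such permutations
classes = {(1, 1, 1, 1): 1, (2, 1, 1): 6, (2, 2): 3, (3, 1): 8, (4,): 6}
total = sum(count * fixed_hands(ct) for ct, count in classes.items())
num_classes = total // 24
print(num_classes)  # 134459, matching the figure in the abstract
```

The identity contributes C(52,5) = 2,598,960; double transpositions and 4-cycles fix nothing (5 cards cannot be partitioned into orbits of even size), and the average comes out to exactly 134,459.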


A Simulation Study Of Diagnostics For Bias In Non-Probability Samples, Philip S. Boonstra, Roderick JA Little, Brady T. West, Rebecca R. Andridge, Fernanda Alvarado-Leiton 2019 University Of Michigan

The University of Michigan Department of Biostatistics Working Paper Series

A non-probability sampling mechanism is likely to bias estimates of parameters with respect to a target population of interest. This bias poses a unique challenge when selection is 'non-ignorable', i.e. dependent upon the unobserved outcome of interest, since it is then undetectable and thus cannot be ameliorated. We extend a simulation study by Nishimura et al. [International Statistical Review, 84, 43--62 (2016)], adding a recently published statistic, the so-called 'standardized measure of unadjusted bias', which explicitly quantifies the extent of bias under the assumption that a specified amount of non-ignorable selection exists. Our findings suggest that this new sensitivity ...


Inferring A Consensus Problem List Using Penalized Multistage Models For Ordered Data, Philip S. Boonstra, John C. Krauss 2019 The University Of Michigan

The University of Michigan Department of Biostatistics Working Paper Series

A patient's medical problem list describes his or her current health status and aids in the coordination and transfer of care between providers, among other things. Because a problem list is generated once and then subsequently modified or updated, what is not usually observable is the provider-effect. That is, to what extent does a patient's problem in the electronic medical record actually reflect a consensus communication of that patient's current health status? To that end, we report on and analyze a unique interview-based design in which multiple medical providers independently generate problem lists for each of three ...


Surprise Vs. Probability As A Metric For Proof, Edward K. Cheng, Matthew Ginther 2019 U.S. Court of Federal Claims

Edward Cheng

In this Symposium issue celebrating his career, Professor Michael Risinger in Leveraging Surprise proposes using "the fundamental emotion of surprise" as a way of measuring belief for purposes of legal proof. More specifically, Professor Risinger argues that we should not conceive of the burden of proof in terms of probabilities such as 51%, 95%, or even "beyond a reasonable doubt." Rather, the legal system should reference the threshold using "words of estimative surprise," asking jurors how surprised they would be if the fact in question were not true. Toward this goal (and being averse to cardinality), he suggests categories such ...


The Validity Of Online Patient Ratings Of Physicians, Jennifer L. Priestley, Yiyun Zhou, Robert McGrath 2019 Kennesaw State University

Jennifer L. Priestley

Background: Information from ratings sites is increasingly informing patient decisions related to health care and the selection of physicians.

Objective: The current study sought to determine the validity of online patient ratings of physicians through comparison with physician peer review.

Methods: We extracted 223,715 reviews of 41,104 physicians from 10 of the largest cities in the United States, including 1142 physicians listed as “America’s Top Doctors” through physician peer review. Differences in mean online patient ratings were tested for physicians who were listed and those who were not.

Results: Overall, no differences were found between the online ...
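The core comparison here is between mean online ratings for listed versus unlisted physicians. The paper's exact test is not stated in this excerpt; as an illustration, the standard Welch's t statistic for two independent samples with unequal variances can be computed as:

```python
import math

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for comparing the
    means of two independent samples with unequal variances."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se2 = va / na + vb / nb                          # squared std. error
    t = (ma - mb) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical rating samples (illustrative values, not study data):
t, df = welch_t([4.2, 4.5, 4.8, 4.1], [4.3, 4.4, 4.6, 4.2])
```

A t statistic near zero, as the "no differences" result suggests, means the listed/unlisted gap is small relative to the sampling variability.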


Logistic Ensemble Models, Bob Vanderheyden, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

Predictive models that are developed in a regulated industry or for a regulated application, like the determination of creditworthiness, must be interpretable and “rational” (e.g., improvements in basic credit behavior must result in improved creditworthiness scores). Machine learning technologies provide very good performance with minimal analyst intervention, so they are well suited to a high-volume analytic environment, but the majority are “black box” tools that provide very limited insight or interpretability into the key drivers of model performance or predicted model output values. This paper presents a methodology that blends one of the most popular predictive statistical modeling methods ...


Influence Of The Event Rate On Discrimination Abilities Of Bankruptcy Prediction Models, Lili Zhang, Jennifer Priestley, Xuelei Ni 2019 Kennesaw State University

Jennifer L. Priestley

In bankruptcy prediction, the proportion of events is very low, so the event class is often oversampled to mitigate this imbalance. In this paper, we study the influence of the event rate on the discrimination abilities of bankruptcy prediction models. First, the statistical association and significance of public-records and firmographics indicators with bankruptcy were explored. Then the event rate was oversampled from 0.12% to 10%, 20%, 30%, 40%, and 50%, respectively. Seven models were developed, including Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine, Bayesian Network, and Neural Network. Under different event rates, models were comprehensively evaluated and ...
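The oversampling step described above, raising the event rate from its raw level toward a target such as 50%, can be sketched as follows. The function name and the replication-with-replacement scheme are illustrative assumptions; the paper does not specify its exact resampling mechanics in this excerpt.

```python
import random

def oversample(records, labels, target_rate, seed=0):
    """Duplicate minority-class (label 1) records, sampling with
    replacement, until events make up target_rate of the data.
    Assumes target_rate is above the raw event rate."""
    rng = random.Random(seed)
    events = [r for r, l in zip(records, labels) if l == 1]
    nonevents = [r for r, l in zip(records, labels) if l == 0]
    # solve n_events / (n_events + n_nonevents) = target_rate
    needed = round(target_rate * len(nonevents) / (1 - target_rate))
    extra = [rng.choice(events) for _ in range(needed - len(events))]
    new_records = nonevents + events + extra
    new_labels = [0] * len(nonevents) + [1] * (len(events) + len(extra))
    return new_records, new_labels
```

Note that oversampling changes the model's baseline probability, so predicted probabilities must be recalibrated back to the true event rate before use.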


Application Of Isotonic Regression In Predicting Business Risk Scores, Linh T. Le, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

An isotonic regression model fits an isotonic function of the explanatory variables to estimate the expectation of the response variable. In other words, as the function increases, the estimated expectation of the response must be non-decreasing. With this characteristic, isotonic regression could be a suitable option for analyzing and predicting business risk scores. A current challenge of isotonic regression is the decrease in performance when the model is fitted to a large data set, e.g., one with more than four or five dimensions. This paper attempts to apply isotonic regression models to the prediction of business risk scores using a large data ...
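One-dimensional isotonic regression is classically fitted with the Pool Adjacent Violators Algorithm (PAVA): scan the sequence and merge any adjacent blocks that violate monotonicity into their weighted mean. A minimal sketch of the 1-D case (not the paper's higher-dimensional setting):

```python
def pava(y, weights=None):
    """Pool Adjacent Violators: the non-decreasing sequence closest to
    y in weighted least squares, i.e. 1-D isotonic regression."""
    if weights is None:
        weights = [1.0] * len(y)
    blocks = []                      # each block: [level, weight, count]
    for value, w in zip(y, weights):
        blocks.append([value, w, 1])
        # merge backwards while monotonicity is violated
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, c2 = blocks.pop()
            v1, w1, c1 = blocks.pop()
            merged = (v1 * w1 + v2 * w2) / (w1 + w2)
            blocks.append([merged, w1 + w2, c1 + c2])
    fit = []
    for level, _, count in blocks:
        fit.extend([level] * count)
    return fit
```

For example, the decreasing run in [3, 1, 2] is pooled into a single level of 2, giving the fitted non-decreasing sequence [2, 2, 2].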


A Comparison Of Decision Tree With Logistic Regression Model For Prediction Of Worst Non-Financial Payment Status In Commercial Credit, Jessica M. Rudd MPH, GStat, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

Credit risk prediction is an important problem in the financial services domain. While machine learning techniques such as Support Vector Machines and Neural Networks have been used for improved predictive modeling, the outcomes of such models are not readily explainable and, therefore, difficult to apply within financial regulations. In contrast, Decision Trees are easy to explain and provide an easy-to-interpret visualization of model decisions. The aim of this paper is to predict worst non-financial payment status among businesses and to evaluate decision tree model performance against a traditional Logistic Regression model for this task. The dataset for analysis is provided ...


Binary Classification On Past Due Of Service Accounts Using Logistic Regression And Decision Tree, Yan Wang, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

This paper aims at predicting businesses’ past-due status in service accounts as well as determining the variables that impact the likelihood of repayment. Two binary classification approaches, logistic regression and the decision tree, were conducted and compared. Both approaches perform very well with respect to accuracy. However, the decision tree uses only 10 predictors and reaches an accuracy of 96.69% on the validation set, while logistic regression includes 14 predictors and reaches an accuracy of 94.58%. Due to the large concern over false negatives in the financial industry, the decision tree technique is a better option than ...
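Given the abstract's emphasis on both accuracy and false negatives, the relevant confusion-matrix summaries can be computed as below; the function name and label convention (1 = past due) are illustrative:

```python
def confusion_summary(actual, predicted):
    """Accuracy and false-negative rate for binary labels, where a
    false negative is a true past-due account predicted as current."""
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    accuracy = (tp + tn) / len(actual)
    fnr = fn / (tp + fn) if (tp + fn) else 0.0
    return accuracy, fnr
```

Two models with similar accuracy can have very different false-negative rates, which is why the comparison above does not rest on accuracy alone.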


A Comparison Of Machine Learning Algorithms For Prediction Of Past Due Service In Commercial Credit, Liyuan Liu M.A, M.S., Jennifer Lewis Priestley Ph.D. 2019 Analytics and Data Science

Jennifer L. Priestley

Credit risk modeling has attracted a variety of research interest in previous literature, and recent studies have shown that machine learning methods achieve better performance than conventional statistical ones. This study applies a decision tree, a robust advanced credit risk model, to predict the commercial non-financial past-due problem with better critical power and accuracy. In addition, we examine performance with logistic regression analysis, decision trees, and neural networks. The experimental results confirm that decision trees improve upon the other methods. We also find some interesting factors that impact commercial non-financial past-due payments.


A Comparison Of Machine Learning Techniques And Logistic Regression Method For The Prediction Of Past-Due Amount, Jie Hao, Jennifer L. Priestley 2019 Kennesaw State University

Jennifer L. Priestley

The aim of this paper is to predict a past-due amount using traditional and machine learning techniques: Logistic Regression, k-Nearest Neighbors, and Random Forest. The dataset to be analyzed is provided by Equifax and contains 305 categories of financial information from more than 11,787,287 unique businesses from 2006 to 2014. The big challenge is how to handle big, noisy real-world datasets. Among the three techniques, the results show that the Logistic Regression method is the best in terms of predictive accuracy and type I errors.
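Of the three techniques compared, k-Nearest Neighbors is simple enough to sketch in full. This minimal classifier (Euclidean distance, majority vote) is an illustration of the method, not the paper's tuned implementation, and all names and data below are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among the k nearest training
    points under Euclidean distance."""
    dists = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Tiny illustrative training set: two well-separated clusters.
train_X = [[0, 0], [0, 1], [5, 5], [6, 5]]
train_y = [0, 0, 1, 1]
```

On large, noisy datasets like the one described, kNN's cost of scanning all training points per query and its sensitivity to feature scaling are exactly the kinds of issues that can leave it behind a well-calibrated logistic regression.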


Digital Commons powered by bepress