Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

11,021 Full-Text Articles 16,264 Authors 3,366,482 Downloads 226 Institutions

All Articles in Statistics and Probability

Characterizing The Permanence And Stationary Distribution For A Family Of Malaria Stochastic Models, Divine Wanduku 2019 Virginia Commonwealth University

Biology and Medicine Through Mathematics Conference

No abstract provided.


Measuring Clinical Weight Loss In Young Children With Severe Obesity: Comparison Of Outcomes Using Zbmi, Modified Zbmi, And Percent Of 95th Percentile, Carolyn Bates 2019 Children's Mercy Kansas City

Research Days

No abstract provided.


Predictive Performance Of Existing Population Pharmacokinetic Models Of Tacrolimus In Pediatric Kidney Transplant Recipients, Alenka Chapron 2019 Children's Mercy Hospital, Kansas City, MO

Research Days

No abstract provided.


Quantifying Sleep Architecture For Pediatric Hypersomnia Conditions, Alicia K. Colclasure 2019 Colorado School of Mines

Biology and Medicine Through Mathematics Conference

No abstract provided.


Prospective Evaluation Of A Population Pharmacokinetic Model Of Pantoprazole For Obese Children, Alenka Chapron 2019 Children's Mercy Hospital, Kansas City, MO

Research Days

No abstract provided.


Characterizing The Tails Of Degree Distributions In Real-World Networks, Anna Broido 2019 University of Colorado, Boulder

Applied Mathematics Graduate Theses & Dissertations

This is a thesis about how to characterize the statistical structure of the tails of degree distributions of real-world networks. The primary contribution is a statistical test of the prevalence of scale-free structure in real-world networks. A central claim in modern network science is that real-world networks are typically "scale free," meaning that the fraction of nodes with degree $k$ follows a power law, decaying like $k^{-\alpha}$, often with $2 < \alpha < 3$. However, empirical evidence for this belief derives from a relatively small number of real-world networks. In the first section, we test the universality of scale-free structure by applying state-of-the-art statistical tools to a large corpus of nearly 1000 network data sets drawn from social, biological, technological, and informational sources. We fit the power-law model to each degree distribution, test its statistical plausibility, and compare it via a likelihood ratio test to alternative, non-scale-free models, e.g., the log-normal. Across domains, we find that scale-free networks are rare, with only 4% exhibiting the strongest-possible evidence of scale-free structure and 52% exhibiting the weakest-possible evidence. Furthermore, evidence of scale-free structure is not uniformly distributed across sources: social networks are at best weakly scale free, while a handful of technological and biological networks can be called strongly scale free. These results undermine the universality of scale-free networks and reveal that real-world networks exhibit a rich structural diversity that will likely require new ideas and mechanisms to explain.
A core methodological component of addressing the ubiquity of scale-free structure in real-world networks is an ability to fit a power law to the degree distribution. In the second section, we numerically evaluate and compare, using both synthetic data with known structure and real-world data with unknown structure, two statistically principled methods for estimating the tail parameters for power-law distributions, showing that in practice, a method based on extreme value theory and a sophisticated bootstrap and the more commonly used method based on an empirical minimization approach exhibit similar accuracy.
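
To make "fitting a power law to the tail" concrete, the sketch below applies the standard maximum-likelihood estimator for a continuous power law to synthetic data. It is a minimal illustration and not the thesis's procedure: the function names `sample_power_law` and `fit_alpha` are hypothetical, the lower cutoff `x_min` is assumed known rather than estimated, and degrees are treated as continuous values, which is only an approximation for integer degree data.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law p(x) ~ x^(-alpha), x >= x_min.

    Uses inverse-transform sampling; requires alpha > 1.
    """
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(xs, x_min):
    """Maximum-likelihood estimate of alpha for the tail x >= x_min.

    Returns (alpha_hat, n_tail). The cutoff x_min is assumed given here;
    choosing it from the data is a separate estimation problem.
    """
    tail = [x for x in xs if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum, len(tail)


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_power_law(50_000, alpha=2.5, x_min=1.0, rng=rng)
    alpha_hat, n_tail = fit_alpha(data, x_min=1.0)
    # Approximate standard error of the estimate: (alpha_hat - 1) / sqrt(n_tail)
    std_err = (alpha_hat - 1.0) / math.sqrt(n_tail)
    print(f"alpha_hat = {alpha_hat:.3f} +/- {std_err:.3f} (n_tail = {n_tail})")
```

On synthetic data with a known exponent, the estimate should land close to the true value, which is the kind of check the second section of the thesis describes performing for its two estimation methods.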


Time Series Analysis: Forecasting Treasury Bill Interest Rates, Nadine P. Innes 2019 Murray State University

Honors College Theses

A Treasury Bill is a short-term investment typically with a maturity date of 12 months or less that is backed by the Treasury Department of the United States government. Rates of return for Treasury Bills are constantly changing over time due to the constant change of demand from borrowers and supply from lenders. This study seeks to forecast treasury bill rates that mature in 3 months. Since actuaries employ their knowledge of mathematics and statistical methods to analyze the likelihood of future events and their possible financial repercussions, having a projection of future treasury bill rates can provide guidance to ...


Comparison Of Imputation Methods For Mixed Data Missing At Random, Kaitlyn Heidt 2019 East Tennessee State University

Electronic Theses and Dissertations

A statistician's job is to produce statistical models. When these models are precise and unbiased, we can relate them to new data appropriately. However, when data sets have missing values, assumptions to statistical methods are violated and produce biased results. The statistician's objective is to implement methods that produce unbiased and accurate results. Research in missing data is becoming popular as modern methods that produce unbiased and accurate results are emerging, such as MICE in R, a statistical software. Using real data, we compare four common imputation methods, in the MICE package in R, at different levels of ...


Feasibility Of Multi-Year Forecast For The Colorado River Water Supply: Time Series Modeling, Brian Plucinski 2019 Utah State University

All Graduate Plan B and other Reports

The Colorado River is one of the largest resources for water in the United States, as well as being an important asset to the economy. Previous studies have shown a connection between the Great Salt Lake and the Colorado River. This study used time series analysis to build models to predict the water supply of the Colorado River ten years out. These models used data from the Colorado River in addition to Great Salt Lake water elevation. Several models suggest a decline in water supply from 2013 – 2020, before starting to increase. These predictions differ from predictions published by a ...


Predictive Distributions Via Filtered Historical Simulation For Financial Risk Management, Tyson Clark 2019 Utah State University

All Graduate Plan B and other Reports

Filtered historical simulation with an underlying GARCH process can be used as a valuable tool in VaR analysis, as it derives risk estimates that are sensitive to the distributional properties of the historical data of the produced predictive density. I examine the applications to risk analysis that filtered historical simulation can provide, as well as an interpretation of the predictive density as a poor man’s Bayesian posterior distribution. The predictive density allows us to make associated probabilistic statements regarding the results for VaR analysis, giving greater measurement of risk and the ability to maintain the optimal level of risk ...
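
As a rough illustration of the idea, the following sketch runs a one-day filtered historical simulation on synthetic returns. It is not the author's implementation: the GARCH(1,1) parameters `omega`, `alpha`, and `beta` are assumed known instead of being estimated, and the helper names `simulate_garch`, `garch_filter`, and `fhs_var` are hypothetical.

```python
import math
import random


def simulate_garch(n, omega, alpha, beta, rng):
    """Generate n synthetic returns from a GARCH(1,1) process with normal shocks."""
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        var = omega + alpha * r * r + beta * var
    return returns


def garch_filter(returns, omega, alpha, beta):
    """Filter returns through GARCH(1,1).

    Returns the standardized residuals and the one-step-ahead variance forecast.
    """
    var = omega / (1.0 - alpha - beta)
    residuals = []
    for r in returns:
        residuals.append(r / math.sqrt(var))
        var = omega + alpha * r * r + beta * var
    return residuals, var


def fhs_var(returns, omega, alpha, beta, level=0.99, n_sims=10_000, rng=None):
    """One-day value at risk from filtered historical simulation.

    Resamples historical standardized residuals, rescales them by the
    forecast volatility, and reads off the lower-tail empirical quantile.
    """
    rng = rng or random.Random()
    residuals, next_var = garch_filter(returns, omega, alpha, beta)
    scale = math.sqrt(next_var)
    sims = sorted(scale * rng.choice(residuals) for _ in range(n_sims))
    return -sims[int((1.0 - level) * n_sims)]  # report the loss as a positive number


if __name__ == "__main__":
    params = dict(omega=1e-6, alpha=0.08, beta=0.90)  # assumed, not fitted
    rets = simulate_garch(2_000, rng=random.Random(1), **params)
    print("99% one-day VaR:", fhs_var(rets, rng=random.Random(2), **params))
```

The sorted list `sims` is itself an empirical predictive distribution for the next return, so other summaries (for example, expected shortfall) could be read from it in the same way.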


Deep Neural Network Architectures For Music Genre Classification, Kai Middlebrook, Shyam Sudhakaran, Kunal Sonar, David Guy Brizan 2019 University of San Francisco

Creative Activity and Research Day - CARD

With the recent advancements in technology, many tasks in fields such as computer vision, natural language processing, and signal processing have been solved using deep learning architectures. In the audio domain, these architectures have been used to learn musical features of songs to predict moods, genres, and instruments. In the case of genre classification, deep learning models were applied to popular datasets, which are explicitly chosen to represent their genres, and achieved state-of-the-art results. However, these results have not been reproduced on less refined datasets. To this end, we introduce an un-curated dataset which contains genre labels and 30-second audio previews for ...


A Self-Contained Course In The Mathematical Theory Of Statistics For Scientists & Engineers With An Emphasis On Predictive Regression Modeling & Financial Applications., Tim Smith 2019 Embry-Riddle Aeronautical University

Timothy Smith

Preface & Acknowledgments

This textbook is designed for a higher-level undergraduate, perhaps even first-year graduate, course for engineering or science students who are interested in learning to use data analysis to make predictive models. While no prerequisite knowledge of statistics is required to read this book, because the study is designed for the reader to truly understand the underlying theory rather than just learn how to read computer output, it would be best read with some familiarity with elementary statistics. The book is self-contained and the only true prerequisite is a solid understanding of university-level calculus, which any engineering or science student is expected to have mastered. The textbook is intended for an elective-type course; however, the foundations are laid here for further mathematical study, and this text could well serve as a transition for an interested student with little to no prior knowledge to then go on to study in the popular fields of data science, big data, or whatever the buzzwords of the day may call it. A natural next read would be something equivalent to the popular texts “An Introduction to Statistical Learning” or “The Elements of Statistical Learning,” by Hastie & Friedman et al.

The author is very grateful for the opportunity to have implemented and taught the MA ...


Combining Survey And Non-Survey Data For Improved Sub-Area Prediction Using A Multi-Level Model, Jae Kwang Kim, Zhonglei Wang, Zhengyuan Zhu, Nathan B. Cruze 2019 Iowa State University

Zhengyuan Zhu

Combining information from different sources is an important practical problem in survey sampling. Using a hierarchical area-level model, we establish a framework to integrate auxiliary information to improve state-level area estimates. The best predictors are obtained by the conditional expectations of latent variables given observations, and an estimate of the mean squared prediction error is discussed. Sponsored by the National Agricultural Statistics Service of the US Department of Agriculture, the proposed model is applied to the planted crop acreage estimation problem by combining information from three sources, including the June Area Survey obtained by a probability-based sampling of lands, administrative ...


The Andersen Likelihood Ratio Test With A Random Split Criterion Lacks Power, Georg Krammer 2019 University College of Teacher Education Styria

Journal of Modern Applied Statistical Methods

The Andersen LRT uses sample characteristics as split criteria to evaluate Rasch model fit, or theory driven hypothesis testing for a test. The power and Type I error of a random split criterion was evaluated with a simulation study. Results consistently show a random split criterion lacks power.


Weighted Version Of Generalized Inverse Weibull Distribution, Sofi Mudiasir, S. P. Ahmad 2019 University of Kashmir, Srinagar, India

Journal of Modern Applied Statistical Methods

Weighted distributions are used in many fields, such as medicine, ecology, and reliability. A weighted version of the generalized inverse Weibull distribution, known as weighted generalized inverse Weibull distribution (WGIWD), is proposed. Basic properties including mode, moments, moment generating function, skewness, kurtosis, and Shannon’s entropy are studied. The usefulness of the new model was demonstrated by applying it to a real-life data set. The WGIWD fits better than its submodels, such as length biased generalized inverse Weibull (LGIW), generalized inverse Weibull (GIW), inverse Weibull (IW) and inverse exponential (IE) distributions.
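
For readers unfamiliar with the construction, the general definition of a weighted distribution can be written as follows. The symbols $f$, $w$, and $f_w$ are notation assumed here for illustration, not taken from the paper, and the abstract does not state which weight function the WGIWD uses.

```latex
f_w(x) = \frac{w(x)\, f(x)}{\mathbb{E}[w(X)]},
\qquad
\mathbb{E}[w(X)] = \int_0^\infty w(x)\, f(x)\, dx < \infty .
```

Here $f$ is the baseline density and $w(x) \ge 0$ is the weight function. Taking $w(x) = x$ gives the length-biased case, $f_w(x) = x f(x) / \mathbb{E}[X]$.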


The Importance Of Geographic And Biological Variables In Predicting The Naturalization Of Non-Native Woody Plants In The Upper Midwest, Mark P. Widrlechner, Emily J. Kapler, Philip M. Dixon, Janette R. Thompson 2019 Iowa State University

Janette R. Thompson

The selection, introduction, and cultivation of non-native woody plants beyond their native ranges can have great benefits, but also unintended consequences. Among these consequences is the tendency for some species to naturalize and become invasive pests in new environments to which they were introduced. In lieu of lengthy and costly field trials, risk-assessment models can be used to predict the likelihood of naturalization. We compared the relative performance of five established risk-assessment models on species datasets from two previously untested areas: southern Minnesota and northern Missouri. Model classification rates ranged from 64.2 to 90.5%, biologically significant errors ranged ...


The Effectiveness Of A Single Regional Model In Predicting Non-Native Woody Plant Naturalization In Five Areas Within The Upper Midwest (United States), Philip M. Dixon, Janette R. Thompson, Mark P. Widrlechner, Emily J. Kapler 2019 Iowa State University

Janette R. Thompson

Numerous predictive models have been developed to determine the likelihood that non-native plants will escape from cultivation and potentially become invasive. Given the substantial biological and economic costs that can result from the introduction of a new invasive plant and the unending pressures of world trade and transport, the creation and implementation of effective predictive models are becoming increasingly important. One key question in the development of such models focuses on the geographic scope at which models can best be developed and applied. We have developed models to predict woody-plant naturalization in five local areas within the Upper Midwest (United ...


Calibration Of Measurements, Edward Kroc, Bruno D. Zumbo 2019 University of British Columbia

Journal of Modern Applied Statistical Methods

Traditional notions of measurement error typically rely on a strong mean-zero assumption on the expectation of the errors conditional on an unobservable “true score” (classical measurement error) or on the data themselves (Berkson measurement error). Weakly calibrated measurements for an unobservable true quantity are defined based on a weaker mean-zero assumption, giving rise to a measurement model of differential error. Applications show it retains many attractive features of estimation and inference when performing a naive data analysis (i.e. when performing an analysis on the error-prone measurements themselves), and other interesting properties not present in the classical or Berkson cases ...


Predicting Live Weight Of Rural African Goats Using Body Measurements, Josue Chinchilla-Vargas, M Jennifer Woodward-Greene, Curtis P. Van Tassell, Clet Wandui Masiga, Max F. Rothschild 2019 Iowa State University

Max Rothschild

The goal of the current study was to develop simple regression-based equations that allow small-scale producers to use simple body measurements to accurately predict live weight of typical African goats. The data used in this study were recorded in five African countries and comprised 814 individuals of 40 indigenous breeds or populations and crosses, including 158 males and 656 females. Records included the live weight measured with a hanging scale, linear body measurements, country, breed, owner, and age. Country, breed, age, chest girth, height at withers, body length, and shoulder width had large effects (p76 cm, the ...


Metajelo: A Metadata Package For Journals To Support External Linked Objects, Carl Lagoze, Lars Vilhuber 2019 University of Michigan

Labor Dynamics Institute

We propose a metadata package that is intended to provide academic journals with a lightweight means of registering, at the time of publication, the existence and disposition of supplementary materials. Information about the supplementary materials is, in most cases, critical for the reproducibility and replicability of scholarly results. In many instances, these materials are curated by a third party, which may or may not follow developing standards for the identification and description of those materials. As such, the vocabulary described here complements existing initiatives that specify vocabularies to describe the supplementary materials or the repositories and archives in which they ...

