Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

12,353 Full-Text Articles 19,062 Authors 3,366,482 Downloads 247 Institutions

All Articles in Statistics and Probability

First-Year Computer Science Students: Pathways And Perceptions In Introductory Computer Science Courses, Christina A. LeBlanc 2020 University of Maine

Electronic Theses and Dissertations

This study examined student perceptions and experiences of an introductory Computer Science course at the University of Maine, COS 125: Introduction to Problem Solving Using Computer Programs. It also explored the pathways that students pursue after taking COS 125, depending on their success in the course, and their motivation to persist. Through characterizing student populations and their performance in their first semester in the Computer Science program, they can be placed into one of three categories that explain their path: a “continuer” (passed COS 125 and decided to stay in the major), a “persister” (did not pass COS 125 and ...


Modeling Species Distribution And Habitat Suitability Of American Ginseng (Panax Quinquefolius) In Virginia, Jacob Peters 2020 James Madison University

Masters Theses, 2020-current

American ginseng (Panax quinquefolius) is a well-known and sought-after medicinal plant native to North America that is facing increased threat of extinction due to overharvesting, herbivory, and habitat loss. Species distribution and habitat suitability models may be valuable to landowners interested in sustainable harvest or to institutions interested in the conservation and restoration of the species. With unequal sampling efforts across a region of interest, it is likely that some locations with appropriate habitat may be misrepresented in model predictions. This study refined a state-derived species distribution model for ginseng through increased sampling effort across the Cumberland Plateau of Virginia ...


The Opioid Crisis And Life Expectancy In The U.S., Gabriel Lozano 2020 University of Pennsylvania

Joseph Wharton Scholars

Since the 1990s, when opioids started to be grossly over-prescribed, almost 450,000 people have died as a direct result of opioid abuse in the United States. This study analyzes the role the opioid crisis has in the decreasing life expectancy in the United States, a troubling trend given the enormous and growing national healthcare expenditure. Employing a multiple decrement model and national life expectancy tables, this paper removes the opioid-related mortality and develops a new life expectancy model. The actuarial analysis of the observed and estimated life expectancies reveals the impact of opioid-related deaths: overall, U.S. persons are ...


Using Stability To Select A Shrinkage Method, Dean Dustin 2020 University of Nebraska - Lincoln

Dissertations and Theses in Statistics

Shrinkage methods are estimation techniques based on optimizing expressions to find which variables to include in an analysis, typically a linear regression. The general form of these expressions is the sum of an empirical risk plus a complexity penalty based on the number of parameters. Many shrinkage methods are known to satisfy an ‘oracle’ property meaning that asymptotically they select the correct variables and estimate their coefficients efficiently. In Section 1.2, we show oracle properties in two general settings. The first uses a log likelihood in place of the empirical risk and allows a general class of penalties. The ...
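
As a hedged illustration of the general form described above, a penalized objective for linear regression can be written as follows. The notation (responses y_i, covariate vectors x_i, coefficients β of dimension d, penalty function p_λ) is assumed here for illustration and is not taken from the dissertation.

```latex
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{d}}
    \left\{
      \frac{1}{n} \sum_{i=1}^{n} \left( y_i - x_i^{\top} \beta \right)^{2}
      + \sum_{j=1}^{d} p_{\lambda}\!\left( \lvert \beta_j \rvert \right)
    \right\}
```

For example, the choice p_λ(t) = λt gives the LASSO and p_λ(t) = λt² gives ridge regression; the first term is the empirical risk and the second is the complexity penalty.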


Analyzing Competitive Balance In Professional Sport, Kevin Alwell 2020 University of Connecticut - Storrs

Honors Scholar Theses

In this paper we review several measures to statistically analyze competitive balance and report which leagues have a wider variance of performance among their competitors. Each league seeks to maintain high levels of parity, making matches and the overall season more unpredictable and appealing to the general audience. Here we quantify competitive advantage across major sports leagues using several statistical methods so that leagues can optimize their revenue.


Life And Death: Quantifying The Risk Of Heart Disease With Machine Learning, Jack Scott Glienke 2020 University of Northern Iowa

Honors Program Theses

Coronary heart disease has long been a key area of focus in the discussion of public health. As such, numerous studies have been conducted throughout history with the sole intention of identifying risk factors leading to the onset of cardiovascular conditions. A plethora of statistical procedures can be used to identify an individual’s risk of developing heart disease, yet regression models tend to be the default tool used by researchers. Using the data obtained from the most influential cardiovascular study to date, the Framingham Heart Study, this analysis uses machine learning techniques to generate and test the predictive power ...


The Effects Of Zoledronate And Sleep Deprivation On The Distal Femur Trabecular Thickness Of Ovariectomized Rats: Application Of Different Statistical Methods, Erin Nolte 2020 Chapman University

Student Scholar Symposium Abstracts and Posters

Osteoporosis is a disease that causes the degradation of bone, leading to an increased risk of fracture. 1 in 3 women over the age of 50 will be affected by Osteoporosis. This study aims to understand how bone is affected by sleep deprivation in estrogen-deficient rats, and how Zoledronate might negate the inimical effects of sleep deprivation on bone. As bone mineral density (BMD) is a crude evaluation of the architectural changes seen in Osteoporosis, trabecular thickness may serve as a better single evaluation of bone health. 31 Wistar female rats were ovariectomized and separated into 4 random groups. The ...


Fitting Of Lotka-Volterra Model For Coupled Population Growth Data Through Least-Squares Estimation Of Parameters, Jessica Ann Harter 2020 University of Wisconsin-Milwaukee

Theses and Dissertations

The population of two types of bacteria found in the Gulf Coast of Florida, V. chagasii and V. harveyi, can be described by the Lotka-Volterra competition model. Using data gathered in experiments conducted by Bury and Pickett (2015), we take a different approach to find parameter estimates using numerical methods in R. In particular, we find a numerical solution to the coupled set of ODEs and minimize the sum of squared errors in order to obtain the optimal parameter estimates that will fit the data best. In order to get a sense of accuracy of these parameter estimates, we use ...
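
The approach described above (solve the coupled ODEs numerically, then minimize the sum of squared errors) can be sketched roughly as follows. This is a minimal illustration, not the thesis's method: the thesis works in R, whereas this sketch uses plain Python; the standard two-species competition form, the parameter names (`r1`, `r2`, `k1`, `k2`, `a12`, `a21`), the RK4 integrator, and the crude pattern-search minimizer are all assumptions chosen to keep the example self-contained.

```python
# Sketch only: least-squares fit of a two-species Lotka-Volterra competition
# model. Model form, parameter names, and the optimizer are assumptions.

def lv_rhs(state, p):
    """Right-hand side of the (assumed) competition ODEs."""
    n1, n2 = state
    r1, r2, k1, k2, a12, a21 = p
    dn1 = r1 * n1 * (1.0 - (n1 + a12 * n2) / k1)
    dn2 = r2 * n2 * (1.0 - (n2 + a21 * n1) / k2)
    return (dn1, dn2)


def simulate(p, n0, times, substeps=10):
    """Integrate with classical RK4; return the state at each entry of `times`."""
    state = tuple(n0)
    out = [state]
    for t_prev, t_next in zip(times, times[1:]):
        h = (t_next - t_prev) / substeps
        for _ in range(substeps):
            s1 = lv_rhs(state, p)
            s2 = lv_rhs(tuple(x + 0.5 * h * k for x, k in zip(state, s1)), p)
            s3 = lv_rhs(tuple(x + 0.5 * h * k for x, k in zip(state, s2)), p)
            s4 = lv_rhs(tuple(x + h * k for x, k in zip(state, s3)), p)
            state = tuple(
                x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                for x, a, b, c, d in zip(state, s1, s2, s3, s4)
            )
        out.append(state)
    return out


def sse(p, n0, times, data):
    """Sum of squared errors between simulated and observed populations."""
    total = 0.0
    for (m1, m2), (d1, d2) in zip(simulate(p, n0, times), data):
        total += (m1 - d1) * (m1 - d1) + (m2 - d2) * (m2 - d2)
    return total


def fit(p_start, n0, times, data, step=0.2, tol=1e-4, max_iter=200):
    """Crude pattern search on the SSE (a stand-in for a real optimizer).

    Each parameter is scaled up or down by a relative step; the step is
    halved whenever no single-parameter move lowers the SSE.
    """
    p = list(p_start)
    best = sse(p, n0, times, data)
    rel = step
    for _ in range(max_iter):
        if rel < tol:
            break
        improved = False
        for i in range(len(p)):
            for sign in (1.0, -1.0):
                trial = list(p)
                trial[i] = p[i] * (1.0 + sign * rel)
                val = sse(trial, n0, times, data)
                if val < best:
                    p, best, improved = trial, val, True
        if not improved:
            rel /= 2.0
    return p, best
```

Calling `fit(start, n0, times, data)` with observed `(n1, n2)` pairs at the sample times returns the refined parameter list and its SSE. In practice a dedicated ODE solver and nonlinear least-squares routine would replace the hand-rolled integrator and search used here.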


Smoothed Quantiles For Claim Frequency Models, With Applications To Risk Measurement, Ponmalar Suruliraj Ratnam 2020 University of Wisconsin-Milwaukee

Theses and Dissertations

Statistical models for the claim severity and claim frequency variables are routinely constructed and utilized by actuaries. Typical applications of such models include identification of optimal deductibles for selected loss elimination ratios, pricing of contract layers, determining credibility factors, risk and economic capital measures, and evaluation of effects of inflation, market trends and other quantities arising in insurance. While the actuarial literature on the severity models is extensive and rapidly growing, that for the claim frequency models lags behind. One of the reasons for such a gap is that various actuarial metrics do not possess “nice” statistical properties for the ...


Using Saddlepoint Approximations And Likelihood-Based Methods To Conduct Statistical Inference For The Mean Of The Beta Distribution, Bryn Brakefield 2020 Stephen F. Austin State University

Electronic Theses and Dissertations

The prevalence of conducting statistical inference for the mean of the beta distribution has been rising in various fields of academic research, such as in immunology that analyzes proportions of rare cell population subsets. For our purposes, we will address this statistical inference problem by using likelihood-based applications to hypothesis testing, along with a relatively new statistical method called saddlepoint approximations. Through simulation work, we will compare the performance of these statistical procedures and provide both the statistical and scientific communities with recommendations on best practices.


Demystification Of Graph And Information Entropy, Bryce Frederickson 2020 Utah State University

Undergraduate Honors Capstone Projects

Shannon entropy is an information-theoretic measure of unpredictability in probabilistic models. Recently, it has been used to form a tool, called the von Neumann entropy, to study quantum mechanics and network flows by appealing to algebraic properties of graph matrices. But still, little is known about what the von Neumann entropy says about the combinatorial structure of the graphs themselves. This paper gives a new formulation of the von Neumann entropy that describes it as a rate at which random movement settles down in a graph. At the same time, this new perspective gives rise to a generalization of von ...


Gait Characterization Using Computer Vision Video Analysis, Martha T. Gizaw 2020 College of William and Mary

Undergraduate Honors Theses

The World Health Organization reports that falls are the second-leading cause of accidental death among senior adults around the world. Currently, a research team at William & Mary’s Department of Kinesiology & Health Sciences attempts to recognize and correct aging-related factors that can result in falling. To meet this goal, the members of that team videotape walking tests to examine individual gait parameters of older subjects. However, they undergo a slow, laborious process of analyzing video frame by video frame to obtain such parameters. This project uses computer vision software to reconstruct walking models from residents of an independent living retirement ...


Modeling Movement: A Machine-Learning Approach To Track Migration Routes After Displacement, Ethan Harrison 2020 William & Mary

Undergraduate Honors Theses

Over the past decade, the number of individuals internally displaced by conflict (IDPs) has reached unprecedented levels. Humanitarian actors and first-responders face persistent information gaps in meeting the needs of these populations. Specifically, they face challenges in understanding where and how IDPs move after they are displaced, which is necessary to locate them in conflict-affected situations and provide them with life-saving assistance. In this paper, I propose a framework, using established machine-learning methods, to forecast the migration routes of these displaced populations (Chapter 1). In a case study of displacement in Yemen, my models predict 80% of IDPs' migration routes ...


Predicting Disease Progression Using Deep Recurrent Neural Networks And Longitudinal Electronic Health Record Data, Seunghwan Kim 2020 Washington University in St. Louis

Engineering and Applied Science Theses & Dissertations

Electronic Health Records (EHR) are widely adopted and used throughout healthcare systems and are able to collect and store longitudinal information data that can be used to describe patient phenotypes. From the underlying data structures used in the EHR, discrete data can be extracted and analyzed to improve patient care and outcomes via tasks such as risk stratification and prospective disease management. Temporality in EHR is innately present given the nature of these data, however, and traditional classification models are limited in this context by the cross-sectional nature of training and prediction processes. Finding temporal patterns in EHR is ...


The Two Types Of Society: Computationally Revealing Recurrent Social Formations And Their Evolutionary Trajectories, Lux Miranda 2020 Utah State University

Undergraduate Honors Capstone Projects

Comparative social science has a long history of attempts to classify societies and cultures in terms of shared characteristics. However, only recently has it become feasible to conduct quantitative analysis of large historical datasets to mathematically approach the study of social complexity and classify shared societal characteristics. Such methods have the potential to identify recurrent social formations in human societies and contribute to social evolutionary theory. However, in order to achieve this potential, repeated studies are needed to assess the robustness of results to changing methods and data sets. Using an improved derivative of the Seshat: Global History Databank, we ...


Simplicity As A New Environmental Virtue, Justin Wheeler 2020 Utah State University

Undergraduate Honors Capstone Projects

This paper argues for the addition of a new environmentally focused virtue, simplicity, to the virtue ethical framework developed by Aristotle. First, relevant background from Aristotle’s virtue ethics is developed, including the crucial “doctrine of the mean”, a balance between excess and deficiency of a specified character trait. The tenets of the new virtue simplicity are developed with practical examples based on Aristotle’s method of developing a virtue of character. Simplicity is proposed as a desire to take the appropriate amount from the natural world and an acceptance of one’s circumstances. Those possessing simplicity will not fall ...


Analysis Of SAT And ISAT Scores For Madison School District In Rexburg, Idaho, Holly Dawn Palmer 2020 Utah State University

Undergraduate Honors Capstone Projects

Testing is an integral part of measuring education. If used properly, SAT scores can be compared across the nation, and statewide tests can compare different school districts to each other if administered properly to avoid certain pitfalls (Fetler, 1991). However, if tests do not have a significant impact on a student, their motivation to take the test will be low and test quality cannot be assumed. When the state funds two separate tests for its students but only one has a significant impact on the student, how should the scores for each test be used, and is it okay to ...


Ragweed And Sagebrush Pollen Can Distinguish Between Vegetation Types At Broad Spatial Scales, Hannah M. Carroll, Alan D. Wanamaker, Lynn G. Clark, Brian J. Wilsey 2020 Iowa State University

Ecology, Evolution and Organismal Biology Publications

Patterns of vegetation distribution at regional to subcontinental scales can inform understanding of climate. Delineating ecoregion boundaries over geologic time is complicated by the difficulty of distinguishing between prairie types at broad spatial scales using the pollen record. Pollen ratios are sometimes employed to distinguish between vegetation types, although their applicability is often limited to a geographic range. The Neotoma Paleoecology Database offers an unparalleled opportunity to synthesize a large number of pollen datasets. Ambrosia (ragweed) is a genus of mesic‐adapted species sensitive to summer moisture. Artemisia (sagebrush, wormwood, mugwort) is a genus of dry‐mesic‐adapted species resilient ...


On Arnold–Villasenor Conjectures For Characterizing Exponential Distribution Based On Sample Of Size Three, George Yanev 2020 The University of Texas Rio Grande Valley

Mathematical and Statistical Sciences Faculty Publications and Presentations

Arnold and Villasenor [4] obtain a series of characterizations of the exponential distribution based on random samples of size two. These results were already applied in constructing goodness-of-fit tests. Extending the techniques from [4], we prove some of Arnold and Villasenor’s conjectures for samples of size three. An example with simulated data is discussed.


Applications Of Machine Learning In High-Frequency Trade Direction Classification, Jared E. Hansen 2020 Utah State University

All Graduate Theses and Dissertations

The correct assignment of trades as buyer-initiated or seller-initiated is paramount in many quantitative finance studies. Simple decision rule methods have been used for signing trades since many data sets available to researchers do not include the sign of each trade executed. By utilizing these decision rule methods, as well as engineering new variables from available data, we have demonstrated that machine learning models outperform prior methods for accurately signing trades as buys and sells, achieving state-of-the-art results. The best model developed was 4.5 percentage points more accurate than older methods when predicting onto unseen data. Since finance and ...

