Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

2020

Articles 61 - 84 of 84

Full-Text Articles in Statistical Models

Predicting Diabetes Diagnoses, Sarah Netchert Jan 2020

Student Research Poster Presentations 2020

This study explored the traits and health status of African Americans in central Virginia to determine which traits put people at a higher probability of being diagnosed with diabetes. We also wanted to know which traits generate the highest probability that a person will be diagnosed with diabetes. The traits used in this study were cholesterol, stabilized glucose, high-density lipoprotein levels, age (years), gender, height (inches), weight (pounds), systolic blood pressure, diastolic blood pressure, waist size (inches), and hip size (inches). The study included 403 individuals, since they were the only ones screened for diabetes out of 1,046 …


Shrinkage Priors For Isotonic Probability Vectors And Binary Data Modeling, Philip S. Boonstra, Daniel R. Owen, Jian Kang Jan 2020

The University of Michigan Department of Biostatistics Working Paper Series

This paper outlines a new class of shrinkage priors for Bayesian isotonic regression modeling a binary outcome against a predictor, where the probability of the outcome is assumed to be monotonically non-decreasing with the predictor. The predictor is categorized into a large number of groups, and the set of differences between outcome probabilities in consecutive categories is equipped with a multivariate prior having support over the set of simplexes. The Dirichlet distribution, which can be derived from a normalized cumulative sum of gamma-distributed random variables, is a natural choice of prior, but using mathematical and simulation-based arguments, we show that …
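
The Dirichlet construction mentioned in the abstract above can be sketched as follows. This is only an illustration of the standard normalized-gamma construction, not the shrinkage prior the paper proposes; the function name, the symmetric `shape` parameter, and the extra increment that accounts for the gap between the last probability and 1 are assumptions made for the example.

```python
import random
from itertools import accumulate


def sample_isotonic_probs(num_categories, shape=1.0, rng=None):
    """Draw a non-decreasing probability vector from Dirichlet increments.

    Hypothetical helper: the increments between consecutive categories are
    a symmetric Dirichlet draw built from normalized gamma variables.
    """
    rng = rng or random.Random()
    # One extra gamma draw covers the gap between the last probability and 1.
    gammas = [rng.gammavariate(shape, 1.0) for _ in range(num_categories + 1)]
    total = sum(gammas)
    increments = [g / total for g in gammas]  # lies on the simplex
    # Cumulative sums of the first K increments are non-decreasing in [0, 1].
    return list(accumulate(increments[:num_categories]))


probs = sample_isotonic_probs(5, shape=0.5, rng=random.Random(0))
```

Smaller values of `shape` concentrate the mass on a few increments, which gives step-like probability curves; that is the behavior a shrinkage prior would aim to encourage.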


The Analysis Of Neural Heterogeneity Through Mathematical And Statistical Methods, Kyle Wendling Jan 2020

Theses and Dissertations

Diversity of intrinsic neural attributes and network connections is known to exist in many areas of the brain and is thought to significantly affect neural coding. Recent theoretical and experimental work has argued that in uncoupled networks, coding is most accurate at intermediate levels of heterogeneity. I explore this phenomenon through two distinct approaches: a theoretical mathematical modeling approach and a data-driven statistical modeling approach.

Through the mathematical approach, I examine firing rate heterogeneity in a feedforward network of stochastic neural oscillators utilizing a high-dimensional model. The firing rate heterogeneity stems from two sources: intrinsic (different individual cells) and network …


Multi-Variable Theme Park Analysis, Timothy Johnson Jan 2020

Research and Scholarship Symposium Posters

The purpose of this data analysis is to better understand how analytics can be used to look at different problems such as the one indicated in this multi-variable question. When is the best time of year to visit the theme parks at Walt Disney World? The analysis aims to look at multiple factors that will influence the answer to the question. The goal is to look at these multiple factors and run some probability and statistics tests in order to come to a conclusion on the data. This data along with other key factors can lead to the best possible …


Measuring The Connective Action Of Black Lives Matter Activists: A Psychometric Investigation Into Twitter Data, Paige Alfonzo Jan 2020

Electronic Theses and Dissertations

Many protest movements of the twenty-first century have become increasingly networked and personalized. Several scholars have tapped into this change, coining terms such as participatory action, digitally mediated action, computer-mediated communication, issue-based organization, and what I focus on in this project, connective action. Building on the ideas percolating across the literary landscape at the time, Bennett and Segerberg (2012) introduced the logic of connective action based on emergent characteristics they observed in post-2010 large-scale social movements. Both the logic of connective action and related work have become deeply ingrained in today's social movement scholarship. As such, I felt it …


Assessing Robustness Of The Rasch Mixture Model To Detect Differential Item Functioning - A Monte Carlo Simulation Study, Jinjin Huang Jan 2020

Electronic Theses and Dissertations

Measurement invariance is crucial for an effective and valid measure of a construct. Invariance holds when the latent trait varies consistently across subgroups; in other words, the mean differences among subgroups are due only to true latent ability differences. Differential item functioning (DIF) occurs when measurement invariance is violated. There are two kinds of traditional tools for DIF detection: non-parametric methods and parametric methods. Mantel-Haenszel (MH), SIBTEST, and standardization are examples of non-parametric DIF detection methods. The majority of parametric DIF detection methods are item response theory (IRT) based. Both non-parametric methods and parametric methods compare differences among subgroups …


An Examination Of Covid-19 Statistical Modeling, Shane Vaughan Jan 2020

Williams Honors College, Honors Research Projects

The 2019 novel coronavirus, also known as COVID-19, is an infectious disease which was first reported in late 2019 and soon spread to become a global pandemic, prompting major action from world governments. Soon after, many institutions began attempts to analyze and predict the spread and severity of the disease via statistical modeling. Some information is not available for public consumption; however, a number of institutions have published the results of their analyses and some have made public repositories of the code used to build the models. This research paper attempts to use these and other resources to examine the modeling …


Zero-Inflated Longitudinal Mixture Model For Stochastic Radiographic Lung Compositional Change Following Radiotherapy Of Lung Cancer, Viviana A. Rodríguez Romero Jan 2020

Theses and Dissertations

Compositional data (CD) is mostly analyzed as relative data, using ratios of components and log-ratio transformations so that known multivariable statistical methods can be applied. Therefore, CD in which some components equal zero represents a problem. Furthermore, when the data is measured longitudinally, observations are spatially related, and the data appears to come from a mixture population, the analysis becomes highly complex. To address this, a two-part model was proposed to deal with structural zeros in longitudinal CD using a mixed-effects model. Furthermore, the model has been extended to the case where the non-zero components of the vector might be a two-component mixture …


K-Means Stock Clustering Analysis Based On Historical Price Movements And Financial Ratios, Shu Bin Jan 2020

CMC Senior Theses

The 2015 article Creating Diversified Portfolios Using Cluster Analysis proposes an algorithm that uses the Sharpe ratio and results from K-means clustering conducted on companies' historical financial ratios to generate stock market portfolios. This project seeks to evaluate the performance of the portfolio-building algorithm during the beginning period of the COVID-19 recession. S&P 500 companies' historical stock price movement and their historical return on assets and asset turnover ratios are used as dissimilarity metrics for K-means clustering. After clustering, the stock with the highest Sharpe ratio from each cluster is picked to become part of the portfolio. The economic and …
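
The cluster-then-select procedure described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the tickers, feature values (return on assets, asset turnover), return series, starting centroids, and the zero risk-free rate are all made up for the example.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = [list(c) for c in centroids]  # do not mutate the caller's list
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [mean(col) for col in zip(*members)]
    return labels


# Hypothetical inputs: [return on assets, asset turnover] per ticker.
features = {"AAA": [0.10, 0.9], "BBB": [0.12, 1.0],
            "CCC": [0.02, 0.3], "DDD": [0.03, 0.2]}
returns = {
    "AAA": [0.01, 0.03, 0.02, 0.04],
    "BBB": [0.02, -0.01, 0.03, 0.00],
    "CCC": [0.01, 0.01, 0.02, 0.00],
    "DDD": [-0.02, 0.04, -0.01, 0.03],
}

tickers = list(features)
labels = kmeans([features[t] for t in tickers], centroids=[[0.1, 1.0], [0.0, 0.0]])

# Keep the ticker with the highest Sharpe ratio in each cluster.
portfolio = {}
for ticker, label in zip(tickers, labels):
    best = portfolio.get(label)
    if best is None or sharpe_ratio(returns[ticker]) > sharpe_ratio(returns[best]):
        portfolio[label] = ticker
```

With real data the features would normally be standardized before clustering, since K-means distances are sensitive to the scale of each ratio.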


A Mathematical Model For Malaria With Age-Heterogeneous Biting Rate, Sho Kawakami Jan 2020

All Graduate Theses, Dissertations, and Other Capstone Projects

We propose a mathematical model for malaria with an age-heterogeneous biting rate from mosquitos. The existence of the model and the local behavior of the disease-free equilibrium are explored. Furthermore, the model is extended to an optimal control problem and the corresponding adjoint equations and optimality conditions are derived. Age-dependent parameter values are estimated and numerical simulations are carried out for the model. The new model better accounts for difference in biting rates of mosquitos to different age groups, and improvements in stability to the explicit algorithm. The optimal control is also shown to depend on the age distribution of …


Modelling Interactions Among Offenders: A Latent Space Approach For Interdependent Ego-Networks, Isabella Gollini, Alberto Caimo, Paolo Campana Jan 2020

Articles

Illegal markets are notoriously difficult to study. Police data offer an increasingly exploited source of evidence. However, their secondary nature poses challenges for researchers. A key issue is that researchers often have to deal with two sets of actors: targeted and non-targeted. This work develops a latent space model for interdependent ego-networks purposely created to deal with the targeted nature of police evidence. By treating targeted offenders as egos and their contacts as alters, the model (a) leverages the full information available and (b) mirrors the specificity of the data collection strategy. The paper then applies this approach to …


Statistical Analysis Of Demographic Effects On Insurance Coverage Of Perinatal And Neonatal Morbidity, Madeline Durbin Jan 2020

Undergraduate Honors Thesis Projects

In the United States of America, Ohio has one of the worst neonatal and perinatal death rates. Within Ohio, Montgomery County has an above average neonatal and perinatal death rate. This statistic can be lowered if more women in Montgomery County have health insurance. They would be more likely to seek out prenatal health care, since they would no longer have to pay as much money out-of-pocket. This would allow medical professionals to be able to diagnose and treat any potential issues in the mother or child earlier. Having health insurance would also prevent mothers-to-be from seeking out other potentially …


Semiparametric And Nonparametric Methods For Comparing Biomarker Levels Between Groups, Yuntong Li Jan 2020

Theses and Dissertations--Statistics

Comparing the distribution of biomarker measurements between two groups under either an unpaired or paired design is a common goal in many biomarker studies. However, analyzing biomarker data is sometimes challenging because the data may not be normally distributed and may contain a large fraction of zero values or missing values. Although several statistical methods have been proposed, they either require a data normality assumption or are inefficient. We proposed a novel two-part semiparametric method for data under an unpaired setting and a nonparametric method for data under a paired setting. The semiparametric method considers a two-part model, a logistic regression for …


Estimation Of The Treatment Effect With Bayesian Adjustment For Covariates, Li Xu Jan 2020

Theses and Dissertations--Statistics

The Bayesian adjustment for confounding (BAC) is a Bayesian model averaging method to select and adjust for confounding factors when evaluating the average causal effect of an exposure on a certain outcome. We extend the BAC method to time-to-event outcomes. Specifically, the posterior distribution of the exposure effect on a time-to-event outcome is calculated as a weighted average of posterior distributions from a number of candidate proportional hazards models, weighing each model by its ability to adjust for confounding factors. The Bayesian Information Criterion based on the partial likelihood is used to compare different models and approximate the Bayes factor. …
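
The weighting idea in the abstract above can be illustrated with the usual BIC approximation to posterior model probabilities, where each weight is proportional to exp(-BIC/2). The sketch assumes equal prior model probabilities, so it leaves out the confounding-adjustment prior that distinguishes BAC from ordinary Bayesian model averaging; the function names, BIC values, and effect estimates are hypothetical.

```python
import math


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every candidate model.
    """
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def averaged_effect(estimates, bics):
    """Model-averaged posterior mean of the exposure effect."""
    return sum(w * e for w, e in zip(bic_model_weights(bics), estimates))


# Hypothetical partial-likelihood BICs and log-hazard-ratio estimates
# from three candidate proportional hazards models.
bics = [210.4, 212.1, 218.9]
estimates = [0.42, 0.35, 0.60]

weights = bic_model_weights(bics)
effect = averaged_effect(estimates, bics)
```

A difference of two BIC units corresponds to a weight ratio of about e (roughly 2.7) between two models, so the third model here contributes very little to the average.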


Phenotype Extraction: Estimation And Biometrical Genetic Analysis Of Individual Dynamics, Kevin L. Mckee Jan 2020

Theses and Dissertations

Within-person data can exhibit a virtually limitless variety of statistical patterns, but it can be difficult to distinguish meaningful features from statistical artifacts. Studies of complex traits have previously used genetic signals like twin-based heritability to distinguish between the two. This dissertation is a collection of studies applying state-space modeling to conceptualize and estimate novel phenotypic constructs for use in psychiatric research and further biometrical genetic analysis. The aims are to: (1) relate control theoretic concepts to health-related phenotypes; (2) design statistical models that formally define those phenotypes; (3) estimate individual phenotypic values from time series data; (4) consider hierarchical …


Aggregate Loss Model With Poisson-Tweedie Loss Frequency, Si Chen Jan 2020

Theses and Dissertations (Comprehensive)

The aggregate loss model has applications in various areas such as financial risk management and actuarial science. The aggregate loss is the sum of all random losses that occurred in a period, and it is governed by both the loss severity and the loss frequency. While the impact of the loss severity on aggregate loss is well studied, less attention has been paid to the influence of loss frequency on aggregate loss, which motivates our study. In this thesis, we enrich the aggregate loss framework by introducing the Poisson-Tweedie distribution as a candidate for modelling loss frequency, prove the closedness of Poisson-Tweedie …
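
The frequency-severity structure described above, S = X_1 + ... + X_N, can be simulated in a few lines. In this sketch a plain Poisson frequency and an exponential severity are swapped in as stand-ins; neither is the Poisson-Tweedie frequency model the thesis studies, and the parameter values are made up for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_aggregate_loss(lam, mean_severity, rng):
    """One draw of S = X_1 + ... + X_N, N ~ Poisson, X_i ~ Exponential."""
    n_losses = sample_poisson(lam, rng)
    return sum(rng.expovariate(1.0 / mean_severity) for _ in range(n_losses))


rng = random.Random(42)
draws = [simulate_aggregate_loss(3.0, 2.0, rng) for _ in range(20000)]
simulated_mean = sum(draws) / len(draws)

# E[S] = E[N] * E[X] when the count N is independent of the severities.
theoretical_mean = 3.0 * 2.0
```

Replacing `sample_poisson` with a sampler for an overdispersed count distribution would show how the frequency model alone changes the tail of the aggregate loss while the severity model stays fixed.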


Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu Jan 2020

Theses and Dissertations--Statistics

A common problem in regression analysis (linear or nonlinear) is assessing lack of fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem to a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …


Statistical Intervals For Various Distributions Based On Different Inference Methods, Yixuan Zou Jan 2020

Theses and Dissertations--Statistics

Statistical intervals (e.g., confidence, prediction, or tolerance) are widely used to quantify uncertainty, but complex settings can create challenges to obtain such intervals that possess the desired properties. My thesis will address diverse data settings and approaches that are shown empirically to have good performance. We first introduce a focused treatment on using a single-layer bootstrap calibration to improve the coverage probabilities of two-sided parametric tolerance intervals for non-normal distributions. We then turn to zero-inflated data, which are commonly found in, among other areas, pharmaceutical and quality control applications. However, the inference problem often becomes difficult in the presence of …


Bayesian Kinetic Modeling For Tracer-Based Metabolomic Data, Xu Zhang Jan 2020

Theses and Dissertations--Statistics

Kinetic modeling of the time dependence of metabolite concentrations including the unstable isotope labeled species is an important approach to simulate metabolic pathway dynamics. It is also essential for quantitative metabolic flux analysis using tracer data. However, as the metabolic networks are complex including extensive compartmentation and interconnections, the parameter estimation for enzymes that catalyze individual reactions needed for kinetic modeling is challenging. As the parameter space is large and multi-dimensional while kinetic data are comparatively sparse, the estimation procedure (especially the point estimation methods) often encounters multiple local maxima such that standard maximum likelihood methods may yield …


Modeling The Galactic Compact Binary Neutron Star Population And Studying The Double Pulsar System, Nihan Pol Jan 2020

Graduate Theses, Dissertations, and Problem Reports

Binary neutron star (BNS) systems consisting of at least one neutron star provide an avenue for testing a broad range of physical phenomena ranging from tests of General Relativity to probing magnetospheric physics to understanding the behavior of matter in the densest environments in the Universe. Ultra-compact BNS systems with orbital periods less than a few tens of minutes emit gravitational waves with frequencies ~mHz and are detectable by the planned space-based Laser Interferometer Space Antenna (LISA), while merging BNS systems produce a chirping gravitational wave signal that can be detected by the ground-based Laser Interferometer Gravitational-Wave Observatory (LIGO). Thus, BNS …


Projecting Regions Of North Atlantic Right Whale, Eubalaena Glacialis, Habitat Suitability In The Gulf Of Maine In 2050, Camille Ross Jan 2020

Honors Theses

North Atlantic right whales (Eubalaena glacialis) are endangered. Understanding the role environmental conditions play in habitat suitability is key to determining the regions in need of protection for conservation of the species, particularly as climate change shifts suitable habitat. This thesis uses three species distribution modeling algorithms, together with historical data on whale abundance (1993 to 2009) and environmental covariates, to build monthly ensemble models of past E. glacialis habitat suitability in the Gulf of Maine. Then, the models are projected onto the year 2050 for a range of climate scenarios. Specifically, the distribution of the species was modeled …


Sex And Age Differences In Prevalence And Risk Factors For Prediabetes In Mexican-Americans, Kristina Vatcheva, Belinda M. Reininger, Susan P. Fisher-Hoch, Joseph B. Mccormick Jan 2020

School of Mathematical and Statistical Sciences Faculty Publications and Presentations

AIMS:

Over 1/3 of Americans have prediabetes, while 9.4% have type 2 diabetes. The aim of our study was to estimate the prevalence of prediabetes in Mexican Americans, who have a known 28.2% prevalence of type 2 diabetes, by age and sex and to identify critical socio-demographic and clinical factors associated with prediabetes.

METHODS:

Data were collected between 2004 and 2017 from the Cameron County Hispanic Cohort in Texas. Weighted crude and sex- and age-stratified prevalences were calculated. Survey-weighted logistic regression analyses were conducted to identify risk factors for prediabetes.

RESULTS:

The prevalence of prediabetes (32%) was slightly higher than …


Enhancing Models And Measurements Of Traffic-Related Air Pollutants For Health Studies Using Dispersion Modeling And Bayesian Data Fusion, Stuart A. Batterman, Veronica J. Berrocal, Chad Milando, Owais Gilani, Saravanan Arunachalam, K. Max Zhang Jan 2020

Faculty Journal Articles

Research Report 202 describes a study led by Dr. Stuart Batterman at the University of Michigan, Ann Arbor and colleagues. The investigators evaluated the ability to predict traffic-related air pollution using a variety of methods and models, including a line source air pollution dispersion model and sophisticated spatiotemporal Bayesian data fusion methods. Exposure assessment for traffic-related air pollution is challenging because the pollutants are a complex mixture and vary greatly over space and time. Because extensive direct monitoring is difficult and expensive, a number of modeling approaches have been developed, but each model has its own limitations and errors.

Dr. …


How Machine Learning And Probability Concepts Can Improve Nba Player Evaluation, Harrison Miller Jan 2020

CMC Senior Theses

In this paper I break down a scholarly article, written by Sameer K. Deshpande and Shane T. Jensen, that proposed a new method to evaluate NBA players. The NBA (National Basketball Association) is the highest-level professional basketball league in America. The authors proposed a model that estimates how NBA players impact their team's chances of winning a game, using machine learning and probability concepts. I preface that by diving into these concepts and their mathematical backgrounds. These concepts include building a linear model using the ordinary least squares method, the bias …