Open Access. Powered by Scholars. Published by Universities.®
Articles 61 - 84 of 84
Full-Text Articles in Statistical Models
Predicting Diabetes Diagnoses, Sarah Netchert
Student Research Poster Presentations 2020
This study explored the traits and health status of African Americans in central Virginia to determine which traits put people at a higher probability of being diagnosed with diabetes, and which traits generate the highest probability of a diagnosis. The traits used in this study were cholesterol, stabilized glucose, high-density lipoprotein levels, age (years), gender, height (inches), weight (pounds), systolic blood pressure, diastolic blood pressure, waist size (inches), and hip size (inches). The study included 403 individuals, since they were the only ones screened for diabetes out of 1,046 …
Shrinkage Priors For Isotonic Probability Vectors And Binary Data Modeling, Philip S. Boonstra, Daniel R. Owen, Jian Kang
The University of Michigan Department of Biostatistics Working Paper Series
This paper outlines a new class of shrinkage priors for Bayesian isotonic regression modeling a binary outcome against a predictor, where the probability of the outcome is assumed to be monotonically non-decreasing with the predictor. The predictor is categorized into a large number of groups, and the set of differences between outcome probabilities in consecutive categories is equipped with a multivariate prior having support over the set of simplexes. The Dirichlet distribution, which can be derived from a normalized cumulative sum of gamma-distributed random variables, is a natural choice of prior, but using mathematical and simulation-based arguments, we show that …
The Analysis Of Neural Heterogeneity Through Mathematical And Statistical Methods, Kyle Wendling
Theses and Dissertations
Diversity of intrinsic neural attributes and network connections is known to exist in many areas of the brain and is thought to significantly affect neural coding. Recent theoretical and experimental work has argued that in uncoupled networks, coding is most accurate at intermediate levels of heterogeneity. I explore this phenomenon through two distinct approaches: a theoretical mathematical modeling approach and a data-driven statistical modeling approach.
Through the mathematical approach, I examine firing rate heterogeneity in a feedforward network of stochastic neural oscillators utilizing a high-dimensional model. The firing rate heterogeneity stems from two sources: intrinsic (different individual cells) and network …
Multi-Variable Theme Park Analysis, Timothy Johnson
Research and Scholarship Symposium Posters
The purpose of this data analysis is to better understand how analytics can be used to address multi-variable questions such as this one: when is the best time of year to visit the theme parks at Walt Disney World? The analysis looks at multiple factors that influence the answer and runs probability and statistics tests on them in order to reach a conclusion from the data. This data along with other key factors can lead to the best possible …
Measuring The Connective Action Of Black Lives Matter Activists: A Psychometric Investigation Into Twitter Data, Paige Alfonzo
Electronic Theses and Dissertations
Many protest movements of the twenty-first century have become increasingly networked and personalized. Several scholars have tapped into this change, coining terms such as participatory action, digitally mediated action, computer-mediated communication, issue-based organization, and what I focus on in this project, connective action. Building on the ideas percolating across the literary landscape at the time, Bennett and Segerberg (2012) introduced the logic of connective action based on emergent characteristics they observed in post-2010 large-scale social movements. Both the logic of connective action and related work have become deeply ingrained in today's social movement scholarship. As such, I felt it …
Assessing Robustness Of The Rasch Mixture Model To Detect Differential Item Functioning - A Monte Carlo Simulation Study, Jinjin Huang
Electronic Theses and Dissertations
Measurement invariance is crucial for an effective and valid measure of a construct. Invariance holds when the latent trait varies consistently across subgroups; in other words, the mean differences among subgroups are only due to true latent ability differences. Differential item functioning (DIF) occurs when measurement invariance is violated. There are two kinds of traditional tools for DIF detection: non-parametric methods and parametric methods. Mantel-Haenszel (MH), SIBTEST, and standardization are examples of non-parametric DIF detection methods. The majority of parametric DIF detection methods are based on item response theory (IRT). Both non-parametric and parametric methods compare differences among subgroups …
An Examination Of Covid-19 Statistical Modeling, Shane Vaughan
Williams Honors College, Honors Research Projects
The 2019 novel coronavirus, also known as COVID-19, is an infectious disease which was first reported in late 2019 and soon spread to become a global pandemic, prompting major action from world governments. Soon after, many institutions began attempts to analyze and predict the spread and severity of the disease via statistical modeling. Some information is not available for public consumption; however, a number of institutions have published the results of their analyses and some have made public repositories of the code used to build the models. This research paper attempts to use these and other resources to examine the modeling …
Zero-Inflated Longitudinal Mixture Model For Stochastic Radiographic Lung Compositional Change Following Radiotherapy Of Lung Cancer, Viviana A. Rodríguez Romero
Theses and Dissertations
Compositional data (CD) is mostly analyzed as relative data, using ratios of components and log-ratio transformations so that known multivariable statistical methods can be applied. Therefore, CD in which some components equal zero represents a problem. Furthermore, when the data is measured longitudinally, observations are spatially related, and the data appear to come from a mixture population, the analysis becomes highly complex. To address this, a two-part model was proposed to deal with structural zeros in longitudinal CD using a mixed-effects model. Furthermore, the model has been extended to the case where the non-zero components of the vector might be a two-component mixture …
K-Means Stock Clustering Analysis Based On Historical Price Movements And Financial Ratios, Shu Bin
CMC Senior Theses
The 2015 article "Creating Diversified Portfolios Using Cluster Analysis" proposes an algorithm that uses the Sharpe ratio and results from K-means clustering conducted on companies' historical financial ratios to generate stock market portfolios. This project seeks to evaluate the performance of the portfolio-building algorithm during the beginning period of the COVID-19 recession. S&P 500 companies' historical stock price movement and their historical return on assets and asset turnover ratios are used as dissimilarity metrics for K-means clustering. After clustering, the stock with the highest Sharpe ratio from each cluster is picked to become a part of the portfolio. The economic and …
A Mathematical Model For Malaria With Age-Heterogeneous Biting Rate, Sho Kawakami
All Graduate Theses, Dissertations, and Other Capstone Projects
We propose a mathematical model for malaria with an age-heterogeneous biting rate from mosquitos. The existence of the model and the local behavior of the disease-free equilibrium are explored. Furthermore, the model is extended to an optimal control problem, and the corresponding adjoint equations and optimality conditions are derived. Age-dependent parameter values are estimated and numerical simulations are carried out for the model. The new model better accounts for differences in the biting rates of mosquitos across age groups and improves the stability of the explicit algorithm. The optimal control is also shown to depend on the age distribution of …
Modelling Interactions Among Offenders: A Latent Space Approach For Interdependent Ego-Networks, Isabella Gollini, Alberto Caimo, Paolo Campana
Articles
Illegal markets are notoriously difficult to study. Police data offer an increasingly exploited source of evidence. However, their secondary nature poses challenges for researchers. A key issue is that researchers often have to deal with two sets of actors: targeted and non-targeted. This work develops a latent space model for interdependent ego-networks purposely created to deal with the targeted nature of police evidence. By treating targeted offenders as egos and their contacts as alters, the model (a) leverages the full information available and (b) mirrors the specificity of the data collection strategy. The paper then applies this approach to …
Statistical Analysis Of Demographic Effects On Insurance Coverage Of Perinatal And Neonatal Morbidity, Madeline Durbin
Undergraduate Honors Thesis Projects
In the United States of America, Ohio has one of the worst neonatal and perinatal death rates. Within Ohio, Montgomery County has an above average neonatal and perinatal death rate. This statistic can be lowered if more women in Montgomery County have health insurance. They would be more likely to seek out prenatal health care, since they would no longer have to pay as much money out-of-pocket. This would allow medical professionals to be able to diagnose and treat any potential issues in the mother or child earlier. Having health insurance would also prevent mothers-to-be from seeking out other potentially …
Semiparametric And Nonparametric Methods For Comparing Biomarker Levels Between Groups, Yuntong Li
Theses and Dissertations--Statistics
Comparing the distribution of biomarker measurements between two groups under either an unpaired or paired design is a common goal in many biomarker studies. However, analyzing biomarker data is sometimes challenging because the data may not be normally distributed and may contain a large fraction of zero values or missing values. Although several statistical methods have been proposed, they either require a data normality assumption or are inefficient. We proposed a novel two-part semiparametric method for data under an unpaired setting and a nonparametric method for data under a paired setting. The semiparametric method considers a two-part model, a logistic regression for …
Estimation Of The Treatment Effect With Bayesian Adjustment For Covariates, Li Xu
Theses and Dissertations--Statistics
The Bayesian adjustment for confounding (BAC) is a Bayesian model averaging method to select and adjust for confounding factors when evaluating the average causal effect of an exposure on a certain outcome. We extend the BAC method to time-to-event outcomes. Specifically, the posterior distribution of the exposure effect on a time-to-event outcome is calculated as a weighted average of posterior distributions from a number of candidate proportional hazards models, weighting each model by its ability to adjust for confounding factors. The Bayesian Information Criterion based on the partial likelihood is used to compare different models and approximate the Bayes factor. …
Phenotype Extraction: Estimation And Biometrical Genetic Analysis Of Individual Dynamics, Kevin L. Mckee
Theses and Dissertations
Within-person data can exhibit a virtually limitless variety of statistical patterns, but it can be difficult to distinguish meaningful features from statistical artifacts. Studies of complex traits have previously used genetic signals like twin-based heritability to distinguish between the two. This dissertation is a collection of studies applying state-space modeling to conceptualize and estimate novel phenotypic constructs for use in psychiatric research and further biometrical genetic analysis. The aims are to: (1) relate control theoretic concepts to health-related phenotypes; (2) design statistical models that formally define those phenotypes; (3) estimate individual phenotypic values from time series data; (4) consider hierarchical …
Aggregate Loss Model With Poisson-Tweedie Loss Frequency, Si Chen
Theses and Dissertations (Comprehensive)
The aggregate loss model has applications in various areas such as financial risk management and actuarial science. The aggregate loss is the sum of all random losses that occurred in a period, and it is governed by both the loss severity and the loss frequency. While the impact of the loss severity on aggregate loss is well studied, less attention has been paid to the influence of loss frequency on aggregate loss, which motivates our study. In this thesis, we enrich the aggregate loss framework by introducing the Poisson-Tweedie distribution as a candidate for modelling loss frequency, prove the closedness of Poisson-Tweedie …
Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu
Theses and Dissertations--Statistics
A common problem in regression analysis (linear or nonlinear) is assessing the lack-of-fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem to a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …
Statistical Intervals For Various Distributions Based On Different Inference Methods, Yixuan Zou
Theses and Dissertations--Statistics
Statistical intervals (e.g., confidence, prediction, or tolerance) are widely used to quantify uncertainty, but complex settings can create challenges to obtain such intervals that possess the desired properties. My thesis will address diverse data settings and approaches that are shown empirically to have good performance. We first introduce a focused treatment on using a single-layer bootstrap calibration to improve the coverage probabilities of two-sided parametric tolerance intervals for non-normal distributions. We then turn to zero-inflated data, which are commonly found in, among other areas, pharmaceutical and quality control applications. However, the inference problem often becomes difficult in the presence of …
Bayesian Kinetic Modeling For Tracer-Based Metabolomic Data, Xu Zhang
Theses and Dissertations--Statistics
Kinetic modeling of the time dependence of metabolite concentrations including the unstable isotope labeled species is an important approach to simulate metabolic pathway dynamics. It is also essential for quantitative metabolic flux analysis using tracer data. However, as the metabolic networks are complex, including extensive compartmentation and interconnections, the parameter estimation for enzymes that catalyze individual reactions needed for kinetic modeling is challenging. As the parameter space is large and multi-dimensional while kinetic data are comparatively sparse, the estimation procedure (especially the point estimation methods) often encounters multiple local maxima such that standard maximum likelihood methods may yield …
Modeling The Galactic Compact Binary Neutron Star Population And Studying The Double Pulsar System, Nihan Pol
Graduate Theses, Dissertations, and Problem Reports
Binary neutron star (BNS) systems consisting of at least one neutron star provide an avenue for testing a broad range of physical phenomena ranging from tests of General Relativity to probing magnetospheric physics to understanding the behavior of matter in the densest environments in the Universe. Ultra-compact BNS systems with orbital periods less than a few tens of minutes emit gravitational waves with frequencies ~mHz and are detectable by the planned space-based Laser Interferometer Space Antenna (LISA), while merging BNS systems produce a chirping gravitational wave signal that can be detected by the ground-based Laser Interferometer Gravitational-Wave Observatory (LIGO). Thus, BNS …
Projecting Regions Of North Atlantic Right Whale, Eubalaena Glacialis, Habitat Suitability In The Gulf Of Maine In 2050, Camille Ross
Honors Theses
North Atlantic right whales (Eubalaena glacialis) are endangered. Understanding the role environmental conditions play in habitat suitability is key to determining the regions in need of protection for conservation of the species, particularly as climate change shifts suitable habitat. This thesis uses three species distribution modeling algorithms, together with historical data on whale abundance (1993 to 2009) and environmental covariates, to build monthly ensemble models of past E. glacialis habitat suitability in the Gulf of Maine. Then, the models are projected onto the year 2050 for a range of climate scenarios. Specifically, the distribution of the species was modeled …
Sex And Age Differences In Prevalence And Risk Factors For Prediabetes In Mexican-Americans, Kristina Vatcheva, Belinda M. Reininger, Susan P. Fisher-Hoch, Joseph B. Mccormick
School of Mathematical and Statistical Sciences Faculty Publications and Presentations
AIMS:
Over 1/3 of Americans have prediabetes, while 9.4% have type 2 diabetes. The aim of our study was to estimate the prevalence of prediabetes in Mexican Americans, with known 28.2% prevalence of type 2 diabetes, by age and sex and to identify critical socio-demographic and clinical factors associated with prediabetes.
METHODS:
Data were collected between 2004 and 2017 from the Cameron County Hispanic Cohort in Texas. Weighted crude and sex- and age-stratified prevalences were calculated. Survey-weighted logistic regression analyses were conducted to identify risk factors for prediabetes.
RESULTS:
The prevalence of prediabetes (32%) was slightly higher than …
Enhancing Models And Measurements Of Traffic-Related Air Pollutants For Health Studies Using Dispersion Modeling And Bayesian Data Fusion, Stuart A. Batterman, Veronica J. Berrocal, Chad Milando, Owais Gilani, Saravanan Arunachalam, K. Max Zhang
Faculty Journal Articles
Research Report 202 describes a study led by Dr. Stuart Batterman at the University of Michigan, Ann Arbor and colleagues. The investigators evaluated the ability to predict traffic-related air pollution using a variety of methods and models, including a line source air pollution dispersion model and sophisticated spatiotemporal Bayesian data fusion methods. Exposure assessment for traffic-related air pollution is challenging because the pollutants are a complex mixture and vary greatly over space and time. Because extensive direct monitoring is difficult and expensive, a number of modeling approaches have been developed, but each model has its own limitations and errors.
Dr. …
How Machine Learning And Probability Concepts Can Improve Nba Player Evaluation, Harrison Miller
CMC Senior Theses
In this paper I break down a scholarly article, written by Sameer K. Deshpande and Shane T. Jensen, that proposed a new method to evaluate NBA players. The NBA, or National Basketball Association, is the highest-level professional basketball league in America. The authors proposed to build a model that estimates how NBA players impact their teams' chances of winning a game, using machine learning and probability concepts. I preface that by diving into these concepts and their mathematical backgrounds. These concepts include building a linear model using the ordinary least squares method, the bias …