Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 16 of 16

Full-Text Articles in Physical Sciences and Mathematics

Evaluating The Efficiency Of Markov Chain Monte Carlo Algorithms, Thuy Scanlon Jul 2021

Graduate Theses and Dissertations

Markov chain Monte Carlo (MCMC) is a simulation technique that produces a Markov chain designed to converge to a stationary distribution. In Bayesian statistics, MCMC is used to obtain samples from a posterior distribution for inference. To ensure the accuracy of estimates using MCMC samples, the convergence to the stationary distribution of an MCMC algorithm has to be checked. As computation time is a resource, optimizing the efficiency of an MCMC algorithm in terms of effective sample size (ESS) per time unit is an important goal for statisticians. In this paper, we use simulation studies to demonstrate how the Gibbs …
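As a rough illustration of the efficiency criterion named in the abstract above, the following sketch estimates effective sample size (ESS) from a chain's autocorrelations and divides by wall-clock time. It is not the thesis's procedure: the AR(1) generator `ar1_chain` is a hypothetical stand-in for a real sampler, and truncating the autocorrelation sum at the first non-positive lag is only one common convention among several.

```python
import time
import numpy as np


def effective_sample_size(chain):
    """Estimate ESS = n / (1 + 2 * sum of positive-lag autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    x = x - x.mean()
    # Autocovariance at all lags via zero-padded FFT.
    f = np.fft.rfft(x, n=2 * n)
    acov = np.fft.irfft(f * np.conjugate(f))[:n] / n
    rho = acov / acov[0]
    # Assumed truncation rule: stop at the first non-positive autocorrelation.
    tau = 1.0
    for k in range(1, n):
        if rho[k] <= 0:
            break
        tau += 2.0 * rho[k]
    return n / tau


def ar1_chain(n, phi, rng):
    """Hypothetical stand-in sampler: an AR(1) chain with N(0, 1) stationary law."""
    x = np.empty(n)
    x[0] = rng.normal()
    scale = np.sqrt(1.0 - phi ** 2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + scale * rng.normal()
    return x


def ess_per_second(sampler, *args):
    """Run a sampler once and report ESS divided by elapsed wall-clock time."""
    start = time.perf_counter()
    chain = sampler(*args)
    elapsed = time.perf_counter() - start
    return effective_sample_size(chain) / elapsed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("ESS per second:", ess_per_second(ar1_chain, 20000, 0.9, rng))
```

A strongly autocorrelated chain yields an ESS far below its nominal length, which is why two samplers are better compared on ESS per unit time than on raw iterations per unit time.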


Bayesian Kinetic Modeling For Tracer-Based Metabolomic Data, Xu Zhang Jan 2020

Theses and Dissertations--Statistics

Kinetic modeling of the time dependence of metabolite concentrations including the unstable isotope labeled species is an important approach to simulate metabolic pathway dynamics. It is also essential for quantitative metabolic flux analysis using tracer data. However, as the metabolic networks are complex, including extensive compartmentation and interconnections, the parameter estimation for enzymes that catalyze individual reactions needed for kinetic modeling is challenging. As the parameter space is large and multi-dimensional while kinetic data are comparatively sparse, the estimation procedure (especially the point estimation methods) often encounters multiple local maxima such that standard maximum likelihood methods may yield …


Inflated Standard Errors Of MCMC Estimates In IRT, Dongho Shin Apr 2019

Theses and Dissertations

Two widely used algorithms for estimating item response theory (IRT) parameters are Markov chain Monte Carlo (MCMC) and the EM algorithm. In general, the MCMC algorithm has advantages over the EM algorithm; for example, the MCMC algorithm allows one to estimate the desired posterior distribution and also works more straightforwardly with complex IRT models. This ease of use allows one to implement the MCMC algorithm without careful consideration. Previous studies, Hendrix (2011) and Lee (2016), noted that the estimated standard errors from the MCMC algorithm are larger than those from the EM algorithm. Therefore, this study investigates the reason …


Unsupervised Learning In Phylogenomic Analysis Over The Space Of Phylogenetic Trees, Qiwen Kang Jan 2019

Theses and Dissertations--Statistics

A phylogenetic tree is a tree that represents the evolutionary history among species or other entities. Phylogenomics is a new field at the intersection of phylogenetics and genomics, and it is well known that statistical learning methods are needed to handle and analyze the large amounts of data that new technologies can generate relatively cheaply. Based on the existing Markov models, we introduce a new method, CURatio, to identify outliers in a given gene data set. This method, intrinsically an unsupervised method, can find outliers among thousands of genes or more. This ability to analyze large numbers of genes (even with missing …


Hierarchical Bayesian Regression With Application In Spatial Modeling And Outlier Detection, Ghadeer Mahdi May 2018

Graduate Theses and Dissertations

This dissertation makes two important contributions to the development of Bayesian hierarchical models. The first contribution is focused on spatial modeling. Spatial data observed on a group of areal units is common in scientific applications. The usual hierarchical approach for modeling this kind of dataset is to introduce a spatial random effect with an autoregressive prior. However, the usual Markov chain Monte Carlo scheme for this hierarchical framework requires the spatial effects to be sampled from their full conditional posteriors one by one, resulting in poor mixing. More importantly, it makes the model computationally inefficient for datasets with a large number of units. …


Peptide Identification: Refining A Bayesian Stochastic Model, Theophilus Barnabas Kobina Acquah May 2017

Electronic Theses and Dissertations

Notwithstanding the challenges associated with different methods of peptide identification, other methods have been explored over the years. The complexity, size and computational challenges of peptide-based data sets call for further work in this area. By relying on the prior information about the average relative abundances of bond cleavages and the prior probability of any specific amino acid sequence, we refine an already developed Bayesian approach to identifying peptides. The likelihood function is improved by adding further ions to the model, and its size is driven by two overall goodness of fit measures. In the face of the complexities associated …


Robustness Of The Within- And Between-Series Estimators To Non-Normal Multiple-Baseline Studies: A Monte Carlo Study, Seang-Hwane Joo Apr 2017

USF Tampa Graduate Theses and Dissertations

In single-case research, multiple-baseline (MB) design is the most widely used design in practical settings. It provides the opportunity to estimate the treatment effect based on not only within-series comparisons of treatment phase to baseline phase observations, but also time-specific between-series comparisons of observations from those that have started treatment to those that are still in the baseline. In MB studies, the average treatment effect and the variation of these effects across multiple participants can be estimated using various statistical modeling methods. Recently, two types of statistical modeling methods were proposed for analyzing MB studies: a) within-series model and b) …


The Nonparametric Estimation Of Elliptical Distributions, Panfeng Liang Jan 2017

Open Access Theses & Dissertations

In practice, many multivariate datasets have identical marginal distributions. Elliptical distributions can be used to model many of those datasets. In this thesis, we propose a Bayesian method based on Markov chain Monte Carlo (MCMC) to estimate the density function underlying a multivariate dataset, assuming that the density is elliptical.


Bayesian Inference Of The Weibull-Pareto Distribution, James Dow Jan 2015

Electronic Theses and Dissertations

The Weibull distribution has applications in many fields, including survival analysis, reliability engineering, general insurance, electrical engineering, and industrial engineering. The Weibull distribution was further extended by the Weibull-Pareto distribution. A desirable property of this distribution is that its shape can be skewed, which allows it to better model left- or right-skewed data. Examples of skewed data include human longevity and actuarial data. In this work, a hierarchical Bayesian model was developed using the Weibull-Pareto distribution.


A Latent Mixture Approach To Modeling Zero-Inflated Bivariate Ordinal Data, Rajendra Kadel Jan 2013

USF Tampa Graduate Theses and Dissertations

Multivariate ordinal response data, such as severity of pain, degree of disability, and satisfaction with a healthcare provider, are prevalent in many areas of research including public health, biomedical, and social science research. Ignoring the multivariate features of the response variables, that is, by not taking the correlation between the errors across models into account, may lead to substantially biased estimates and inference. In addition, such multivariate ordinal outcomes frequently exhibit a high percentage of zeros (zero inflation) at the lower end of the ordinal scales, as compared to what is expected under a multivariate ordinal distribution. Thus, zero inflation …


Hitters Vs. Pitchers: A Comparison Of Fantasy Baseball Player Performances Using Hierarchical Bayesian Models, Scott D. Huddleston Apr 2012

Theses and Dissertations

In recent years, fantasy baseball has seen an explosion in popularity. Major League Baseball, with its long, storied history and the enormous quantity of data available, naturally lends itself to the modern-day recreational activity known as fantasy baseball. Fantasy baseball is a game in which participants manage an imaginary roster of real players and compete against one another using those players' real-life statistics to score points. Early forms of fantasy baseball began in the early 1960s, but beginning in the 1990s, the sport was revolutionized due to the advent of powerful computers and the Internet. The data used in this …


Predicting Maximal Oxygen Consumption (VO2max) Levels In Adolescents, Brent A. Shepherd Mar 2012

Theses and Dissertations

Maximal oxygen consumption (VO2max) is considered by many to be the best overall measure of an individual's cardiovascular health. Collecting the measurement, however, requires subjecting an individual to prolonged periods of intense exercise until their maximal level, the point at which their body uses no additional oxygen from the air despite increased exercise intensity, is reached. Collecting accurate VO2max data also requires expensive equipment and causes great subject discomfort. Because of this inherent difficulty, the measurement is often avoided despite its usefulness. In this research, we propose a set of Bayesian hierarchical models to predict VO2max levels in adolescents, …


Statistical Estimation Of Physiologically-Based Pharmacokinetic Models: Identifiability, Variation, And Uncertainty With An Illustration Of Chronic Exposure To Dioxin And Dioxin-Like-Compounds., Zachary John Thompson Jan 2012

USF Tampa Graduate Theses and Dissertations

Assessment of human exposure to environmental chemicals is inherently subject to uncertainty and variability. There are data gaps concerning the inventory, source, duration, and intensity of exposure as well as knowledge gaps regarding pharmacokinetics in general. These gaps result in uncertainties in exposure assessment. The uncertainties compound further with variabilities due to population variations regarding stage of life, lifestyle, susceptibility, etc. Use of physiologically-based pharmacokinetic (PBPK) models promises to reduce the uncertainties and enhance extrapolation between species, between routes, from high to low dose, and from acute to chronic exposure. However, fitting PBPK models is challenging because of …


Modeling Endogenous Treatment Effects With Heterogeneity: A Bayesian Nonparametric Approach, Xuequn Hu Jan 2011

USF Tampa Graduate Theses and Dissertations

This dissertation explores the estimation of endogenous treatment effects in the presence of heterogeneous responses. A Bayesian nonparametric approach is taken to model the heterogeneity in treatment effects. Specifically, I adopt the Dirichlet Process Mixture (DPM) model to capture the heterogeneity and show that the DPM often outperforms the Finite Mixture Model (FMM) by providing more flexible functional forms and thus better model fit. Rather than fixing the number of components in a mixture model, the DPM allows the data and prior knowledge to determine the number of components, thus providing an automatic mechanism for model selection.

Two DPM models …


Bayesian And Frequentist Approaches For The Analysis Of Multiple Endpoints Data Resulting From Exposure To Multiple Health Stressors., Epiphanie Nyirabahizi Mar 2010

Theses and Dissertations

In risk analysis, benchmark dose (BMD) methodology is used to quantify the risk associated with exposure to stressors such as environmental chemicals. It consists of fitting a mathematical model to the exposure data; the BMD is the dose expected to result in a pre-specified response, or benchmark response (BMR). Most available exposure data are from single chemical exposure, but living objects are exposed to multiple sources of hazards. Furthermore, in some studies, researchers may observe multiple endpoints on one subject. Statistical approaches to address the multiple endpoints problem can be partitioned into a dimension reduction group and a dimension preservative group. …
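To make the BMD definition in the abstract above concrete, here is a minimal sketch that inverts an already fitted dose-response curve to find the dose producing a chosen BMR. Everything specific is an assumption for illustration: the logistic form, the parameter values, the function names, and the use of extra risk, (P(d) - P(0)) / (1 - P(0)), as the response measure. The abstract does not state which model or risk definition the thesis uses.

```python
import math


def logistic_response(dose, intercept, slope):
    """Assumed dose-response model: probability of an adverse response."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * dose)))


def extra_risk(dose, intercept, slope):
    """Extra risk over background: (P(d) - P(0)) / (1 - P(0))."""
    p0 = logistic_response(0.0, intercept, slope)
    return (logistic_response(dose, intercept, slope) - p0) / (1.0 - p0)


def benchmark_dose(bmr, intercept, slope, upper=1e3, tol=1e-10):
    """Find the dose whose extra risk equals bmr by bisection on [0, upper]."""
    if not 0.0 < bmr < 1.0:
        raise ValueError("bmr must lie strictly between 0 and 1")
    if slope <= 0.0:
        raise ValueError("this sketch assumes an increasing dose-response curve")
    lo, hi = 0.0, upper
    if extra_risk(hi, intercept, slope) < bmr:
        raise ValueError("bmr is not reached below the upper dose bound")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if extra_risk(mid, intercept, slope) < bmr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical fitted parameters, chosen only for illustration.
    print(benchmark_dose(bmr=0.10, intercept=-3.0, slope=0.5))
```

Bisection is used here because extra risk is monotone in dose under the assumed model; a closed-form inverse exists for the logistic curve, but a numerical root finder carries over to other model families.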


Modeling Transition Probabilities For Loan States Using A Bayesian Hierarchical Model, Rebecca Lee Monson Nov 2007

Theses and Dissertations

A Markov chain model can be used to model loan defaults because loans move through delinquency states as the borrower fails to make monthly payments. Each entry of the transition matrix is the probability that a borrower in a given state one month moves to a particular delinquency state the next month. In order to use this model, it is necessary to know the transition probabilities, which are unknown quantities. A Bayesian hierarchical model is postulated because there may not be sufficient data for some rare transition probabilities. Using a hierarchical model, similarities between types or families of loans can …
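The role of the transition matrix, and the difficulty with sparsely observed transitions, can be sketched as follows. This is a simplified stand-in rather than the thesis's hierarchical model: each row gets an independent symmetric Dirichlet prior, so the posterior mean smooths rare transitions but does not share information across loan types. The state labels, counts, and `prior_strength` value are all hypothetical.

```python
import numpy as np

# Hypothetical delinquency states, for illustration only.
STATES = ["current", "30_days", "60_days", "default"]


def posterior_mean_transition_matrix(counts, prior_strength=1.0):
    """Posterior mean of each row under a symmetric Dirichlet prior.

    counts[i][j] is the number of observed moves from state i to state j.
    """
    counts = np.asarray(counts, dtype=float)
    posterior = counts + prior_strength
    return posterior / posterior.sum(axis=1, keepdims=True)


def state_distribution_after(start, transition_matrix, months):
    """Propagate a starting state distribution forward by a number of months."""
    start = np.asarray(start, dtype=float)
    return start @ np.linalg.matrix_power(transition_matrix, months)


if __name__ == "__main__":
    # Hypothetical monthly transition counts; the 60_days row has no data.
    counts = [
        [950, 40, 0, 10],
        [60, 25, 12, 3],
        [0, 0, 0, 0],
        [0, 0, 0, 20],
    ]
    P = posterior_mean_transition_matrix(counts)
    print(state_distribution_after([1.0, 0.0, 0.0, 0.0], P, months=6))
```

With no observations, the `60_days` row falls back to the uniform prior mean, which illustrates why borrowing strength across similar loans, as a hierarchical model does, is attractive when some transitions are rarely seen.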