Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 109

Full-Text Articles in Physical Sciences and Mathematics

Model-Based Imputation Of Below Detection Limit Missing Data And Group Selection In Bayesian Group Index Regression, Matthew Carli Jan 2023

Theses and Dissertations

Investigations into the association between chemical exposure and health outcomes are increasingly focused on the role of chemical mixtures, as opposed to individual chemicals. The analysis of chemical mixture data required the development of novel statistical methods, one of these being Bayesian group index regression. A statistical challenge common to all chemical mixture analyses is the ubiquitous presence of below detection limit (BDL) data. We propose an extension of Bayesian group index regression that treats both regression effects and missing BDL observations as parameters in a model estimated through a Markov Chain Monte Carlo algorithm that we refer to as …


Variability In Causal Effects On A Binary Outcome And Noncompliance In A Multisite Randomized Trial, Xinxin Sun Jan 2023

Theses and Dissertations

Noncompliance with treatment assignment is widespread in randomized trials and presents challenges in causal inference. In the presence of noncompliance, the most commonly estimated effect of treatment assignment, also known as the intent-to-treat (ITT) effect, is biased. Of interest in this setting is the complier average causal effect (CACE), the ITT effect among compliers. Further complication arises when the outcome variable is partially observed.

My research focuses on estimating the distribution of a site-specific CACE in a multisite randomized controlled trial (MRCT) by maximum likelihood (ML). Assuming compliance is missing at random (MAR), we express the likelihood as an integral with respect …


Estimating Weighted Panel Sizes For Primary Care Providers: An Assessment Of Clustering And Novel Methods Of Panel Size Estimation On Electronic Medical Records, Martin A. Lavallee Jan 2022

Theses and Dissertations

Primary care is on the front lines of healthcare and thus sees the most diverse set of patients. To achieve high-functioning primary care, a practice must establish empanelment, the pairing of patients to providers. Enumeration of empanelment, or estimating panel sizes, helps ensure that the demands of the patients match the supply of providers and optimize the balance of primary care resources to improve quality of care. Further, we can adjust panel sizes by using patient-level data on healthcare utilization and complexity extracted from the electronic medical record to determine the amount of care or burden of work …


Estimating The Statistics Of Operational Loss Through The Analyzation Of A Time Series, Maurice L. Brown Jan 2022

Theses and Dissertations

In the world of finance, appropriately understanding risk is key to success or failure because it is a fundamental driver for institutional behavior. Here we focus on risk as it relates to the operations of financial institutions, namely operational risk. Quantifying operational risk begins with data in the form of a time series of realized losses, which can occur for a number of reasons, can vary over different time intervals, and can pose a challenge that is exacerbated by having to account for both frequency and severity of losses. We introduce a stochastic point process model for the frequency distribution …


Approximating Bayesian Optimal Sequential Designs Using Gaussian Process Models Indexed On Belief States, Joseph Burris Jan 2022

Theses and Dissertations

Fully sequential optimal Bayesian experimentation can offer greater utility than both traditional Bayesian designs and greedy sequential methods, but practically cannot be solved due to numerical complexity and continuous outcome spaces. Approximate solutions can be found via approximate dynamic programming, but rely on surrogate models of the expected utility at each trial of the experiment with hand-chosen features or use methods which ignore the underlying geometry of the space of probability distributions. We propose the use of Gaussian process models indexed on the belief states visited in experimentation to provide utility-agnostic surrogate models for approximating Bayesian optimal sequential designs which …


Role Of Inhibition And Spiking Variability In Ortho- And Retronasal Olfactory Processing, Michelle F. Craft Jan 2022

Theses and Dissertations

Odor perception is the impetus for important animal behaviors, most pertinently for feeding, but also for mating and communication. There are two predominant modes of odor processing: odors pass through the front of the nose (ortho) while inhaling and sniffing, or through the rear (retro) during exhalation and while eating and drinking. Despite the importance of olfaction for an animal’s well-being, and specifically the fact that both ortho and retro occur naturally, it is unknown whether the modality (ortho versus retro) is transmitted to cortical brain regions, which could significantly instruct how odors are processed. Prior imaging studies show different …


Improving College Students’ Views And Beliefs Relative To Mathematics: A Systematic Literature Review Followed By A Multiple Case Mixed Methods Exploration Of The Experiences That Underpin Community College Students’ Attitudes, Self-Efficacy, And Values In Mathematics, Marquita H. Sea Jan 2022

Theses and Dissertations

Mathematics is particularly important due to its relevance in our daily lives. It is a general requirement throughout schooling. Unfortunately, many students openly declare negative views/beliefs regarding math in their personal and academic lives. These, in turn, negatively influence students’ achievement-related behaviors and outcomes. First, a systematic literature review was conducted to determine what types of studies/initiatives have aimed to enhance students’ views/beliefs relative to mathematics, including domain-general and domain-specific perceptions of math as well as their judgements of who is successful in mathematics and whether they themselves can be successful. Specifically, the review centered on the components …


Parametric, Nonparametric, And Semiparametric Linear Regression In Classical And Bayesian Statistical Quality Control, Chelsea L. Jones Jan 2021

Theses and Dissertations

Statistical process control (SPC) is used in many fields to understand and monitor desired processes, such as manufacturing, public health, and network traffic. SPC is categorized into two phases; in Phase I historical data is used to inform parameter estimates for a statistical model and Phase II implements this statistical model to monitor a live ongoing process. Within both phases, profile monitoring is a method to understand the functional relationship between response and explanatory variables by estimating and tracking its parameters. In profile monitoring, control charts are often used as graphical tools to visually observe process behaviors. We construct a …


Methods For Developing A Machine Learning Framework For Precise 3d Domain Boundary Prediction At Base-Level Resolution, Spiro C. Stilianoudakis Jan 2021

Theses and Dissertations

High-throughput chromosome conformation capture technology (Hi-C) has revealed extensive DNA looping and folding into discrete 3D domains. These include Topologically Associating Domains (TADs) and chromatin loops, the 3D domains critical for cellular processes like gene regulation and cell differentiation. The relatively low resolution of Hi-C data (regions of several kilobases in size) prevents precise mapping of domain boundaries by conventional TAD/loop-callers. However, high-resolution genomic annotations associated with boundaries, such as CTCF and members of the cohesin complex, suggest a computational approach for precise location of domain boundaries.

We developed preciseTAD, an optimized machine learning framework that leverages a random …


Statistical Approaches For Estimation And Comparison Of Brain Functional Connectivity, Jifang Zhao Jan 2021

Theses and Dissertations

Drug addiction can lead to many health-related problems and social concerns. Functional connectivity obtained from functional magnetic resonance imaging (fMRI) data promotes a variety of fundamental understandings in such association. Due to its complex correlation structure and large dimensionality, the modeling and analysis of the functional connectivity from neuroimage are challenging. By proposing a spatio-temporal model for multi-subject neuroimage data, we incorporate voxel-level spatio-temporal dependencies of whole-brain measurements to improve the accuracy of statistical inference. To tackle large-scale spatio-temporal neuroimage data, we develop a computationally efficient algorithm to estimate the parameters. Our method is used to identify functional connectivity and …


Topics In Design And Analysis Of Experiments: Calibration, Sequential Experimentation, And Model Selection, Christine Miller Jan 2021

Theses and Dissertations

Experiments are widely used across multiple disciplines to uncover information about a system or process. Experimental design is a statistical technique devoted to the methodology of selecting the appropriate samples to aid in the subsequent analysis. We research three open problems in experimental design regarding calibration, sequential experimentation, and model selection. First, we focus on calibration; the impact of experimental design choice on the performance of statistical calibration is largely unknown. We investigate the performance of several experimental designs with regard to inverse prediction via a comprehensive simulation study. Specifically, we compare several design types including traditional response surface designs, …


Bayesian Techniques For Relating Genetic Polymorphisms To Diffusion Tensor Images Of Cocaine Users, Tmader Alballa Jan 2021

Theses and Dissertations

Past investigations utilizing Diffusion Tensor Imaging (DTI) have demonstrated that cocaine use disorder (CUD) yields white matter changes. We proposed three Bayesian techniques in order to explore the relationship between Fractional Anisotropy (FA), genetic data, and years of cocaine use (YCU). CUD participants exhibit abnormality in different areas of the brain versus non-drug using controls, which is measured by DTI. This dissertation is motivated by a neuroimaging genetic study in cocaine dependence, which found that there were relationships between several genes such as GAD and 5-HT2R and CUD subjects.

In the first chapter, there is background on the …


Bayesian Experimental Design For Bayesian Hierarchical Models With Differential Equations For Ecological Applications, Rebecca Atanga Jan 2021

Theses and Dissertations

Ecologists are interested in the composition of species in various ecosystems. Studying population dynamics can assist environmental managers in making better decisions for the environment. Traditionally, the sampling of species has been recorded on a regular time frequency. However, sampling can be an expensive process due to financial and physical constraints. In some cases the environments are threatening, and ecologists prefer to limit their time collecting data in the field. Rather than convenience sampling, a statistical approach is introduced to improve data collection methods for ecologists by studying the dynamics associated with populations of interest. Population models including the logistic …


The Effect Of Time And Temperature On The Quality Of Latent Fingerprints On Incandescent Lightbulbs, Varying Donors Age And Sex, Kinaysha M. Collazo Maldonado Jan 2020

Theses and Dissertations

Fingerprints are used as a means of identification, but there are no established methodologies to determine time since deposition of latent fingerprints by visual means alone. This research considered the influence of age and sex on the quality of recovered latent prints from lit and unlit lightbulbs from 1 to 10 days, using accumulated degree hours (ADH) to account for both heat and time simultaneously. Two male and two female donors (one of each aged <40 and >40 years) were used. A thermal imaging camera was used to monitor the lightbulbs’ top and middle regions, which were significantly different (p≤0.05) for the …


The Analysis Of Neural Heterogeneity Through Mathematical And Statistical Methods, Kyle Wendling Jan 2020

Theses and Dissertations

Diversity of intrinsic neural attributes and network connections is known to exist in many areas of the brain and is thought to significantly affect neural coding. Recent theoretical and experimental work has argued that in uncoupled networks, coding is most accurate at intermediate levels of heterogeneity. I explore this phenomenon through two distinct approaches: a theoretical mathematical modeling approach and a data-driven statistical modeling approach.

Through the mathematical approach, I examine firing rate heterogeneity in a feedforward network of stochastic neural oscillators utilizing a high-dimensional model. The firing rate heterogeneity stems from two sources: intrinsic (different individual cells) and network …


Zero-Inflated Longitudinal Mixture Model For Stochastic Radiographic Lung Compositional Change Following Radiotherapy Of Lung Cancer, Viviana A. Rodríguez Romero Jan 2020

Theses and Dissertations

Compositional data (CD) is mostly analyzed as relative data, using ratios of components and log-ratio transformations so that known multivariable statistical methods can be applied. Therefore, CD in which some components equal zero represents a problem. Furthermore, when the data is measured longitudinally, observations are spatially related, and the data appear to come from a mixture population, the analysis becomes highly complex. To address this, a two-part model was proposed to deal with structural zeros in longitudinal CD using a mixed-effects model. Furthermore, the model has been extended to the case where the non-zero components of the vector might a two component mixture …


Applications Of Dynamic Linear Models To Random Allocation Models, Albert H. Lee Iii Jan 2020

Theses and Dissertations

Although advances in modern computational algorithms have provided researchers the ability to work problems which were once too computationally complex to solve, problems with high computation or large parameter spaces still remain. Problems involving Time Series can be of this kind. Chapter 1 looks at the use of Exponentially Weighted Moving Averages (EWMA) developed by Holt (2004) and Winters (1960), which were thought to provide sufficient solutions to these Time Series. A discussion is provided which illustrates the shortcomings of the EWMA and how its infinite number of possible starting values provides the modeler with an endless number of possible solutions …


Utilizing Design Structure For Improving Design Selection And Analysis, Ahlam Ali Alzharani Jan 2020

Theses and Dissertations

Recent work has shown that the structure for design plays a role in the simplicity or complexity of data analysis. To increase the knowledge of research in these areas, this dissertation aims to utilize design structure for improving design selection and analysis. In this regard, minimal dependent sets and block diagonal structure are both important concepts that are relevant to the orthogonality of the columns of a design. We are interested in finding ways to improve the data analysis especially for active effect detection by utilizing minimal dependent sets and block diagonal structure for design.

We introduce a new classification …


Phenotype Extraction: Estimation And Biometrical Genetic Analysis Of Individual Dynamics, Kevin L. Mckee Jan 2020

Theses and Dissertations

Within-person data can exhibit a virtually limitless variety of statistical patterns, but it can be difficult to distinguish meaningful features from statistical artifacts. Studies of complex traits have previously used genetic signals like twin-based heritability to distinguish between the two. This dissertation is a collection of studies applying state-space modeling to conceptualize and estimate novel phenotypic constructs for use in psychiatric research and further biometrical genetic analysis. The aims are to: (1) relate control theoretic concepts to health-related phenotypes; (2) design statistical models that formally define those phenotypes; (3) estimate individual phenotypic values from time series data; (4) consider hierarchical …


Site- And Location-Adjusted Approaches To Adaptive Allocation Clinical Trial Designs, Brian S. Di Pace Jan 2019

Theses and Dissertations

Response-Adaptive (RA) designs are used to adaptively allocate patients in clinical trials. These methods have been generalized to include Covariate-Adjusted Response-Adaptive (CARA) designs, which adjust treatment assignments for a set of covariates while maintaining features of the RA designs. Challenges may arise in multi-center trials if differential treatment responses and/or effects among sites exist. We propose Site-Adjusted Response-Adaptive (SARA) approaches to account for inter-center variability in treatment response and/or effectiveness, including either a fixed site effect or both random site and treatment-by-site interaction effects to calculate conditional probabilities. These success probabilities are used to update assignment probabilities for allocating patients …


Assessing The Impact Of Incorporating Residential Histories Into The Spatial Analysis Of Cancer Risk, Anny-Claude Joseph Jan 2019

Theses and Dissertations

In many spatial epidemiologic studies, investigators use residential location at diagnosis as a surrogate for unknown environmental exposures or as a geographic basis for assigning measured exposures. Inherently, they make assumptions about the timing and location of pertinent exposures which may prove problematic when studying long latency diseases such as cancer.

In this work we explored how the association between environmental exposures and disease risk for long-latency health outcomes like cancer is affected by residential mobility. We used simulation studies conditioned on real data to evaluate the extent to which the commonly held assumption of no residential mobility 1) affected …


Spectral Methods For The Detection And Characterization Of Topologically Associated Domains, Kellen Garrison Cresswell Jan 2019

Theses and Dissertations

The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops which is relatively stable across cell-lines and even across species. These TADs dynamically reorganize during development of disease, and exhibit cell- and condition-specific differences. Identifying such hierarchical structures and how they change between conditions is a critical step in understanding genome regulation and disease development. Despite their importance, there are relatively few tools for identification of TADs and even fewer for …


Statistical Designs For Network A/B Testing, Victoria V. Pokhilko Jan 2019

Theses and Dissertations

A/B testing refers to the statistical procedure of experimental design and analysis to compare two treatments, A and B, applied to different testing subjects. It is widely used by technology companies such as Facebook, LinkedIn, and Netflix, to compare different algorithms, web-designs, and other online products and services. The subjects participating in these online A/B testing experiments are users who are connected in different scales of social networks. Two connected subjects are similar in terms of their social behaviors, education and financial background, and other demographic aspects. Hence, it is only natural to assume that their reactions to online products …


Methods For Evaluating Dropout Attrition In Survey Data, Camille J. Hochheimer Jan 2019

Theses and Dissertations

As researchers increasingly use web-based surveys, the ease of dropping out in the online setting is a growing issue in ensuring data quality. One theory is that dropout or attrition occurs in phases that can be generalized to phases of high dropout and phases of stable use. In order to detect these phases, several methods are explored. First, existing methods and user-specified thresholds are applied to survey data, where a significant change in the dropout rate between two questions is interpreted as the start or end of a high dropout phase. Next, survey dropout is considered as a time-to-event outcome and …


Methods For Joint Normalization And Comparison Of Hi-C Data, John C. Stansfield Jan 2019

Theses and Dissertations

The development of chromatin conformation capture technology has opened new avenues of study into the 3D structure and function of the genome. Chromatin structure is known to influence gene regulation, and differences in structure are now emerging as a mechanism of regulation between, e.g., cell differentiation and disease vs. normal states. Hi-C sequencing technology now provides a way to study the 3D interactions of the chromatin over the whole genome. However, like all sequencing technologies, Hi-C suffers from several forms of bias stemming from both the technology and the DNA sequence itself. Several normalization methods have been developed for normalizing …


Genome-Wide Systems Genetics Of Alcohol Consumption And Dependence, Kristin Mignogna Jan 2019

Theses and Dissertations

Widely effective treatment for alcohol use disorder is not yet available, because the exact biological mechanisms that underlie this disorder are not completely understood. One way to gain a better understanding of these mechanisms is to examine the genetic frameworks that contribute to the risk for developing this disorder. This dissertation examines genetic association data in combination with gene expression networks in the brain to identify functional groups of genes associated with alcohol consumption and dependence.

The first study took advantage of the behavioral complexity of human samples, and experimental capabilities provided by mouse models, by co-analyzing gene expression networks …


Bayesian Nonparametric Analysis Of Longitudinal Data With Non-Ignorable Non-Monotone Missingness, Yu Cao Jan 2019

Theses and Dissertations

In longitudinal studies, outcomes are measured repeatedly over time, but in reality clinical studies are full of missing data points of a monotone and non-monotone nature. Often this missingness is related to the unobserved data, so that it is non-ignorable. In such contexts, the pattern-mixture model (PMM) is one popular tool to analyze the joint distribution of outcome and missingness patterns. The unobserved outcomes are then imputed using the distribution of observed outcomes, conditioned on missing patterns. However, the existing methods suffer from model identification issues if data is sparse in specific missing patterns, which is very likely to happen with a …


Estimating The Respiratory Lung Motion Model Using Tensor Decomposition On Displacement Vector Field, Kingston Kang Jan 2018

Theses and Dissertations

Modern big data often emerge as tensors. Standard statistical methods are inadequate to deal with datasets of large volume, high dimensionality, and complex structure. Therefore, it is important to develop algorithms such as low-rank tensor decomposition for data compression, dimensionality reduction, and approximation.

With the advancement in technology, high-dimensional images are becoming ubiquitous in the medical field. In lung radiation therapy, the respiratory motion of the lung introduces variabilities during treatment as the tumor inside the lung is moving, which brings challenges to the precise delivery of radiation to the tumor. Several approaches to quantifying this uncertainty propose using a …


Examining The Confirmatory Tetrad Analysis (Cta) As A Solution Of The Inadequacy Of Traditional Structural Equation Modeling (Sem) Fit Indices, Hangcheng Liu Jan 2018

Theses and Dissertations

Structural Equation Modeling (SEM) is a framework of statistical methods that allows us to represent complex relationships between variables. SEM is widely used in economics, genetics, and the behavioral sciences (e.g., psychology, psychobiology, sociology, and medicine). Model complexity is defined as a model’s ability to fit different data patterns, and it plays an important role in model selection when applying SEM. As in linear regression, the number of free model parameters is typically used in traditional SEM model fit indices as a measure of the model complexity. However, using only the number of free model parameters to indicate SEM model complexity …


Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry Jan 2018

Theses and Dissertations

The Brisbane Longitudinal Twin Study (BLTS) was being conducted in Australia and was funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as a part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed for the purpose of analyzing a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects). The primary goal was to determine what combination of SNPs and …