Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

Articles 1 - 30 of 714

Full-Text Articles in Applied Statistics

A Causal Inference Approach For Spike Train Interactions, Zach Saccomano Feb 2024

Dissertations, Theses, and Capstone Projects

Since the 1960s, neuroscientists have worked on the problem of estimating synaptic properties, such as connectivity and strength, from simultaneously recorded spike trains. Recent years have seen renewed interest in the problem coinciding with rapid advances in experimental technologies, including an approximate exponential increase in the number of neurons that can be recorded in parallel and perturbation techniques such as optogenetics that can be used to calibrate and validate causal hypotheses about functional connectivity. This thesis presents a mathematical examination of synaptic inference from two perspectives: (1) using in vivo data and biophysical models, we ask in what cases the …


Statistical Consulting In Academia: A Review, Ke Xiao Jan 2024

Major Papers

This paper reviews the state of statistical consulting in academia by performing a literature review on this topic in Chapters 1 and 2. Chapter 1 gives an overview of general aspects of statistical consulting and the types of centers that conduct such services in academia. In Chapter 2 we summarise the literature about the common logistics and processes for conducting statistical consulting in academia. In Chapters 3 and 4, we analyze data on statistical consulting centers for the largest 100 universities in the USA. We also review the literature on the future of statistical consulting in academia in the era of big data and …


Multiscale Modelling Of Brain Networks And The Analysis Of Dynamic Processes In Neurodegenerative Disorders, Hina Shaheen Jan 2024

Theses and Dissertations (Comprehensive)

The complex nature of the human brain, with its intricate organic structure and multiscale spatio-temporal characteristics ranging from synapses to the entire brain, presents a major obstacle in brain modelling. Capturing this complexity poses a significant challenge for researchers. The complex interplay of coupled multiphysics and biochemical activities within this intricate system shapes the brain's capacity, functioning within a structure-function relationship that necessitates a specific mathematical framework. Advanced mathematical modelling approaches that incorporate the coupling of brain networks and the analysis of dynamic processes are essential for advancing therapeutic strategies aimed at treating neurodegenerative diseases (NDDs), which afflict millions of …


Applications Of Independent And Identically Distributed (IID) Random Processes In Polarimetry And Climatology, Dan Kestner Jan 2024

Dissertations, Master's Theses and Master's Reports

The unifying theme of this thesis is the characterization of “perfect randomness,” i.e., independent and identically distributed (IID) stochastic processes as these are applied in physical science. Two specific and mathematically distinct applications are chosen: (i) radar and optical polarimetry; (ii) analysis of time series in meteorology. In (i), an IID process of a special kind, namely, one with a distribution defined by symmetry, is used to link its multivariate Gaussian density to uniformity on the Poincaré sphere. This “statistical ellipsometry” approach is then used to relate polarimetric mismatches or imbalances to ellipsometric variables and suitably chosen cross-correlation measures. In (ii), recently …


Exploration And Statistical Modeling Of Profit, Caleb Gibson Dec 2023

Undergraduate Honors Theses

For any company involved in sales, maximization of profit is the driving force that guides all decision-making. Many factors can influence how profitable a company can be, including external factors like changes in inflation or consumer demand and internal factors like pricing and product cost. By understanding specific trends in its own internal data, a company can readily identify problem areas or potential growth opportunities to help increase profitability.

In this discussion, we use an extensive data set to examine how a company might analyze their own data to identify potential changes the company might investigate to drive better performance. Based …


Analyses Of Effect Indices Across Single-Case Research Designs In Counseling, Cian L. Brown Dec 2023

Graduate Theses and Dissertations

Single case research design (SCRD) is a common methodology used across clinical disciplines to determine treatment effectiveness by comparing treatment conditions to baseline conditions in individual cases, usually among researchers working with smaller samples. Although SCRD is popular within behavioral disciplines such as special education and behavioral analysis, studies using it have only begun to emerge in counseling. However, guidance on and current understanding of the use of SCRD in counseling are limited. A content analysis of counseling journals from 2003 to 2014 yielded only 7 studies using SCRD. In 2015, the flagship counseling journal, Journal of Counseling and Development, published a special issue on the …


Foundations Of Memory Capacity In Models Of Neural Cognition, Chandradeep Chowdhury Dec 2023

Master's Theses

A central problem in neuroscience is to understand how memories are formed as a result of the activities of neurons. Valiant’s neuroidal model attempted to address this question by modeling the brain as a random graph and memories as subgraphs within that graph. However, the question of memory capacity within that model has not been explored: how many memories can the brain hold? Valiant introduced the concept of interference between memories as the defining factor for capacity; excessive interference signals that the model has reached capacity. Since then, exploration of capacity has been limited, but recent investigations have delved into the …


Bayesian Learning Of Spatiotemporal Source Distribution For Beached Microplastic In The Gulf Of Mexico, David Pojunas Dec 2023

Graduate Theses and Dissertations

Over the last several decades, plastic waste has gradually accumulated while slowly degrading in terrestrial and oceanic environments. Recently, there has been an increased effort to identify the possible sources of plastic to understand how they affect vulnerable beaches. This issue is of particular concern in the Gulf of Mexico due to the presence of oil, natural gas, and plastic production. In this thesis, we expand upon existing Bayesian plastic attribution models and develop a rigorous statistical framework to map observed beached microplastics to their sources. Within this framework, we combine Lagrangian backtracking simulations of floating particles using nurdle beaching …


Comparative Analysis Of Teacher Effects Parameters In Models Used For Assessing School Effectiveness: Value-Added Models & Persistence, Merlin J. Kamgue Dec 2023

Graduate Theses and Dissertations

Longitudinal measures for students have become increasingly popular to estimate the effects of individual teachers and schools. Value-added models are one of the approaches using longitudinal data to evaluate teachers and schools. In the value-added model (VAM) literature, many statistical approaches have been developed and used to estimate teacher or school effects on student learning. This study opted to use a Bayesian multivariate model for evaluating teacher effects. The generalized persistence models can handle longitudinal data, not vertically scaled, allowing for a below-par teacher’s effects correlation across test administrations. This study first generated longitudinal students’ test score data and used …


Parameter Estimation For Patient Enrollment In Clinical Trials, Junyan Liu Dec 2023

Undergraduate Honors Theses

In this paper, we study the Poisson-gamma model for recruitment time in clinical trials. We proved several properties of this model that match our intuitions from a reliability perspective, did simulations on this model, and used different optimization methods to estimate the parameters. Although the behaviors of the optimization methods were unfavorable and unstable, we identified certain conditions and provided potential explanations for this phenomenon and further insights into the Poisson-gamma model.
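
As a rough illustration of the kind of simulation the abstract mentions, the sketch below assumes one common formulation of a Poisson-gamma recruitment model: each center enrolls patients as a Poisson process whose rate is drawn from a gamma distribution. The function name, the parameter values, and the choice of 20 centers and 200 patients are hypothetical and are not taken from the paper.

```python
import random
import statistics


def simulate_recruitment_time(n_patients, n_centers, alpha, beta, rng):
    """Simulate the time needed to enroll n_patients across n_centers.

    Assumption: center rates are i.i.d. Gamma(shape=alpha, rate=beta).
    """
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    rates = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(n_centers)]
    total_rate = sum(rates)
    # Given the rates, pooled arrivals form a Poisson process with rate
    # total_rate, so the n-th arrival time is Gamma(shape=n, rate=total_rate).
    return rng.gammavariate(n_patients, 1.0 / total_rate)


rng = random.Random(0)
times = [simulate_recruitment_time(200, 20, 2.0, 1.0, rng) for _ in range(5000)]
mean_time = statistics.mean(times)
print(f"simulated mean recruitment time: {mean_time:.2f}")
```

Under these assumptions the pooled rate is itself gamma distributed with shape 40 and rate 1, so the theoretical mean recruitment time is 200 / 39, about 5.13 time units, which the simulated mean should approximate.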


Nonparametric Derivative Estimation Using Penalized Splines: Theory And Application, Bright Antwi Boasiako Nov 2023

Doctoral Dissertations

This dissertation is in the field of nonparametric derivative estimation using penalized splines. It is conducted in two parts. In the first part, we study the L2 convergence rates of estimating derivatives of mean regression functions using penalized splines. In 1982, Stone provided the optimal rates of convergence for estimating derivatives of mean regression functions using nonparametric methods. Using these rates, Zhou et al. in their 2000 paper showed that the MSE of derivative estimators based on regression splines approaches zero at the optimal rate of convergence. Also, in 2019, Xiao showed that, under some general conditions, penalized spline estimators …


The Use Of Regularization To Detect Racial Inequities In Pay Equity Studies: An Empirical Study And Reflections On Regulation Methods, Christopher M. Peña Nov 2023

Electronic Theses and Dissertations

Since the late 1970s, multiple linear regression has been the preferred method for identifying discrimination in pay. An empirical study on this topic was conducted using quantitative critical methods. A literature review first examined conflicting views on using multiple linear regression in pay equity studies. The review found that multiple linear regression is used so prevalently in pay equity studies because the courts and practitioners have widely accepted it and because of its simplicity and ability to parse multiple sources of variance simultaneously. Commentaries in the literature cautioned about errors in model specification, the use of tainted variables, and the …


Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang Oct 2023

Statistical Science Theses and Dissertations

Spatially resolved transcriptomics (SRT) quantifies expression levels at different spatial locations, providing a new and powerful tool for gaining novel biological insights. As experimental technologies improve in both capacity and efficiency, there is a growing demand for the development of analytical methodologies.

One question in SRT data analysis is how to identify genes whose expression exhibits spatially correlated patterns, called spatially variable (SV) genes. Most current methods to identify SV genes are built upon geostatistical models with Gaussian processes, which could limit the models' ability to identify complex spatial patterns. In order to overcome this challenge and capture more types …


A New Method To Determine The Posterior Distribution Of Coefficient Alpha, John Mart V. Delosreyes Oct 2023

Psychology Theses & Dissertations

There is a focus within the behavioral/social sciences on non-physical, psychological constructs (i.e., constructs). These constructs are indirectly measured using measurement instruments that consist of questions that capture the manifestations of these constructs. The indirect nature of measuring constructs creates a need to ensure that measurement instruments are reliable. The most popular statistic used to estimate reliability is coefficient alpha, as it is easy to compute and has properties that make it desirable to use. Coefficient alpha’s popularity has resulted in a wide breadth of research into its qualities. Notably, research about coefficient alpha’s distribution has led to developments …


Parameter Estimation For Normally Distributed Grouped Data And Clustering Single-Cell RNA Sequencing Data Via The Expectation-Maximization Algorithm, Zahra Aghahosseinalishirazi Sep 2023

Electronic Thesis and Dissertation Repository

The Expectation-Maximization (EM) algorithm is an iterative algorithm for finding the maximum likelihood estimates in problems involving missing data or latent variables. The EM algorithm can be applied to problems consisting of evidently incomplete data or missingness situations, such as truncated distributions, censored or grouped observations, and also to problems in which the missingness of the data is not natural or evident, such as mixed-effects models, mixture models, log-linear models, and latent variables. In Chapter 2 of this thesis, we apply the EM algorithm to grouped data, a problem in which incomplete data are evident. Nowadays, data confidentiality is of …


Comparing Elevator Strategies For A Parking Lot, Naveed Arafat Aug 2023

Major Papers

In this paper, we compare elevator strategies for a parking garage. It is assumed that the parking garage has several floors and there is an elevator which can stop on each floor. We begin by considering 4 strategies detailed on page 23. For each strategy, we loop the program 100 times and obtain 100 mean values for wait times. Welch's test confirms highly significant differences among the 4 strategies. Repeating the analysis multiple times, we see that the best of the 4 strategies is strategy 2, which places the elevator on floor 2 (the median floor) after use.
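
The replicate-and-compare procedure described above can be sketched as follows. This is a toy model under stated assumptions, not the paper's simulation: the garage has five hypothetical floors (0 to 4), calls and destinations are uniformly random, wait time is the number of floors the elevator travels to reach the caller, and only two illustrative strategies are compared (stay where the last ride ended versus return to the median floor). A two-sample Welch t statistic stands in for the paper's four-strategy test.

```python
import math
import random
import statistics

FLOORS = 5  # hypothetical garage with floors 0..4; floor 2 is the median


def mean_wait(park_floor, n_calls, rng):
    """Mean wait over n_calls; park_floor=None means 'stay where the ride ended'."""
    position = 0 if park_floor is None else park_floor
    waits = []
    for _ in range(n_calls):
        call = rng.randrange(FLOORS)
        waits.append(abs(call - position))  # floors travelled to reach the caller
        destination = rng.randrange(FLOORS)
        # After use, the elevator stays at the destination or returns to park_floor.
        position = destination if park_floor is None else park_floor
    return statistics.mean(waits)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


rng = random.Random(1)
# 100 replications per strategy, each yielding one mean wait time.
stay_put = [mean_wait(None, 200, rng) for _ in range(100)]
median_floor = [mean_wait(2, 200, rng) for _ in range(100)]
t_stat, df = welch_t(stay_put, median_floor)
print(f"Welch t = {t_stat:.1f}, df = {df:.1f}")
```

In this toy setting, parking on the median floor gives an expected wait of 1.2 floors against 1.6 for staying put, so the t statistic should be large and positive.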


Excess Zeros Under GAM: Tweedie Or Two-Part?, Xianming Zeng Aug 2023

Major Papers

Positive, right-skewed data with excess zeros are encountered in many real-life situations. Two possible techniques for analyzing this type of data are two-part models and Tweedie models. Two-part models assume the existence of a separate zero-generating process, while Tweedie models are based on distributions that allow mass at zero. The paper presents a simulation study of the performance of Generalized Additive Models (GAM) under Tweedie and two-part specifications for such data with excess zeros, using MSE (Mean Square Error) and relative bias to compare the two methods. We found …


Probabilistic Modeling Of Social Media Networks, Distinguishing Phylogenetic Networks From Trees, And Fairness In Service Queues, Md Rashidul Hasan Aug 2023

Mathematics & Statistics ETDs

In this dissertation, three primary issues are explored. The first subject exposes who-saw-from-whom pathways in post-specific dissemination networks in social media platforms. We describe a network-based approach for temporal, textual, and post-diffusion network inference. The conditional point process method discovers the most probable diffusion network. The tool is capable of meaningful analysis of hundreds of post shares. Inferred diffusion networks demonstrate disparities in information distribution between user groups (confirmed versus unverified, conservative versus liberal) and local communities (political, entrepreneurial, etc.). A promising approach for quantifying post-impact, we observe discrepancies in inferred networks that indicate the disproportionate amount of automated bots. …


Statistical Inference On Lung Cancer Screening Using The National Lung Screening Trial Data., Farhin Rahman Aug 2023

Electronic Theses and Dissertations

This dissertation consists of three research projects on cancer screening probability modeling. In these projects, the three key modeling parameters (sensitivity, sojourn time, transition density) for cancer screening were estimated, and the long-term outcomes (including overdiagnosis as one outcome), the optimal screening time/age, the lead time distribution, and the probability of overdiagnosis at the future screening time were simulated to provide a statistical perspective on the effectiveness of cancer screening programs. In the first part of this dissertation, statistical inference was conducted for male and female smokers using the National Lung Screening Trial (NLST) chest X-ray data. A …


Cannabidiol Tweet Miner: A Framework For Identifying Misinformation In CBD Tweets., Jason Turner Aug 2023

Electronic Theses and Dissertations

As regulations surrounding cannabis continue to develop, the demand for cannabis-based products is on the rise. Despite not producing the psychoactive effects commonly associated with THC, products containing cannabidiol (CBD) have gained immense popularity in recent years as a potential treatment option for a range of conditions, particularly those associated with pain or sleep disorders. However, due to current federal policies, these products have yet to undergo comprehensive safety and efficacy testing. Fortunately, utilizing advanced natural language processing (NLP) techniques, data harvested from social networks have been employed to investigate various social trends within healthcare, such as disease tracking and …


A Comparison Of Confidence Intervals In State Space Models, Jinyu Du Jul 2023

Statistical Science Theses and Dissertations

This thesis develops general procedures for constructing confidence intervals (CIs) for the error disturbance parameters (standard deviations) and transformations of the error disturbance parameters in time-invariant state space models (SSMs). With only a set of observations, estimating individual error disturbance parameters accurately in the presence of other unknown parameters in SSMs is a very challenging problem. We attempted to construct four different types of confidence intervals (Wald, likelihood ratio, score, and higher-order asymptotic intervals) for both the simple local level model and general time-invariant SSMs. We show that for a simple local level model, both the …


Development And Testing Of A New Method For Velocity-Selecting White Dwarfs From Gaia By Galactic Population, Joseph Hammill Jul 2023

Doctoral Dissertations and Master's Theses

The detailed processes by which spiral galaxies form remain an open question in modern cosmology. Observations of the current configuration of spiral galaxies, including the Milky Way, reveal thin disk, thick disk, and halo populations, which must all be accounted for in formation theories and likely have distinct ages. Using the Milky Way as an example to probe this question, we are studying the formation history of these structures.

This work details our approach to age-dating the galaxy, velocity-selecting targets from a sample of white dwarfs from the Gaia DR3 catalog that have also been age-analysed using BASE-9. BASE-9 uses …


Addressing The Impact Of Time-Dependent Social Groupings On Animal Survival And Recapture Rates In Mark-Recapture Studies, Alexandru M. Draghici Jun 2023

Electronic Thesis and Dissertation Repository

Mark-recapture (MR) models typically assume that individuals under study have independent survival and recapture outcomes. One such model of interest is known as the Cormack-Jolly-Seber (CJS) model. In this dissertation, we conduct three major research projects focused on studying the impact of violating the independence assumption in MR models along with presenting extensions which relax the independence assumption. In the first project, we conduct a simulation study to address the impact of failing to account for pair-bonded animals having correlated recapture and survival fates on the CJS model. We examined the impact of correlation on the likelihood ratio test (LRT), …


Statistical And Biological Analyses Of Acoustic Signals In Estrildid Finches, Moises Rivera Jun 2023

Dissertations, Theses, and Capstone Projects

Acoustic communication is a process that involves auditory perception and signal processing. Discrimination and recognition further require cognitive processes and supporting mechanisms in order to successfully identify and appropriately respond to signal senders. Although acoustic communication is common across birds, classical research has largely disregarded the perceptual abilities of perinatal altricial taxa. Chapter 1 reviews the literature of perinatal acoustic stimulation in birds, highlighting the disproportionate focus on precocial birds (e.g., chickens, ducks, quails). The long-held belief that altricial birds were incapable of acoustic perception in ovo was only recently overturned, as researchers began to find behavioral and physiological evidence …


Data-Optimized Spatial Field Predictions For Robotic Adaptive Sampling: A Gaussian Process Approach, Zachary Nathan May 2023

Computer Science Senior Theses

We introduce a framework that combines Gaussian Process models, robotic sensor measurements, and sampling data to predict spatial fields. In this context, a spatial field refers to the distribution of a variable throughout a specific area, such as temperature or pH variations over the surface of a lake. Whereas existing methods tend to analyze only the particular field(s) of interest, our approach optimizes predictions through the effective use of all available data. We validated our framework on several datasets, showing that errors can decline by up to two-thirds through the inclusion of additional colocated measurements. In support of adaptive sampling, …


Interannual Variation Of Ichthyofaunal Utilization Of A Man-Made Salt Marsh Creek In Mission Bay, California, Maria Angst May 2023

Undergraduate Honors Theses

Marsh restoration and creation are increasingly being used to mitigate Southern California’s drastic decline in wetlands due to human activities. This study used minnow traps to resample the ichthyofauna of a created marsh (Crown Point Mitigation Site; CPMS) and an adjacent natural marsh (Kendall Frost) in Mission Bay, California, 26 years following the marsh creation. Data from this study were compared to data collected immediately after marsh creation from 1995-1998, and from 2021. Fishes captured included Fundulus parvipinnis, Gillichthys mirabilis, Acanthogobius flavimanus, Ctenogobius sagittula, and Mugil cephalus. Species richness and dominance measures were higher in the natural relative to the …


Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile May 2023

Statistical Science Theses and Dissertations

Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …


An Application Of The PageRank Algorithm To NCAA Football Team Rankings, Morgan Majors May 2023

Honors Theses

We investigate the use of Google’s PageRank algorithm to rank sports teams. The PageRank algorithm is used in web searches to return a list of the websites that are of most interest to the user. The structure of the NCAA FBS football schedule is used to construct a network with a similar structure to the world wide web. Parallels are drawn between pages that are linked in the world wide web with the results of a contest between two sports teams. The teams under consideration here are the members of the 2021 Football Bowl Subdivision. We achieve a total ordering …
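
To make the web-to-schedule analogy concrete, here is a minimal sketch of PageRank applied to game results. It assumes that each loss acts as a "link" from the loser to the winner, which is one natural reading of the abstract but is not confirmed by it. The four teams, their results, and the damping factor of 0.85 are hypothetical.

```python
def pagerank(games, damping=0.85, iters=100):
    """Rank teams by power iteration; games is a list of (winner, loser) pairs."""
    teams = sorted({team for game in games for team in game})
    n = len(teams)
    # Assumption: a loss is an outgoing link from the loser to the winner.
    beaten_by = {team: [] for team in teams}
    for winner, loser in games:
        beaten_by[loser].append(winner)

    rank = {team: 1.0 / n for team in teams}
    for _ in range(iters):
        new_rank = {team: (1.0 - damping) / n for team in teams}
        for team in teams:
            if beaten_by[team]:
                share = damping * rank[team] / len(beaten_by[team])
                for winner in beaten_by[team]:
                    new_rank[winner] += share
            else:
                # An undefeated team has no outgoing links (a dangling node),
                # so its rank is spread evenly over all teams.
                for other in teams:
                    new_rank[other] += damping * rank[team] / n
        rank = new_rank
    return rank


# Hypothetical results: (winner, loser)
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("B", "D")]
ranks = pagerank(games)
order = sorted(ranks, key=ranks.get, reverse=True)
print(order)
```

In this toy schedule the undefeated team A ranks first and the winless team D last. A real schedule would also need a rule for repeated matchups and possibly for margin of victory, which this sketch ignores.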


An Analysis Of All-Cause Mortality On Patients With Sickle Cell Disease And Kidney Disease Using Propensity Score Matching, Adam Garrison May 2023

Electronic Theses and Dissertations

In this work, we provide an overview of the Cox proportional hazards model for time to event or survival analysis and the notion of propensity score matching to deal with confounding factors. A full analysis is reported in Chapter 2 concerning mortality for in-center dialysis patients with sickle cell disease to demonstrate the application of a general analysis strategy that has some logistical benefits over more traditional approaches to accounting for confounding variables. We also provide some insight and discussions on the challenges and future research questions that will emerge when trying to implement this strategy as a monitoring tool …


A Machine Learning Approach To Obese-Inflammatory Phenotyping, Tania Mayleth Vargas May 2023

Theses and Dissertations

Obesity is the accumulation of an abnormal, or excessive, amount of fat in the body, which can have negative effects on overall health. This excess accumulation of macronutrients in adipose tissue can cause the release of inflammatory mediators, leading to a proinflammatory state. Inflammation is a known risk factor for various health conditions, including cardiovascular diseases, metabolic syndrome, and diabetes. This study sought to examine the use of data mining methods, particularly clustering algorithms, to identify inflammatory biomarker phenotypes and their association with obesity in a local adolescent population. The algorithms evaluated in this study included: k-means, Ward's hierarchical …