Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Masters Theses

Statistics and Probability

Articles 1 - 30 of 73

Full-Text Articles in Physical Sciences and Mathematics

Genetic Associations Of Alzheimer’s Disease And Mild Cognitive Impairment, Scott Hebert Aug 2023

Masters Theses

Over 6 million people are estimated to have been living with Alzheimer’s Disease (AD) in 2020, with another 12 million living with Mild Cognitive Impairment (MCI). Research has been conducted to evaluate genetic links to AD, but more research is needed on the subject. The Alzheimer’s Disease Neuroimaging Initiative (ADNI) has been conducting a longitudinal study of AD and MCI since 2004 and offering their data to research teams around the world. Diagnostic and demographic data was collected from participants, as well as data regarding single nucleotide polymorphisms (SNPs). SNP data was transformed to a binary format regarding whether the …


Forecasting COVID-19 With Temporal Hierarchies And Ensemble Methods, Li Shandross Aug 2023

Masters Theses

Infectious disease forecasting efforts underwent rapid growth during the COVID-19 pandemic, providing guidance for pandemic response and about potential future trends. Yet despite their importance, short-term forecasting models often struggled to produce accurate real-time predictions of this complex and rapidly changing system. This gap in accuracy persisted into the pandemic and warrants the exploration and testing of new methods to glean fresh insights.

In this work, we examined the application of the temporal hierarchical forecasting (THieF) methodology to probabilistic forecasts of COVID-19 incident hospital admissions in the United States. THieF is an innovative forecasting technique that aggregates time-series data into …


A Machine Learning Approach For Predicting Clinical Trial Patient Enrollment In Drug Development Portfolio Demand Planning, Ahmed Shoieb May 2023

Masters Theses

One of the biggest challenges the clinical research industry currently faces is the accurate forecasting of patient enrollment (namely if and when a clinical trial will achieve full enrollment), as the stochastic behavior of enrollment can significantly contribute to delays in the development of new drugs, increases in duration and costs of clinical trials, and the over- or under-estimation of clinical supply. This study proposes a Machine Learning model using a Fully Convolutional Network (FCN) that is trained on a dataset of 100,000 patient enrollment data points including patient age, patient gender, patient disease, investigational product, study phase, blinded …


Meta-Analysis Of Mesenchymal Stem Cell Gene Expression Data From Obese And Non-Obese Patients, Dakota William Shields Jan 2023

Masters Theses

"The prevalence of gene expression microarray datasets in public repositories gives opportunity to analyze biologically interesting datasets without running the laboratory aspect in house. Such experimentation is expensive in terms of finances, time, and expertise, which often results in low numbers of replicates. Meta-analysis techniques attempt to overcome issues due to few biological or technical replicates by combining separate experiments together to increase statistical power. Proper statistical considerations help to offset issues like simultaneous testing of thousands of genes, unintended hybridization, and other noises.

Microarrays contain light intensities from tens of thousands of hybridized probes giving a measure of gene …


Three Dimensional Spatio-Temporal Cluster Analysis Of SARS-CoV-2 Infections, Keith W. Allison Jun 2022

Masters Theses

The COVID-19 pandemic has heightened the need for fine-scale analysis of the clustering of cases of infectious disease in order to better understand and prevent the localized spread of infection. The students living on the University of Massachusetts, Amherst campus provided a unique opportunity to do so, due to frequent mandatory testing during the 2020-2021 academic year, and dense living conditions. The South-West dormitory area is of particular interest due to its extremely high population density, housing around half of students living on campus during normal conditions. Using data gathered by the Public Health Promotion Center (PHPC), we analyzed the …


Continuous And Discrete Models For Optimal Harvesting In Fisheries, Nagham Abbas Al Qubbanchee Jan 2022

Masters Theses

"This work focuses on the logistic growth model, where the Gordon-Schaefer model is considered in continuous time. We view the Gordon-Schaefer model as a bioeconomic equation involved in the fishing business, considering biological rates, carrying capacity, and total marginal costs and revenues. In [25], the authors illustrate the analytical solution of the Schaefer model using the integration by parts method and two theorems. The theorems have many assumptions with many different strategies. Due to the nature of the problem, the optimal control system involves many equations and functions, such as the second root of the equation. We concentrate on Theorem …


Statistical Improvements For Ecological Learning About Spatial Processes, Gaetan L. Dupont Oct 2021

Masters Theses

Ecological inquiry is rooted fundamentally in understanding population abundance, both to develop theory and improve conservation outcomes. Despite this importance, estimating abundance is difficult due to the imperfect detection of individuals in a sample population. Further, accounting for space can provide more biologically realistic inference, shifting the focus from abundance to density and encouraging the exploration of spatial processes. To address these challenges, Spatial Capture-Recapture (“SCR”) has emerged as the most prominent method for estimating density reliably. The SCR model is conceptually straightforward: it combines a spatial model of detection with a point process model of the spatial distribution of …


Evaluating Public Masking Mandates On COVID-19 Growth Rates In U.S. States, Angus K. Wong Jul 2021

Masters Theses

U.S. state governments have implemented numerous policies to help mitigate the spread of COVID-19. While there is strong biological evidence supporting the wearing of face masks or coverings in public spaces, the impact of public masking policies remains unclear. We aimed to evaluate how early versus delayed implementation of state-level public masking orders impacted subsequent COVID-19 growth rates. We defined “early” implementation as having a state-level mandate in place before September 1, 2020, the approximate start of the school year. We defined COVID-19 growth rates as the relative increase in confirmed cases 7, 14, 21, 30, 45, and 60 days after September 1. …
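
As a rough illustration of the growth-rate definition in this abstract, the sketch below computes the relative increase in cumulative confirmed cases at each follow-up horizon. It assumes that "relative increase" means (later count minus baseline count) divided by the baseline count; the function name and the case counts are hypothetical and do not come from the thesis.

```python
def relative_growth(baseline_cases, later_cases):
    """Relative increase in cumulative cases versus the baseline date.

    Assumed reading of the abstract's definition:
    (later - baseline) / baseline.
    """
    if baseline_cases <= 0:
        raise ValueError("baseline_cases must be positive")
    return (later_cases - baseline_cases) / baseline_cases


# Hypothetical cumulative case counts for one state (not real data).
baseline = 1000  # confirmed cases on September 1
cases_by_horizon = {7: 1100, 14: 1250, 21: 1400, 30: 1600, 45: 2000, 60: 2500}

growth_rates = {
    days: relative_growth(baseline, cases)
    for days, cases in cases_by_horizon.items()
}
```

With these made-up counts, the 60-day growth rate is 1.5, meaning cases grew by 150% relative to the September 1 baseline.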


Random Search Plus: A More Effective Random Search For Machine Learning Hyperparameters Optimization, Bohan Li Dec 2020

Masters Theses

Machine learning hyperparameter optimization has always been key to improving model performance. There are many methods of hyperparameter optimization. Popular methods include grid search, random search, manual search, Bayesian optimization, and population-based optimization. Random search requires fewer computations than grid search, but at some cost in accuracy. This thesis proposes a more effective random search method based on traditional random search and hyperparameter space separation. The method is named random search plus. This thesis shows empirically that random search plus is more effective than random search. There are some case …
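
For readers unfamiliar with the baseline method, the following is a minimal sketch of traditional random search, not of the random search plus method proposed in the thesis (the truncated abstract does not describe its details). The objective function, the search space, and all names are hypothetical stand-ins.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Plain random search: sample each hyperparameter uniformly from its
    range and keep the configuration with the highest score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Hypothetical stand-in for a validation score; its maximum (0.0)
    # is at lr = 0.1 and reg = 1.0.
    return -((params["lr"] - 0.1) ** 2) - ((params["reg"] - 1.0) ** 2)


# Hypothetical search space: (lower bound, upper bound) per hyperparameter.
space = {"lr": (0.0, 1.0), "reg": (0.0, 2.0)}
best_params, best_score = random_search(toy_objective, space, n_trials=200)
```

Unlike grid search, the number of evaluations here is fixed by `n_trials` rather than growing with the number of grid points per dimension, which is the computational saving the abstract refers to.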


Empirical Modeling Of Used Nuclear Fuel Radiation Emissions For Safeguards Purposes, Amanda M. Bachmann Aug 2020

Masters Theses

For nuclear nonproliferation safeguards, the ability to characterize used nuclear fuel (UNF) is a vital process. Fuel characterization allows for independent verification by inspectors of operator declarations of the special nuclear material flow and nuclear related activities within a facility, and an estimation of fissile material remaining in a fuel assembly. Current methods to verify this information rely heavily on non-destructive assay techniques, such as gamma spectroscopy and neutron detection measurements. While these measurements are effective tools for estimating a specific characteristic of the fuel, such as burnup or cooling time, they often require an accurate estimation of a select …


Quantifying Effects Of Sleep Deprivation On Cognitive Performance, Quang Nghia Le Jan 2020

Masters Theses

“The most commonly used metric for evaluating alertness and vigilance is the Psychomotor Vigilance Test (PVT). Previous studies have indicated that alertness and vigilance can be affected by sleep loss. This study explores methods to predict median psychomotor vigilance reaction times. The data used in this study comes from a series of tests and surveys conducted on volunteer students. The data set contains many potential predictors of PVT and one aspect of the study was to identify variables that are useful in prediction. The performances of various prediction methods that allow for …


The Application Of Machine Learning Models In The Concussion Diagnosis Process, Sujit Subhash Jan 2020

Masters Theses

“Concussions represent a growing health concern and are challenging to diagnose and manage. Roughly four million concussions are diagnosed every year in the United States. Although research into the application of advanced metrics such as neuroimages and blood biomarkers has shown promise, they are yet to be implemented at a clinical level due to cost and reliability concerns. Therefore, concussion diagnosis is still reliant on clinical evaluations of symptoms, balance, and neurocognitive status and function. The lack of a universal threshold on these assessments makes the diagnosis process entirely reliant on a physician’s interpretation of these assessment scores. This study …


The Correlation Between Sleep And Lifespan In Drosophila Melanogaster, Joshua Randall Lisse Jan 2019

Masters Theses

“Adequate sleep is associated with an individual’s health. Too little sleep is associated with many health problems, including cardiovascular disease, obesity, and a general increase in all-cause mortality. Yet the molecular changes that link poor sleep and changes in health are still not well understood. Individuals have a unique daily need for sleep, and deviations from the animal’s regular sleeping patterns can be indicative of, or result in, underlying changes in its health. Therefore, we hypothesize that changes in the sleep architecture in Drosophila melanogaster reflect changes in the fly’s health.

We determined sleep architecture in wild-type male flies over …


Less Is More: Beating The Market With Recurrent Reinforcement Learning, Louis Kurt Bernhard Steinmeister Jan 2019

Masters Theses

"Multiple recurrent reinforcement learners were implemented to make trading decisions based on real and freely available macro-economic data. The learning algorithm and different reinforcement functions (the Differential Sharpe Ratio, Differential Downside Deviation Ratio and Returns) were revised and the performances were compared while transaction costs were taken into account. (This is important for practical implementations even though many publications ignore this consideration.) It was assumed that the traders make long-short decisions in the S&P500 with complementary 3-month treasury bill investments. Leveraged positions in the S&P500 were disallowed. Notably, the Differential Sharpe Ratio and the Differential Downside Deviation Ratio are risk …


Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data, Caroline Kusiak Oct 2018

Masters Theses

Dengue fever affects over 390 million people annually worldwide and is of particular concern in Southeast Asia where it is one of the leading causes of hospitalization. Modeling trends in dengue occurrence can provide valuable information to Public Health officials; however, many challenges arise depending on the data available. In Thailand, reporting of dengue cases is often delayed by more than 6 weeks, and a small fraction of cases may not be reported until over 11 months after they occurred. This study shows that incorporating data on Google Search trends can improve disease predictions in settings with severely …


A Study On Modelling Spatial-Temporal Human Mobility Patterns For Improving Personalized Weather Warning, Yue Xu Jul 2018

Masters Theses

Understanding human mobility patterns is important for severe weather warning since these patterns can help identify where people are in time and in space when flash floods, tornados, high winds and hurricanes are occurring or are predicted to occur. A GIS (Geographic Information Science) data model was proposed to describe the spatial-temporal human activity. Based on this model, a metric was designed to represent the spatial-temporal activity intensity of human mobility, and an index was generated to quantitatively describe the change in human activities. By analyzing high-resolution human mobility data, the paper verified that human daily mobility patterns could be …


Regression Analysis For Ordinal Outcomes In Matched Study Design: Applications To Alzheimer's Disease Studies, Elizabeth Austin Jul 2018

Masters Theses

Alzheimer's Disease (AD) affects nearly 5.4 million Americans as of 2016 and is the most common form of dementia. The disease is characterized by the presence of neurofibrillary tangles and amyloid plaques [1]. The amount of plaques is measured by Braak stage, post-mortem. It is known that AD is positively associated with hypercholesterolemia [16]. As statins are the most widely used cholesterol-lowering drug, there may be associations between statin use and AD. We hypothesize that those who use statins, specifically lipophilic statins, are more likely to have a low Braak stage in post-mortem analysis.

In order to address this hypothesis, …


Analyzing Sensor Based Human Activity Data Using Time Series Segmentation To Determine Sleep Duration, Yogesh Deepak Lad Jan 2018

Masters Theses

"Sleep is the most important thing to rest our brain and body. A lack of sleep has adverse effects on overall personal health and may lead to a variety of health disorders. According to data from the Centers for Disease Control and Prevention in the United States of America, there is a formidable increase in the number of people suffering from sleep disorders like insomnia, sleep apnea, hypersomnia and many more. Sleep disorders can be avoided by assessing an individual's activity over a period of time to determine the sleep pattern and duration. The sleep pattern and duration can be …


Developing Leading And Lagging Indicators To Enhance Equipment Reliability In A Lean System, Dhanush Agara Mallesh Dec 2017

Masters Theses

With increasing complexity in equipment, failure rates are becoming a critical metric due to unplanned maintenance in a production environment. Unplanned maintenance in the manufacturing process creates downtime and decreases the reliability of equipment. Failures in equipment have resulted in the loss of revenue to organizations, encouraging maintenance practitioners to analyze ways to change unplanned maintenance to planned maintenance. Efficient failure prediction models are being developed to learn about failures in advance. With this information, predicted failures can reduce downtime in the system and improve throughput.

The goal of this thesis is to predict …


Juvenile River Herring In Freshwater Lakes: Sampling Approaches For Evaluating Growth And Survival, Matthew T. Devine Oct 2017

Masters Theses

River herring, collectively alewives (Alosa pseudoharengus) and blueback herring (A. aestivalis), have experienced substantial population declines over the past five decades due in large part to overfishing, combined with other sources of mortality, and disrupted access to critical freshwater spawning habitats. Anadromous river herring populations are currently assessed by counting adults in rivers during upstream spawning migrations, but no field-based assessment methods exist for estimating juvenile densities in freshwater nursery habitats. Counts of 4-year-old migrating adults are variable and prevent understanding about how mortality acts on different life stages prior to returning to spawn (e.g., juveniles …


Modelling Bird Migration With Motus Data And Bayesian State-Space Models, Justin Baldwin Oct 2017

Masters Theses

Bird migration is a poorly-known yet important phenomenon, as understanding movement patterns of birds can inform conservation strategies and public health policy for animal-borne diseases. Recent advances in wildlife tracking technology, in particular the Motus system, have allowed researchers to track even small flying birds and insects with radio transmitters that weigh fractions of a gram. This system relies on a community-based distributed sensor network that detects tagged animals as they move through the detection nodes on journeys that range from small local movements to intercontinental migrations. The quantity of data generated by the Motus system is unprecedented, is on …


Multiple Testing Correction With Repeated Correlated Outcomes: Applications To Epigenetics, Katie Leap Oct 2017

Masters Theses

Epigenetic changes (specifically DNA methylation) have been associated with adverse health outcomes; however, unlike genetic markers that are fixed over the lifetime of an individual, methylation can change. Given that there are a large number of methylation sites, measuring them repeatedly introduces multiple testing problems beyond those that exist in a static genetic context. Using simulations of epigenetic data, we considered different methods of controlling the false discovery rate. We considered several underlying associations between an exposure and methylation over time.

We found that testing each site with a linear mixed effects model and then controlling the false discovery rate …
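
The truncated abstract does not say which false discovery rate procedure was used. As one common possibility, the sketch below implements the Benjamini-Hochberg step-up procedure on a list of per-site p-values; the choice of procedure, the function name, and the example p-values are assumptions for illustration only.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here for illustration).

    Returns a list of booleans, one per p-value in the original order,
    marking which hypotheses are rejected at FDR level alpha.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis whose rank is at most max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one per methylation site (not real data).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_values, alpha=0.05)
```

In a setting like the one described, each p-value would come from a per-site test such as a linear mixed effects model, and the correction is applied across all sites at once.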


Spatially Explicit Population Estimates Of The Florida Black Bear, Jacob Michael Humm May 2017

Masters Theses

The Florida black bear (Ursus americanus floridanus) is currently comprised of 7 isolated subpopulations: Apalachicola, Eglin, Osceola, Ocala/St. Johns, Chassahowitzka, Highlands/Glades, and Big Cypress. The last statewide assessment of Florida black bear population dynamics was conducted by Simek et al. (2005) using traditional capture-mark-recapture methods. The subspecies was removed from Florida’s List of State Threatened Species in 2012 contingent upon the formulation of a management plan that would maintain viable subpopulations of black bears in suitable habitat. Accurate population estimates for each of the remaining black bear subpopulations in Florida were needed to achieve the management goals of …


Explorations Into Machine Learning Techniques For Precipitation Nowcasting, Aditya Nagarajan Mar 2017

Masters Theses

Recent advances in cloud-based big-data technologies now make data-driven solutions feasible for increasing numbers of scientific computing applications. One such data-driven approach is machine learning, where patterns in large data sets are brought to the surface by finding complex mathematical relationships within the data. Nowcasting, or short-term prediction of rainfall in a given region, is an important problem in meteorology. In this thesis we explore the nowcasting problem through a data-driven approach by formulating it as a machine learning problem.

State-of-the-art nowcasting systems today are based on numerical models which describe the physical processes leading to …


Development Of A Variogram Approach To Spatial Outlier Detection Using A Supplemental Digital Elevation Model Dataset, Zane Daniel Helwig Jan 2017

Masters Theses

"When developing a ground water model, the quality of the dataset should first be evaluated. Spatial outliers can lead to predictions which are not representative of actual conditions. In order to isolate misrepresentative points, a method is presented which examines the experimental variogram of a ground water elevation dataset. To define a threshold variance between pairs of ground water elevation measures, ground elevation values from a digital elevation model (DEM) are used to determine a maximum reasonable variance expected to occur on the experimental variogram. To determine appropriate DEM parameters, a separate study was also done which observed characteristic behavior …


Comparing Region Level Testing Methods For Differential DNA Methylation Analysis, Arnold Albert Harder Jan 2017

Masters Theses

“Finding possible connections and solutions to help fight progression of diseases is a major area of research. Genomics is a primary path of disease research. Through the DNA sequence, possible connections to diseases have been found. However, most methods for fixing issues within a DNA sequence are still out of reach. One potential path is to investigate epigenetic modifications, such as DNA methylation. DNA methylation occurs when a methyl group attaches to cytosines on the DNA sequence. Statistical methods can be used to identify sites or regions of significant differences in methylation levels between groups (e.g. …


Family-Based Association Studies Of Autism In Boys Via Facial-Feature Clusters, Luke Andrew Settles Jan 2017

Masters Theses

"Autism spectrum disorder (ASD) refers to a set of developmental disorders with varied attributes. Due to its substantial heterogeneity in terms of behavioral and clinical phenotypes, it is challenging to discern the genetic biomarkers behind ASD, even though the disease is known to be genetic in nature. This serves as a motivation to detect relationships between single nucleotide polymorphisms (SNPs) and a causal autism disease susceptibility locus (DSL) within more homogeneous subgroups. Recently, clinically meaningful subclassifications of ASD have been discovered utilizing facial features of prepubescent boys. Therefore, through the employment of data from 44 prepubertal Caucasian boys with ASD …


A Comparison Of Techniques For Handling Missing Data In Longitudinal Studies, Alexander R. Bogdan Nov 2016

Masters Theses

Missing data are a common problem in virtually all epidemiological research, especially when conducting longitudinal studies. In these settings, clinicians may collect biological samples to analyze changes in biomarkers, which often do not conform to parametric distributions and may be censored due to limits of detection. Using complete data from the BioCycle Study (2005-2007), which followed 259 premenopausal women over two menstrual cycles, we compared four techniques for handling missing biomarker data with non-Normal distributions. We imposed increasing degrees of missing data on two non-Normally distributed biomarkers under conditions of missing completely at random, missing at random, and missing not …


Regional Dynamic Price Relationships Of Distillers Dried Grains In U.S. Feed Markets, Matthew Fulton Johnson Aug 2016

Masters Theses

Distillers dried grains with solubles (DDGS) is now a mainstream substitute in U.S. animal feed rations. DDGS is rich in fat and protein content and serves as a competitive feed source in livestock markets. The objective of this study is to identify dynamic price relationships among DDGS, corn, soybean meal, and livestock outputs in the context of specific livestock sectors and their geographic location. Four locations associated with a predominant livestock sector are selected for analysis by measuring density and relative proportion of a livestock sector’s grain consumption at the county level. A vector error correction model is applied to post-mandate …


Sleep Patterns, Urinary Levels Of Melatonin And Subsequent Weight Change In The Women’s Health Initiative Observational Study, Nicole M. Barron Jul 2016

Masters Theses

Results from prospective studies examining associations between sleep duration and weight gain have been mixed. Melatonin has been hypothesized to mediate the association between sleep duration and weight/body composition. In cross-sectional studies, aMT6s has been shown to be inversely associated with weight/body fat percentage. We examined associations of baseline sleep duration, insomnia status, and aMT6s levels with weight/body fat percentage through 6 years, utilizing a subset of 690 women who participated in a breast cancer case-control study nested within the WHI-OS. Multi-variable and mixed-effects regression was used to calculate beta-coefficients and 95% confidence intervals. Cross-sectional analyses showed urinary aMT6s levels were inversely …