Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 27 of 27

Full-Text Articles in Physical Sciences and Mathematics

Nonparametric Derivative Estimation Using Penalized Splines: Theory And Application, Bright Antwi Boasiako Nov 2023

Doctoral Dissertations

This dissertation is in the field of nonparametric derivative estimation using penalized splines. It is conducted in two parts. In the first part, we study the $L_2$ convergence rates of estimating derivatives of mean regression functions using penalized splines. In 1982, Stone provided the optimal rates of convergence for estimating derivatives of mean regression functions using nonparametric methods. Using these rates, Zhou et al. in their 2000 paper showed that the MSE of derivative estimators based on regression splines approaches zero at the optimal rate of convergence. Also, in 2019, Xiao showed that, under some general conditions, penalized spline estimators …


Statistical Improvements For Ecological Learning About Spatial Processes, Gaetan L. Dupont Oct 2021

Masters Theses

Ecological inquiry is rooted fundamentally in understanding population abundance, both to develop theory and improve conservation outcomes. Despite this importance, estimating abundance is difficult due to the imperfect detection of individuals in a sample population. Further, accounting for space can provide more biologically realistic inference, shifting the focus from abundance to density and encouraging the exploration of spatial processes. To address these challenges, Spatial Capture-Recapture (“SCR”) has emerged as the most prominent method for estimating density reliably. The SCR model is conceptually straightforward: it combines a spatial model of detection with a point process model of the spatial distribution of …


Monitoring Mammals At Multiple Scales: Case Studies From Carnivore Communities, Kadambari Devarajan Oct 2021

Doctoral Dissertations

Carnivores are distributed widely and threatened by habitat loss, poaching, climate change, and disease. They are considered integral to ecosystem function through their direct and indirect interactions with species at different trophic levels. Given the importance of carnivores, it is of high conservation priority to understand the processes driving carnivore assemblages in different systems. It is thus essential to determine the abiotic and biotic drivers of carnivore community composition at different spatial scales and address the following questions: (i) What factors influence carnivore community composition and diversity? (ii) How do the factors influencing carnivore communities vary across spatial and temporal …


Using Generalizability And Rasch Measurement Theory To Ensure Rigorous Measurement In An International Development Education Evaluation, Louise Bahry Oct 2021

Doctoral Dissertations

Between the United States and Great Britain, over 30 billion USD was spent in 2018 on international aid, over a billion of which was dedicated to education programs alone. Recently, there has been increased attention on the rigorous evaluation of aid-funded programs, moving beyond counting outputs to the measurement of educational impact. The current study uses two methodological approaches (Generalizability Theory (Brennan, 1992, 2001) and Rasch Measurement Theory (Andrich, 1978; Rasch, 1980; Wright & Masters, 1982)) to analyze data from math and literacy assessments, and self-report surveys used in an international evaluation of an educational initiative in the Democratic Republic of …


Evaluating Public Masking Mandates On Covid-19 Growth Rates In U.S. States, Angus K. Wong Jul 2021

Masters Theses

U.S. state governments have implemented numerous policies to help mitigate the spread of COVID-19. While there is strong biological evidence supporting the wearing of face masks or coverings in public spaces, the impact of public masking policies remains unclear. We aimed to evaluate how early versus delayed implementation of state-level public masking orders impacted subsequent COVID-19 growth rates. We defined “early” implementation as having a state-level mandate in place before September 1, 2020, the approximate start of the school year. We defined COVID-19 growth rates as the relative increase in confirmed cases 7, 14, 21, 30, 45, and 60 days after September 1. …
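
The growth-rate definition above can be illustrated with a minimal sketch. It assumes cumulative confirmed case counts indexed by days since the baseline date and reads "relative increase" as (cases at day k minus cases at day 0) divided by cases at day 0. The function name, the made-up counts, and that reading of the definition are assumptions for illustration, not details taken from the thesis.

```python
def relative_increase(cumulative_cases, horizon_days):
    """Relative increase in cumulative confirmed cases after `horizon_days`.

    `cumulative_cases[d]` is the (hypothetical) cumulative count d days
    after the baseline date, so index 0 is the baseline itself.
    """
    baseline = cumulative_cases[0]
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    return (cumulative_cases[horizon_days] - baseline) / baseline


# Made-up cumulative counts for one state, days 0..60 after the baseline.
cases = [1000 + 25 * d for d in range(61)]

# Horizons listed in the abstract.
horizons = (7, 14, 21, 30, 45, 60)
growth = {h: relative_increase(cases, h) for h in horizons}
```

Computing one value per horizon keeps the comparison between early and delayed mandates separate for each follow-up window.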


Model-Free Descriptive Modeling For Multivariate Categorical Data With An Ordinal Dependent Variable, Li Wang Jul 2021

Doctoral Dissertations

In the process of statistical modeling, descriptive modeling plays an essential role in accelerating the formulation of plausible hypotheses in subsequent explanatory modeling and in facilitating the selection of potential variables in subsequent predictive modeling. In particular, for multivariate categorical data analysis, it is desirable to use descriptive modeling methods to uncover and summarize the potential association structure among multiple categorical variables in a compact manner. However, many classical methods in this case either rely on strong assumptions for parametric models or become infeasible when the data dimension is high. To this end, we propose a model-free method …


Geometric Representation Learning, Luke Vilnis Apr 2021

Doctoral Dissertations

Vector embedding models are a cornerstone of modern machine learning methods for knowledge representation and reasoning. These methods aim to turn semantic questions into geometric questions by learning representations of concepts and other domain objects in a lower-dimensional vector space. In that spirit, this work advocates for density- and region-based representation learning. Embedding domain elements as geometric objects beyond a single point enables us to naturally represent breadth and polysemy, make asymmetric comparisons, and answer complex queries, and it provides a strong inductive bias when labeled data is scarce. We present a model for word representation using Gaussian densities, enabling asymmetric entailment …


Interacting Effects Of Climate And Biotic Factors On Mesocarnivore Distribution And Snowshoe Hare Demography Along The Boreal-Temperate Ecotone, Alexej P. Siren Jul 2020

Doctoral Dissertations

The motivation of my dissertation research was to understand the influence of climate and biotic factors on range limits with a focus on winter-adapted species, including the Canada lynx (Lynx canadensis), American marten (Martes americana), and snowshoe hare (Lepus americanus). I investigated range dynamics along the boreal-temperate ecotone of the northeastern US. Through an integrative literature review, I developed a theoretical framework building from existing thinking on range limits and ecological theory. I used this theory for my second chapter to evaluate direct and indirect causes of carnivore range limits in the northeastern US, …


Latent Class Models For At-Risk Populations, Shuaimin Kang Jul 2020

Doctoral Dissertations

Clustering Network Tree Data From Respondent-Driven Sampling With Application to Opioid Users in New York City. There is great interest in finding meaningful subgroups of attributed network data. There are many available methods for clustering complete networks. Unfortunately, much network data is collected through sampling, and is therefore incomplete. Respondent-driven sampling (RDS) is a widely used method for sampling hard-to-reach human populations based on tracing links in the underlying unobserved social network. The resulting data therefore have tree structure representing a sub-sample of the network, along with many nodal attributes. In this paper, we introduce an approach to adjust mixture models …


Allocative Poisson Factorization For Computational Social Science, Aaron Schein Jul 2019

Doctoral Dissertations

Social science data often comes in the form of high-dimensional discrete data such as categorical survey responses, social interaction records, or text. These data sets exhibit high degrees of sparsity, missingness, overdispersion, and burstiness, all of which present challenges to traditional statistical modeling techniques. The framework of Poisson factorization (PF) has emerged in recent years as a natural way to model high-dimensional discrete data sets. This framework assumes that each observed count in a data set is a Poisson random variable $y \sim \mathrm{Pois}(\mu)$ whose rate parameter $\mu$ is a function of shared model parameters. This thesis examines a specific …
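
As a minimal sketch of the framework described above, the snippet below evaluates the Poisson log-likelihood of a small count matrix. The low-rank form of the rate, $\mu_{ij} = \sum_k \theta_{ik}\phi_{kj}$, and all names and numbers are assumptions chosen for illustration; the abstract only states that the rate is a function of shared model parameters.

```python
import math


def poisson_log_pmf(y, mu):
    """Log of Pois(y; mu) = mu^y * exp(-mu) / y!."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def rate(theta, phi, i, j):
    """Assumed rate: inner product of row i of theta and column j of phi."""
    return sum(theta[i][k] * phi[k][j] for k in range(len(phi)))


def log_likelihood(counts, theta, phi):
    """Sum of Poisson log-probabilities over every cell of the count matrix."""
    return sum(
        poisson_log_pmf(counts[i][j], rate(theta, phi, i, j))
        for i in range(len(counts))
        for j in range(len(counts[0]))
    )


# Hypothetical factors: 2 rows, 2 latent components, 3 columns.
theta = [[1.0, 0.5], [0.2, 2.0]]
phi = [[2.0, 0.1, 1.0], [0.5, 3.0, 1.0]]

# Made-up observed counts with the same shape as theta @ phi.
counts = [[2, 1, 0], [1, 7, 2]]

ll = log_likelihood(counts, theta, phi)
```

Because every cell shares the same factor matrices, the parameters for one row or column are informed by all counts in that row or column.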


Population Viability And Connectivity Of The Federally Threatened Eastern Indigo Snake In Central Peninsular Florida, Javan Bauder Mar 2019

Doctoral Dissertations

Understanding the factors influencing the likelihood of persistence of real-world populations requires not only an accurate understanding of the traits and behaviors of individuals within those populations (e.g., movement, habitat selection, survival, fecundity, dispersal) but also an understanding of how those traits and behaviors are influenced by landscape features. The federally threatened eastern indigo snake (EIS, Drymarchon couperi) has declined throughout its range primarily due to anthropogenically-induced habitat loss and fragmentation, making spatially-explicit assessments of population viability and connectivity essential for understanding its current status and directing future conservation efforts. The primary goal of my dissertation was to understand how …


Essays In Financial Economics: Announcement Effects In Fixed Income Markets, James J. Forest Oct 2018

Doctoral Dissertations

ABSTRACT. Essays in Financial Economics: Announcement Effects in Fixed Income Markets. PhD in Finance, May 2018. James J. Forest: B.A., Framingham State University; M.S., Northeastern University; Ph.D., University of Massachusetts – Amherst. Directed by: Professor Hossein B. Kazemi.

This dissertation demonstrates the use of empirical techniques for dealing with modeling issues that arise when analyzing announcement effects in fixed income markets. It describes empirical challenges in achieving unbiased and efficient parameter estimates and shows the importance of modeling a wide range of macroeconomic announcement effects to avoid omitted variable bias. Employing techniques common in macroeconomics, financial market researchers are better …


Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data, Caroline Kusiak Oct 2018

Masters Theses

Dengue fever affects over 390 million people annually worldwide and is of particular concern in Southeast Asia, where it is one of the leading causes of hospitalization. Modeling trends in dengue occurrence can provide valuable information to public health officials; however, many challenges arise depending on the data available. In Thailand, reporting of dengue cases is often delayed by more than 6 weeks, and a small fraction of cases may not be reported until over 11 months after they occurred. This study shows that incorporating data on Google Search trends can improve disease predictions in settings with severely …


A Study On Modelling Spatial-Temporal Human Mobility Patterns For Improving Personalized Weather Warning, Yue Xu Jul 2018

Masters Theses

Understanding human mobility patterns is important for severe weather warning since these patterns can help identify where people are in time and in space when flash floods, tornados, high winds and hurricanes are occurring or are predicted to occur. A GIS (Geographic Information Science) data model was proposed to describe the spatial-temporal human activity. Based on this model, a metric was designed to represent the spatial-temporal activity intensity of human mobility, and an index was generated to quantitatively describe the change in human activities. By analyzing high-resolution human mobility data, the paper verified that human daily mobility patterns could be …


Regression Analysis For Ordinal Outcomes In Matched Study Design: Applications To Alzheimer's Disease Studies, Elizabeth Austin Jul 2018

Masters Theses

Alzheimer's Disease (AD) affected nearly 5.4 million Americans as of 2016 and is the most common form of dementia. The disease is characterized by the presence of neurofibrillary tangles and amyloid plaques [1]. The amount of plaques is measured by Braak stage, post-mortem. It is known that AD is positively associated with hypercholesterolemia [16]. As statins are the most widely used cholesterol-lowering drugs, there may be associations between statin use and AD. We hypothesize that those who use statins, specifically lipophilic statins, are more likely to have a low Braak stage in post-mortem analysis.

In order to address this hypothesis, …


Deep Energy-Based Models For Structured Prediction, David Belanger Nov 2017

Doctoral Dissertations

We introduce structured prediction energy networks (SPENs), a flexible framework for structured prediction. A deep architecture is used to define an energy function over candidate outputs and predictions are produced by gradient-based energy minimization. This deep energy captures dependencies between labels that would lead to intractable graphical models, and allows us to automatically discover discriminative features of the structured output. Furthermore, practitioners can explore a wide variety of energy function architectures without having to hand-design prediction and learning methods for each model. This is because all of our prediction and learning methods interact with the energy …
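
The idea of predicting by gradient-based energy minimization can be sketched on a toy problem. The snippet below swaps the deep energy network for a small hand-written quadratic energy over relaxed labels in [0, 1] and minimizes it by projected gradient descent. The energy form, the scores, the step size, and every function name are assumptions made for illustration, not the dissertation's architecture or training method.

```python
def energy(y, scores, pair):
    """Toy energy: local label scores plus pairwise label interactions.

    `pair` is assumed symmetric with a zero diagonal.
    """
    n = len(y)
    local = -sum(scores[i] * y[i] for i in range(n))
    pairwise = sum(
        pair[i][j] * y[i] * y[j] for i in range(n) for j in range(i + 1, n)
    )
    return local + pairwise


def energy_grad(y, scores, pair):
    """Analytic gradient of the toy energy with respect to each y[i]."""
    n = len(y)
    return [
        -scores[i] + sum(pair[i][j] * y[j] for j in range(n) if j != i)
        for i in range(n)
    ]


def predict(scores, pair, steps=200, lr=0.1):
    """Projected gradient descent on the energy, keeping each y[i] in [0, 1]."""
    y = [0.5] * len(scores)
    for _ in range(steps):
        grad = energy_grad(y, scores, pair)
        y = [min(1.0, max(0.0, yi - lr * gi)) for yi, gi in zip(y, grad)]
    return y


# Hypothetical problem: labels 0 and 1 both look plausible on their own,
# but a positive pairwise weight penalizes predicting them together.
scores = [2.0, 1.0, -1.0]
pair = [[0.0, 3.0, 0.0], [3.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
y_hat = predict(scores, pair)
```

In this toy case the minimizer keeps label 0 and suppresses labels 1 and 2, which shows how a pairwise term in the energy lets one label's value change the prediction for another.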


Statistical Methods On Risk Management Of Extreme Events, Zijing Zhang Jul 2017

Doctoral Dissertations

The goal of the dissertation is the investigation of financial risk analysis methodologies, using the schemes for extreme value modeling as well as techniques from copula modeling. Extreme value theory is concerned with probabilistic and statistical questions related to unusual behavior or rare events. The subject has a rich mathematical theory and also a long tradition of applications in a variety of areas. We are interested in its application in risk management, with a focus on estimating and forecasting the Value-at-Risk of financial time series data. Extremal data are inherently scarce, thus making inference challenging. In order to obtain …


Explorations Into Machine Learning Techniques For Precipitation Nowcasting, Aditya Nagarajan Mar 2017

Masters Theses

Recent advances in cloud-based big-data technologies now make data-driven solutions feasible for increasing numbers of scientific computing applications. One such data-driven approach is machine learning, where patterns in large data sets are brought to the surface by finding complex mathematical relationships within the data. Nowcasting, or short-term prediction of rainfall in a given region, is an important problem in meteorology. In this thesis we explore the nowcasting problem through a data-driven approach by formulating it as a machine learning problem.

State-of-the-art nowcasting systems today are based on numerical models which describe the physical processes leading to …


Inference In Networking Systems With Designed Measurements, Chang Liu Mar 2017

Doctoral Dissertations

Networking systems, which consist of network infrastructures and end-hosts, have been essential in supporting our daily communication, delivering huge amounts of content and large numbers of services, and providing large-scale distributed computing. To monitor and optimize the performance of such networking systems, or to provide flexible functionalities for the applications running on top of them, it is important to know the internal metrics of the networking systems such as link loss rates or path delays. The internal metrics are often not directly available due to the scale and complexity of the networking systems. This motivates the techniques of inference …


Wind Power Capacity Value Metrics And Variability: A Study In New England, Frederick W. Letson Nov 2015

Doctoral Dissertations

Capacity value is the contribution of a power plant to the ability of the power system to meet high demand. As wind power penetration increases in New England and worldwide, so does the importance of identifying the capacity contribution made by wind power plants. It is critical to accurately characterize the capacity value of these wind power plants and the variability of the capacity value over the long term. This is important in order to avoid the cost of keeping extra power plants operational while still being able to cover the demand for power reliably. This capacity value calculation is …


Variable Selection In Single Index Varying Coefficient Models With Lasso, Peng Wang Nov 2015

Doctoral Dissertations

The single index varying coefficient model is a very attractive statistical model due to its ability to reduce dimensions and its ease of interpretation. There are many theoretical studies and practical applications of it, but typically without variable selection features, and no public software is available for fitting it. Here we propose a new algorithm to fit the single index varying coefficient model and to carry out variable selection in the index part with LASSO. The core idea is a two-step scheme which alternates between estimating coefficient functions and selecting-and-estimating the single index. Both in simulation and in application to a Geoscience dataset, we …


Robust Optimization Of Biological Protocols, Patrick Flaherty, Ronald W. Davis Jan 2015

Mathematics and Statistics Department Faculty Publication Series

When conducting high-throughput biological experiments, it is often necessary to develop a protocol that is both inexpensive and robust. Standard approaches are either not cost-effective or arrive at an optimized protocol that is sensitive to experimental variations. Here, we describe a novel approach that directly minimizes the cost of the protocol while ensuring the protocol is robust to experimental variation. Our approach uses a risk-averse conditional value-at-risk criterion in a robust parameter design framework. We demonstrate this approach on a polymerase chain reaction protocol and show that our improved protocol is less expensive than the standard protocol and more robust …
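
To make the risk-averse criterion mentioned above concrete, here is a minimal sketch of one common sample estimate of conditional value-at-risk: the mean of the worst (1 − α) fraction of sampled costs. The two "protocols", their cost samples, and the function name are invented for illustration; the paper's actual optimization formulation is not reproduced here.

```python
import math


def empirical_cvar(costs, alpha=0.9):
    """Mean of the worst (1 - alpha) fraction of sampled costs."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    # Round before taking the ceiling so that floating-point noise in
    # (1 - alpha) * n does not add or drop a sample from the tail.
    k = max(1, math.ceil(round((1.0 - alpha) * len(costs), 9)))
    worst = sorted(costs, reverse=True)[:k]
    return sum(worst) / k


# Hypothetical per-run costs under experimental variation.
protocol_a = [10.0] * 9 + [30.0]   # cheaper on average, occasionally very costly
protocol_b = [13.0] * 10           # slightly dearer, but stable

mean_a = sum(protocol_a) / len(protocol_a)
mean_b = sum(protocol_b) / len(protocol_b)
cvar_a = empirical_cvar(protocol_a, alpha=0.8)
cvar_b = empirical_cvar(protocol_b, alpha=0.8)
```

In this made-up comparison protocol A has the lower average cost but the higher tail cost, so a criterion based on the tail would favor protocol B.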


Incorporating Boltzmann Machine Priors For Semantic Labeling In Images And Videos, Andrew Kae Aug 2014

Doctoral Dissertations

Semantic labeling is the task of assigning category labels to regions in an image. For example, a scene may consist of regions corresponding to categories such as sky, water, and ground, or parts of a face such as eyes, nose, and mouth. Semantic labeling is an important mid-level vision task for grouping and organizing image regions into coherent parts. Labeling these regions allows us to better understand the scene itself as well as properties of the objects in the scene, such as their parts, location, and interaction within the scene. Typical approaches for this task include the conditional random field …


Spatial And Temporal Correlations Of Freeway Link Speeds: An Empirical Study, Piotr J. Rachtan Jan 2012

Masters Theses 1911 - February 2014

Congestion on roadways and high level of uncertainty of traffic conditions are major considerations for trip planning. The purpose of this research is to investigate the characteristics and patterns of spatial and temporal correlations and also to detect other variables that affect correlation in a freeway setting. 5-minute speed aggregates from the Performance Measurement System (PeMS) database are obtained for two directions of an urban freeway – I-10 between Santa Monica and Los Angeles, California. Observations are for all non-holiday weekdays between January 1st and June 30th, 2010. Other variables include traffic flow, ramp locations, number of lanes and the …
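
A spatial and temporal correlation between two freeway links can be sketched as a lagged correlation of their speed series. The abstract does not say which correlation measure the thesis uses, so the Pearson coefficient, the function names, and the synthetic 5-minute speeds below are all assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def lagged_correlation(upstream, downstream, lag):
    """Correlate upstream speed at interval t with downstream speed at t + lag."""
    if lag == 0:
        return pearson(upstream, downstream)
    return pearson(upstream[:-lag], downstream[lag:])


# Synthetic 5-minute speed aggregates (mph) for two adjacent links; the
# downstream link repeats the upstream pattern one interval later.
upstream = [60, 58, 45, 30, 35, 50, 62, 61]
downstream = [61, 60, 58, 45, 30, 35, 50, 62]

same_time = lagged_correlation(upstream, downstream, lag=0)
one_interval_later = lagged_correlation(upstream, downstream, lag=1)
```

Scanning over several lags and link pairs in this way gives a simple picture of how far, and how quickly, a speed drop propagates along the corridor.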


Determinants Of Health Care Use Among Rural, Low-Income Mothers And Children: A Simultaneous Systems Approach To Negative Binomial Regression Modeling, Swetha Valluri Jan 2011

Masters Theses 1911 - February 2014

The determinants of health care use among rural, low-income mothers and their children were assessed using a multi-state, longitudinal data set, Rural Families Speak. The results indicate that rural mothers’ decisions regarding health care utilization for themselves and for their child can be best modeled using a simultaneous systems approach to negative binomial regression. Mothers’ visits to a health care provider increased with higher self-assessed depression scores, increased number of child’s doctor visits, greater numbers of total children in the household, greater numbers of chronic conditions, need for prenatal or post-partum care, development of a new medical condition, and …


Dynamic Model Pooling Methodology For Improving Aberration Detection Algorithms, Brenton J. Sellati Jan 2010

Masters Theses 1911 - February 2014

Syndromic surveillance is defined generally as the collection and statistical analysis of data which are believed to be leading indicators for the presence of deleterious activities developing within a system. Conceptually, syndromic surveillance can be applied to any discipline in which it is important to know when external influences manifest themselves in a system by forcing it to depart from its baseline. Comparisons of syndromic surveillance systems have led to mixed results, where models that dominate in one performance metric are often sorely deficient in another. This results in a zero-sum trade-off where one performance metric must be afforded greater …


A Study Of Indoor Carbon Dioxide Levels And Sick Leave Among Office Workers, Theodore A. Myatt, John W. Staudenmayer, Kate Adams, Michael Walters, Stephen N. Rudnick, Donald K. Milton Oct 2002

John W Staudenmayer

Background: A previous observational study detected a strong positive relationship between sick leave absences and carbon dioxide (CO2) concentrations in office buildings in the Boston area. The authors speculated that the observed association was due to a causal effect associated with low dilution ventilation, perhaps increased airborne transmission of respiratory infections. This study was undertaken to explore this association. Methods: We conducted an intervention study of indoor CO2 levels and sick leave among hourly office workers employed by a large corporation. Outdoor air supply rates were adjusted periodically to increase the range of CO2 concentrations. We recorded indoor CO2 concentrations …