Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

University of Massachusetts Amherst

2017

Articles 1 - 12 of 12

Full-Text Articles in Statistics and Probability

Deep Energy-Based Models For Structured Prediction, David Belanger Nov 2017

Doctoral Dissertations

We introduce structured prediction energy networks (SPENs), a flexible framework for structured prediction. A deep architecture is used to define an energy function over candidate outputs and predictions are produced by gradient-based energy minimization. This deep energy captures dependencies between labels that would lead to intractable graphical models, and allows us to automatically discover discriminative features of the structured output. Furthermore, practitioners can explore a wide variety of energy function architectures without having to hand-design prediction and learning methods for each model. This is because all of our prediction and learning methods interact with the energy …
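The idea of predicting by gradient-based energy minimization can be illustrated with a minimal sketch. The quadratic energy below, the names `local_scores` and `coupling`, and the step size are hypothetical choices made for illustration only; they are not the deep energy architecture or the training procedure from the dissertation.

```python
import numpy as np

def energy(y, local_scores, coupling):
    # Toy energy (an assumption, not the SPEN architecture):
    # a per-label term plus a pairwise term that couples labels.
    return -local_scores @ y + 0.5 * y @ coupling @ y

def energy_grad(y, local_scores, coupling):
    # Gradient of the toy energy with respect to y,
    # valid when `coupling` is a symmetric matrix.
    return -local_scores + coupling @ y

def predict(local_scores, coupling, steps=200, lr=0.1):
    # Relax binary labels to the box [0, 1]^L and run projected
    # gradient descent: take a gradient step, then clip back into the box.
    y = np.full(local_scores.shape, 0.5)
    for _ in range(steps):
        y = np.clip(y - lr * energy_grad(y, local_scores, coupling), 0.0, 1.0)
    return y

if __name__ == "__main__":
    scores = np.array([1.0, -1.0])   # hypothetical per-label scores
    coupling = np.eye(2)             # hypothetical label interaction
    print(predict(scores, coupling))
```

Because prediction only needs the energy and its gradient, swapping in a different energy function leaves `predict` unchanged, which is the flexibility the abstract describes.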


Juvenile River Herring In Freshwater Lakes: Sampling Approaches For Evaluating Growth And Survival, Matthew T. Devine Oct 2017

Masters Theses

River herring, collectively alewives (Alosa pseudoharengus) and blueback herring (A. aestivalis), have experienced substantial population declines over the past five decades due in large part to overfishing, combined with other sources of mortality, and disrupted access to critical freshwater spawning habitats. Anadromous river herring populations are currently assessed by counting adults in rivers during upstream spawning migrations, but no field-based assessment methods exist for estimating juvenile densities in freshwater nursery habitats. Counts of 4-year-old migrating adults are variable and hinder understanding of how mortality acts on different life stages prior to returning to spawn (e.g., juveniles …


Modelling Bird Migration With Motus Data And Bayesian State-Space Models, Justin Baldwin Oct 2017

Masters Theses

Bird migration is a poorly understood yet important phenomenon, as understanding movement patterns of birds can inform conservation strategies and public health policy for animal-borne diseases. Recent advances in wildlife tracking technology, in particular the Motus system, have allowed researchers to track even small flying birds and insects with radio transmitters that weigh fractions of a gram. This system relies on a community-based distributed sensor network that detects tagged animals as they move through the detection nodes on journeys that range from small local movements to intercontinental migrations. The quantity of data generated by the Motus system is unprecedented, is on …


Multiple Testing Correction With Repeated Correlated Outcomes: Applications To Epigenetics, Katie Leap Oct 2017

Masters Theses

Epigenetic changes (specifically DNA methylation) have been associated with adverse health outcomes; however, unlike genetic markers that are fixed over the lifetime of an individual, methylation can change. Given that there are a large number of methylation sites, measuring them repeatedly introduces multiple testing problems beyond those that exist in a static genetic context. Using simulations of epigenetic data, we considered different methods of controlling the false discovery rate. We considered several underlying associations between an exposure and methylation over time.

We found that testing each site with a linear mixed effects model and then controlling the false discovery rate …
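As an illustration of false discovery rate control across many sites, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not say which procedures were compared, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the example p-values are all assumptions. The p-values stand in for per-site test results and are not data from the thesis.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    # Returns a boolean array marking which hypotheses are rejected
    # while controlling the false discovery rate at level alpha.
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                        # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds               # compare k-th smallest p to alpha*k/m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])         # largest rank meeting its threshold
        rejected[order[:k + 1]] = True           # step-up: reject all smaller p-values too
    return rejected

if __name__ == "__main__":
    # Hypothetical p-values, one per methylation site.
    site_p = [0.9, 0.03, 0.02, 0.024]
    print(benjamini_hochberg(site_p, alpha=0.05))
```

The standard guarantee for this procedure assumes independent or positively dependent tests, so with repeated correlated outcomes it is a baseline rather than a complete answer.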


Information Metrics For Predictive Modeling And Machine Learning, Kostantinos Gourgoulias Jul 2017

Doctoral Dissertations

The ever-increasing complexity of the models used in predictive modeling and data science and their use for prediction and inference has made the development of tools for uncertainty quantification and model selection especially important. In this work, we seek to understand the various trade-offs associated with the simulation of stochastic systems. Some trade-offs are computational, e.g., execution time of an algorithm versus accuracy of simulation. Others are analytical: whether or not we are able to find tractable substitutes for quantities of interest, e.g., distributions, ergodic averages, etc. The first two chapters of this thesis deal with the study of the …


Statistical Methods On Risk Management Of Extreme Events, Zijing Zhang Jul 2017

Doctoral Dissertations

The goal of the dissertation is the investigation of financial risk analysis methodologies, using the schemes for extreme value modeling as well as techniques from copula modeling. Extreme value theory is concerned with probabilistic and statistical questions related to unusual behavior or rare events. The subject has a rich mathematical theory and also a long tradition of applications in a variety of areas. We are interested in its application in risk management, with a focus on estimating and forecasting the Value-at-Risk of financial time series data. Extremal data are inherently scarce, thus making inference challenging. In order to obtain …


Statistical Methods For High Dimensional Data Arising From Large Epidemiological Studies, Hui Xu Jul 2017

Doctoral Dissertations

In this thesis, we propose statistical models for addressing commonly encountered data types and study designs in large epidemiologic investigations aimed at understanding the molecular basis of complex disorders. The motivating applications come from diverse disease areas in Women's Health, including the study of type II diabetes in the Women's Health Initiative (WHI), invasive breast cancer in the Nurses' Health Study and the study of the metabolomic underpinnings of cardiovascular disease in the WHI. We have also put significant effort into making the implementation of the proposed methods accessible through freely available, user-friendly software packages in R. The first chapter …


Explorations Into Machine Learning Techniques For Precipitation Nowcasting, Aditya Nagarajan Mar 2017

Masters Theses

Recent advances in cloud-based big-data technologies now make data-driven solutions feasible for increasing numbers of scientific computing applications. One such data-driven approach is machine learning, where patterns in large data sets are brought to the surface by finding complex mathematical relationships within the data. Nowcasting, or short-term prediction of rainfall in a given region, is an important problem in meteorology. In this thesis we explore the nowcasting problem through a data-driven approach by formulating it as a machine learning problem.

State-of-the-art nowcasting systems today are based on numerical models which describe the physical processes leading to …


Inference In Networking Systems With Designed Measurements, Chang Liu Mar 2017

Doctoral Dissertations

Networking systems, which consist of network infrastructures and end-hosts, have been essential in supporting our daily communication, delivering huge amounts of content and large numbers of services, and providing large-scale distributed computing. To monitor and optimize the performance of such networking systems, or to provide flexible functionalities for the applications running on top of them, it is important to know the internal metrics of the networking systems such as link loss rates or path delays. The internal metrics are often not directly available due to the scale and complexity of the networking systems. This motivates the techniques of inference …


Inference From Network Data In Hard-To-Reach Populations, Isabelle Beaudry Mar 2017

Doctoral Dissertations

The objective of this thesis is to develop methods to make inference about the prevalence of an outcome of interest in hard-to-reach populations. The proposed methods address issues specific to the survey strategies employed to access those populations. One common sampling methodology used in this context is respondent-driven sampling (RDS). Under RDS, the network connecting members of the target population is used to uncover the hidden members. Specialized techniques are then used to make inference from the data collected in this fashion. Our first objective is to correct traditional RDS prevalence estimators and their associated uncertainty estimators for …


White Blood Cell DNA Methylation And Risk Of Breast Cancer In The Prostate, Lung, Colorectal, And Ovarian Cancer Screening Trial (PLCO), Susan R. Sturgeon, J. Richard Pilsner, Kathleen F. Arcaro, Kaoru Ikuma, Haotian Wu, Soon-Mi Kim, Nayha Chopra-Tandon, Adam R. Karpf, Regina G. Ziegler, Catherine Schairer, Raji Balasubramanian, David A. Reckhow Jan 2017

Biostatistics and Epidemiology Faculty Publications Series

Background

Several studies have suggested that global DNA methylation in circulating white blood cells (WBC) is associated with breast cancer risk.

Methods

To address conflicting results and concerns that the findings for WBC DNA methylation in some prior studies may reflect disease effects, we evaluated the relationship between global levels of WBC DNA methylation and breast cancer risk in a case-control study nested within the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO) cohort. A total of 428 invasive breast cancer cases and 419 controls, frequency matched on age at entry (55–59, 60–64, 65–69, ≥70 …


Alcohol Consumption And Breast Tumor Gene Expression, Jun Wang, Yujing J. Heng, A. Heather Eliassen, Rull M. Tamimi, Aditi Hazra, Vincent J. Carey, Christine B. Ambrosone, Victor P. De Andrade, Adam Brufsky, Fergus J. Couch, Tari A. King, Francesmary Modugno, Celine M. Vachon, David J. Hunter, Andrew H. Beck, Susan E. Hankinson Jan 2017

Biostatistics and Epidemiology Faculty Publications Series

Background

Alcohol consumption is an established risk factor for breast cancer and the association generally appears stronger among estrogen receptor (ER)-positive tumors. However, the biological mechanisms underlying this association are not completely understood.

Methods

We analyzed messenger RNA (mRNA) microarray data from both invasive breast tumors (N = 602) and tumor-adjacent normal tissues (N = 508) from participants diagnosed with breast cancer in the Nurses’ Health Study (NHS) and NHSII. Multivariable linear regression, controlling for other known breast cancer risk factors, was used to identify differentially expressed genes by pre-diagnostic alcohol intake. For pathway analysis, we performed gene …