Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Statistics and Probability

Theses/Dissertations

2017

Articles 1 - 24 of 24

Full-Text Articles in Life Sciences

Of Rats And Men, Thomas S. Walsh Dec 2017

Capstones

This capstone is a data-driven investigation into New York City's rat problem. By using publicly available government data to map rat activity in NYC, I identified several socio-economic variables that correlate with rat populations at the community-district, borough, and city scales. I used these findings (mainly that rat problems are linked to lower incomes) as the basis of an investigation, which includes interviews with residents, experts, and city officials. Prof. Bobby Corrigan, an urban rodentologist formerly with the NYC Department of Health, criticizes the city's efforts on the record for the first time.

https://thomasseiyawalsh.wixsite.com/ratstone


Seasonal Resource Selection And Habitat Treatment Use By A Fringe Population Of Greater Sage-Grouse, Rhett Boswell Dec 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Movement and habitat selection by Greater Sage-grouse (Centrocercus urophasianus) are of great interest to wildlife managers tasked with applying conservation measures for this iconic western species. Current technology has produced small, lightweight Global Positioning System (GPS) transmitters that can be attached to sage-grouse. Using GIS software and statistical programs such as Program R, land managers can analyze GPS location data to assess how sage-grouse are geospatially interacting with their habitats. Within the Panguitch Sage-Grouse Management Area (SGMA), thousands of acres of land have been restored or manipulated to enhance sage-grouse habitat; this usually involves removal of pinyon pine …


Functional Data Analysis Methods For Predicting Disease Status., Sarah Kendrick Dec 2017

Electronic Theses and Dissertations

Introduction: Differential scanning calorimetry (DSC) is used to determine thermally-induced conformational changes of biomolecules within a blood plasma sample. Recent research has indicated that DSC curves (or thermograms) may have different characteristics based on disease status and, thus, may be useful as a monitoring and diagnostic tool for some diseases. Since thermograms are curves measured over a range of temperature values, they are often considered as functional data. In this dissertation we propose and apply functional data analysis (FDA) techniques to analyze DSC data from the Lupus Family Registry and Repository (LFRR). The aim is to develop FDA methods to …


Modelling Bird Migration With Motus Data And Bayesian State-Space Models, Justin Baldwin Oct 2017

Masters Theses

Bird migration is a poorly-known yet important phenomenon, as understanding movement patterns of birds can inform conservation strategies and public health policy for animal-borne diseases. Recent advances in wildlife tracking technology, in particular the Motus system, have allowed researchers to track even small flying birds and insects with radio transmitters that weigh fractions of a gram. This system relies on a community-based distributed sensor network that detects tagged animals as they move through the detection nodes on journeys that range from small local movements to intercontinental migrations. The quantity of data generated by the Motus system is unprecedented, is on …


Multiple Testing Correction With Repeated Correlated Outcomes: Applications To Epigenetics, Katie Leap Oct 2017

Masters Theses

Epigenetic changes (specifically DNA methylation) have been associated with adverse health outcomes; however, unlike genetic markers that are fixed over the lifetime of an individual, methylation can change. Given that there are a large number of methylation sites, measuring them repeatedly introduces multiple testing problems beyond those that exist in a static genetic context. Using simulations of epigenetic data, we considered different methods of controlling the false discovery rate. We considered several underlying associations between an exposure and methylation over time.

We found that testing each site with a linear mixed effects model and then controlling the false discovery rate …


A Tail-Based Test For Differential Expression Analysis And Pathway Analysis In RNA-Sequencing Data, Jiong Chen Aug 2017

Dissertations & Theses (Open Access)

RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and pathway analysis. Such data at the exon level are usually heavy-tailed and correlated. Conventional statistical tests based on the mean or median difference for differential expression likely suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for …


Genomic And Physiological Approaches To Improve Drought Tolerance In Soybean, Avjinder Kaler Aug 2017

Graduate Theses and Dissertations

Drought stress is a major global constraint for crop production, and improving crop tolerance to drought is of critical importance. Direct selection of drought tolerance among genotypes for yield is limited because of low heritability, polygenic control, epistasis effects, and genotype by environment interactions. Crop physiology can play a major role for improving drought tolerance through the identification of traits associated with drought tolerance that can be used as indirect selection criteria in a breeding program. Carbon isotope ratio (δ13C, associated with water use efficiency), oxygen isotope ratio (δ18O, associated with transpiration), canopy temperature (CT), canopy wilting, and canopy coverage …


A Comparison Of Five Statistical Methods For Predicting Stream Temperature Across Stream Networks, Maike F. Holthuijzen Aug 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The health of freshwater aquatic systems, particularly stream networks, is mainly influenced by water temperature, which controls biological processes and influences species distributions and aquatic biodiversity. Thermal regimes of rivers are likely to change in the future, due to climate change and other anthropogenic impacts, and our ability to predict stream temperatures will be critical in understanding distribution shifts of aquatic biota. Spatial statistical network models take into account spatial relationships but have drawbacks, including high computation times and data pre-processing requirements. Machine learning techniques and generalized additive models (GAM) are promising alternatives to the SSN model. Two machine learning …


Identifying Three-Way Gene Interactions From Microarray Data Using Kolmogorov-Smirnov And Cross-Match Tests, Shubhashree Khadka Aug 2017

Graduate Theses and Dissertations

The human gene network is much more complex than pairwise interactions among genes. Zhang et al. [6] extracted microarray data from the International Genomics Consortium (IGC) and presented the detection of three-way gene interactions in their paper using Fisher’s z-transformation test. Three-way gene interactions are closer than pairwise correlations in representing complex gene structures. Additionally, they are more tractable than interactions among four or more genes. In this paper, we simulate different models where Fisher’s test might not be as effective. Zhang et al.’s approach utilized Pearson’s correlation coefficients and involved detection of linear interactions only. Since gene …


Environmentally-Driven Variation In The Population Dynamics Of Gulf Menhaden (Brevoortia patronus), Grant D. Adams Aug 2017

Master's Theses

Gulf Menhaden (Brevoortia patronus) is an abundant forage fish distributed throughout the Northern Gulf of Mexico (NGOM). Gulf Menhaden support the second largest fishery, by weight, in the United States and represent a key linkage between upper and lower trophic levels. Variation in the population dynamics can, therefore, pose consequences for the ecology and economy in the NGOM. Here we aim to understand variation in the individual and population dynamics of Gulf Menhaden throughout ontogeny and how such variation relates to environmental processes. We utilized a suite of fishery-dependent and -independent, remote sensing, modeled, and in situ data …


Statistical Methods For High Dimensional Data Arising From Large Epidemiological Studies, Hui Xu Jul 2017

Doctoral Dissertations

In this thesis, we propose statistical models for addressing commonly encountered data types and study designs in large epidemiologic investigations aimed at understanding the molecular basis of complex disorders. The motivating applications come from diverse disease areas in Women's Health, including the study of type II diabetes in the Women's Health Initiative (WHI), invasive breast cancer in the Nurses' Health Study and the study of the metabolomic underpinnings of cardiovascular disease in the WHI. We have also put significant effort into making the implementation of the proposed methods accessible through freely available, user-friendly software packages in R. The first chapter …


Effects Of Ankle Weights On Metabolic Response And Muscle Activity On A Lower Body Positive Pressure Treadmill 2017, Saige Hupman May 2017

Master's Theses

Lower body positive pressure (LBPP) treadmills are growing in popularity for rehabilitative use, as the benefits of exercising at partially supported body weight may induce faster recovery. It is unknown if there are certain practices that increase exercise intensity while maintaining positive effects of LBPP. Adding ankle weights when walking or running could increase intensity of rehabilitation programs while maintaining the comfort of supported body weight. PURPOSE: To measure metabolic response (VO2, RER, HR, Caloric expenditure), RPE, and lower limb electromyography (EMG) amplitudes of LBPP treadmill walking and running with and without ankle weights. METHODS: Sixteen participants (Age: 21.94 ± …


Denoising Tandem Mass Spectrometry Data, Felix Offei May 2017

Electronic Theses and Dissertations

Protein identification using tandem mass spectrometry (MS/MS) has proven to be an effective way to identify proteins in a biological sample. An observed spectrum is constructed from the data produced by the tandem mass spectrometer. A protein can be identified if the observed spectrum aligns with the theoretical spectrum. However, data generated by the tandem mass spectrometer are affected by errors, which makes protein identification a challenge in the field of proteomics. Some of these errors include miscalibration of the instrument, instrument distortion, and noise. In this thesis, we present a pre-processing method, which focuses on the removal of noisy …


Statistical Methods For Two Problems In Cancer Research: Analysis Of RNA-Seq Data From Archival Samples And Characterization Of Onset Of Multiple Primary Cancers, Jialu Li May 2017

Dissertations & Theses (Open Access)

My dissertation is focused on quantitative methodology development and application for two important topics in translational and clinical cancer research.

The first topic was motivated by the challenge of applying transcriptome sequencing (RNA-seq) to formalin-fixed, paraffin-embedded (FFPE) tumor samples for reliable diagnostic development. We designed a biospecimen study to directly compare gene expression results from different protocols to prepare libraries for RNA-seq from human breast cancer tissues, with randomization to fresh-frozen (FF) or FFPE conditions. To comprehensively evaluate the FFPE RNA-seq data quality for expression profiling, we developed multiple computational methods for assessment, such as the uniformity and continuity …


Spatially Explicit Population Estimates Of The Florida Black Bear, Jacob Michael Humm May 2017

Masters Theses

The Florida black bear (Ursus americanus floridanus) currently comprises 7 isolated subpopulations: Apalachicola, Eglin, Osceola, Ocala/St. Johns, Chassahowitzka, Highlands/Glades, and Big Cypress. The last statewide assessment of Florida black bear population dynamics was conducted by Simek et al. (2005) using traditional capture-mark-recapture methods. The subspecies was removed from Florida’s List of State Threatened Species in 2012 contingent upon the formulation of a management plan that would maintain viable subpopulations of black bears in suitable habitat. Accurate population estimates for each of the remaining black bear subpopulations in Florida were needed to achieve the management goals of …


Novel Statistical Approaches For Missing Values In Truncated High-Dimensional Metabolomics Data With A Detection Threshold., Jasmit Sureshkumar Shah May 2017

Electronic Theses and Dissertations

Despite considerable advances in high-throughput technology over the last decade, new challenges have emerged related to the analysis, interpretation, and integration of high-dimensional data. The arrival of omics datasets has contributed to the rapid improvement of systems biology, which seeks the understanding of complex biological systems. Metabolomics is an emerging omics field, where mass spectrometry technologies generate high-dimensional datasets. As this area advances, better analysis methods are needed to provide correct and adequate results. While in other omics sectors such as genomics or proteomics there has been and continues to be critical understanding …


Network Exploration Of Correlated Multivariate Protein Data For Alzheimer's Disease Association, Matthew J. Lane Apr 2017

Theses

Alzheimer's disease (AD) is difficult to diagnose using genetic testing or other traditional methods. Unlike diseases with simple genetic risk components, there exists no single marker determining whether someone will develop AD. Furthermore, AD is highly heterogeneous, and different subgroups of individuals develop the disease due to differing factors. Traditional diagnostic methods based on perceivable cognitive deficiencies are often too little, too late because the brain has already suffered damage from decades of disease progression. In order to observe AD at early stages prior to the observation of cognitive deficiencies, biomarkers with greater accuracy are required. By using …


A Simulation Of Anthropogenic Mammoth Extinction, Matthew Klapman Apr 2017

Undergraduate Honors Papers

There are multiple hypotheses as to why the Columbian Mammoth (Mammuthus columbi) and other megafauna in North America went extinct relatively recently and relatively quickly. The most popular are disease, climate change, meteorite strikes, and overhunting by humans [2, 9]. There is evidence to show that a combination of factors contributed to the megafaunal extinction, but the “overkill” hypothesis explores the idea that early humans migrated onto the continent and then hunted the mammoths and other megafauna to extinction. The overkill hypothesis was first proposed by anthropologist Paul Martin in 1973 [8]. Evidence from radiocarbon dating shows that the …


The In Vivo Effect Of Oil Palm Phenolics (OPP) In Atherogenic Diet Induced Rats Model Of Alzheimer’s Disease (AD), Yan Wu Jan 2017

Wayne State University Dissertations

Alzheimer’s disease (AD) is the most common cause of dementia in the aging population. It is characterized by cognitive decline and deposition of β-amyloid plaques in the hippocampus. It has been shown that hypercholesterolemia induced by a high-cholesterol diet is associated with AD development. Increased levels of oxidative stress have also been observed in AD patients. An important strategy to treat or delay the impairment is based on dietary modification, using food supplements. Oil palm phenolics (OPP), a water-soluble fraction from oil palm fruit that is rich in phenolics, has been found to possess significant antioxidant activities. Its beneficial effects on cardiovascular diseases, diabetes …


Statistical Analyses To Detect And Refine Genetic Associations With Neurodegenerative Diseases, Yuriko Katsumata Jan 2017

Theses and Dissertations--Epidemiology and Biostatistics

Dementia is a clinical state caused by neurodegeneration and characterized by a loss of function in cognitive domains and behavior. Alzheimer’s disease (AD) is the most common form of dementia. Although the amyloid β (Aβ) protein and hyperphosphorylated tau aggregates in the brain are considered to be the key pathological hallmarks of AD, the exact cause of AD is yet to be identified. In addition, clinical diagnoses of AD can be error-prone. Many previous studies have compared the clinical diagnosis of AD against the gold standard of autopsy confirmation and shown substantial AD misdiagnosis. Hippocampal sclerosis of aging (HS-Aging) …


Raman Spectroscopy And Chemometrics For Forensic Bloodstain Analysis : Species Differentiation, Donor Age Estimation, And Dating Of Bloodstains, Kyle C. Doty Jan 2017

Legacy Theses & Dissertations (2009 - 2024)

The field of forensic science is constantly growing, so the advancement of old and unreliable techniques is at the forefront of what will lead to future progress and improvement. Current methods for identification and analysis of bloodstains are underwhelming due to the insignificant amount of information provided in a destructive, unreliable, and unsafe manner. As is the purpose of this research, creating new methodologies that are rapid, nondestructive, robust, statistically reliable, and safe would significantly advance the way bloodstains are currently analyzed, while providing more useful and relevant information for investigations and criminal proceedings. Raman spectroscopy, along with advanced statistical …


Predictive Modeling Of Adolescent Cannabis Use From Multimodal Data, Philip Spechler Jan 2017

Graduate College Dissertations and Theses

Predicting teenage drug use is key to understanding the etiology of substance abuse. However, classic predictive modeling procedures are prone to overfitting and fail to generalize to independent observations. To mitigate these concerns, cross-validated logistic regression with elastic-net regularization was used to predict cannabis use by age 16 from a large sample of fourteen-year-olds (N = 1,319). High-dimensional data (p = 2,413) including parent and child psychometric data, child structural and functional MRI data, and genetic data (candidate single-nucleotide polymorphisms, "SNPs") collected at age 14 were used to predict the initiation of cannabis use (minimum six occasions) by age 16. …


Family-Based Association Studies Of Autism In Boys Via Facial-Feature Clusters, Luke Andrew Settles Jan 2017

Masters Theses

Autism spectrum disorder (ASD) refers to a set of developmental disorders with varied attributes. Due to its substantial heterogeneity in terms of behavioral and clinical phenotypes, it is challenging to discern the genetic biomarkers behind ASD, even though the disease is known to be genetic in nature. This serves as a motivation to detect relationships between single nucleotide polymorphisms (SNPs) and a causal autism disease susceptibility locus (DSL) within more homogeneous subgroups. Recently, clinically meaningful subclassifications of ASD have been discovered utilizing facial features of prepubescent boys. Therefore, through the employment of data from 44 prepubertal Caucasian boys with ASD …


A Functional Data Analytic Approach For Region Level Differential DNA Methylation Detection, Mohamed Salem F. Milad Jan 2017

Doctoral Dissertations

DNA methylation is an epigenetic modification that can alter gene expression without a DNA sequence change. The role of DNA methylation in biological processes and human health is important to understand, with many studies identifying associations between specific methylation patterns and diseases such as cancer. In mammals, DNA methylation almost always occurs when a methyl group attaches to a cytosine followed by a guanine (i.e., CpG dinucleotides) on the DNA sequence. Many statistical methods have been developed to test for a difference in DNA methylation levels between groups (e.g., healthy vs. disease) at individual cytosines. Site level testing is often …