Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

Open Access. Powered by Scholars. Published by Universities.®

Biostatistics

2018

Institution
Keyword
Publication
Publication Type

Articles 1 - 17 of 17

Full-Text Articles in Applied Statistics

A Generative Statistical Approach For Data Classification In A Biologically Inspired Design Tool, Marvin Manuel Arroyo Rujano Dec 2018

A Generative Statistical Approach For Data Classification In A Biologically Inspired Design Tool, Marvin Manuel Arroyo Rujano

Graduate Theses and Dissertations

The objective of the research this thesis describes is to find a way to classify text-based descriptions of biological adaption to support Biologically Inspired design. Biologically inspired design is a fairly new field with ongoing research. There are different tools to assist designers and biologists in bio-inspired design. Some of the most common are BioTRIZ and AskNature. In recent years, more tools have been proposed to aid and make research in the field easier, for example, the Biologically Inspired Adaptive System Design (BIASD) tool. This tool was designed with the goal of helping designers in early design stages generate more …


Spatio-Temporal Reconstruction Of Remote Sensing Observations, Kamrul Khan Dec 2018

Spatio-Temporal Reconstruction Of Remote Sensing Observations, Kamrul Khan

Graduate Theses and Dissertations

The USDA Forest Service aims to use satellite imagery for monitoring and predicting changes in forest conditions over time within the country. We specifically focus on a 230, 400 hectares region in north-central Wisconsin between 2003 - 2012. The auxiliary data collected from the satellite imagery of this region are relatively dense in space and time and can be used to efficiently predict how the forest condition changed over that decade. However, these records have a significant proportion of missing values due to weather conditions and system failures. To fill in these missing values, we build spaciotemporal models based on …


Bias Assessment And Reduction In Kernel Smoothing, Wenkai Ma Nov 2018

Bias Assessment And Reduction In Kernel Smoothing, Wenkai Ma

Electronic Thesis and Dissertation Repository

When performing local polynomial regression (LPR) with kernel smoothing, the choice of the smoothing parameter, or bandwidth, is critical. The performance of the method is often evaluated using the Mean Square Error (MSE). Bias and variance are two components of MSE. Kernel methods are known to exhibit varying degrees of bias. Boundary effects and data sparsity issues are two potential problems to watch for. There is a need for a tool to visually assess the potential bias when applying kernel smooths to a given scatterplot of data. In this dissertation, we propose pointwise confidence intervals for bias and demonstrate a …


Analysis Of Ranked Gene Tree Probability Distributions Under The Coalescent Process For Detecting Anomaly Zones, Anastasiia Kim Nov 2018

Analysis Of Ranked Gene Tree Probability Distributions Under The Coalescent Process For Detecting Anomaly Zones, Anastasiia Kim

Shared Knowledge Conference

In phylogenetic studies, gene trees are used to reconstruct species tree. Under the multispecies coalescent model, gene trees topologies may differ from that of species trees. The incorrect gene tree topology (one that does not match the species tree) that is more probable than the correct one is termed anomalous gene tree (AGT). Species trees that can generate such AGTs are said to be in the anomaly zone (AZ). In this region, the method of choosing the most common gene tree as the estimate of the species tree will be inconsistent and will converge to an incorrect species tree when …


Resource Assessment Report Temperate Demersal Elasmobranch Resource Of Western Australia, Matias Braccini, Nick Blay, S. A. Hesp, Brett Molony Nov 2018

Resource Assessment Report Temperate Demersal Elasmobranch Resource Of Western Australia, Matias Braccini, Nick Blay, S. A. Hesp, Brett Molony

Fisheries research reports

This document provides a cumulative description and assessment of the TDER and all of the fishing activities (i.e. fisheries / fishing sectors) affecting this resource in WA. Future Resource Assessment Reports will assess the Statewide Sharks and Rays Resource. The report is focused on the temperate indicator species (whiskery, gummy, dusky and sandbar sharks) used to assess the suites of demersal sharks and rays that comprise this resource. These species are primarily captured by demersal gillnets used in the TDGDLF that operate in the West Coast and South Coast Bioregions. For the North Coast bioregion, no commercial fishing for sharks …


Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data, Caroline Kusiak Oct 2018

Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data, Caroline Kusiak

Masters Theses

Dengue fever affects over 390 million people annually worldwide and is of particu- lar concern in Southeast Asia where it is one of the leading causes of hospitalization. Modeling trends in dengue occurrence can provide valuable information to Public Health officials, however many challenges arise depending on the data available. In Thailand, reporting of dengue cases is often delayed by more than 6 weeks, and a small fraction of cases may not be reported until over 11 months after they occurred. This study shows that incorporating data on Google Search trends can improve dis- ease predictions in settings with severely …


Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor Aug 2018

Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor

Electronic Theses and Dissertations

Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve a basis for making higher order inferences in metabolomics, …


Regression Analysis For Ordinal Outcomes In Matched Study Design: Applications To Alzheimer's Disease Studies, Elizabeth Austin Jul 2018

Regression Analysis For Ordinal Outcomes In Matched Study Design: Applications To Alzheimer's Disease Studies, Elizabeth Austin

Masters Theses

Alzheimer's Disease (AD) affects nearly 5.4 million Americans as of 2016 and is the most common form of dementia. The disease is characterized by the presence of neurofibrillary tangles and amyloid plaques [1]. The amount of plaques are measured by Braak stage, post-mortem. It is known that AD is positively associated with hypercholesterolemia [16]. As statins are the most widely used cholesterol-lowering drug, there may be associations between statin use and AD. We hypothesize that those who use statins, specifically lipophilic statins, are more likely to have a low Braak stage in post-mortem analysis.

In order to address this hypothesis, …


Australian Herring And West Australian Salmon Scientific Workshop Report, October 2017, Department Of Primary Industries And Regional Development, Western Australia Jul 2018

Australian Herring And West Australian Salmon Scientific Workshop Report, October 2017, Department Of Primary Industries And Regional Development, Western Australia

Fisheries research reports

No abstract provided.


Analysis Challenges For High Dimensional Data, Bangxin Zhao Apr 2018

Analysis Challenges For High Dimensional Data, Bangxin Zhao

Electronic Thesis and Dissertation Repository

In this thesis, we propose new methodologies targeting the areas of high-dimensional variable screening, influence measure and post-selection inference. We propose a new estimator for the correlation between the response and high-dimensional predictor variables, and based on the estimator we develop a new screening technique termed Dynamic Tilted Current Correlation Screening (DTCCS) for high dimensional variables screening. DTCCS is capable of picking up the relevant predictor variables within a finite number of steps. The DTCCS method takes the popular used sure independent screening (SIS) method and the high-dimensional ordinary least squares projection (HOLP) approach as its special cases.

Two methods …


Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia Apr 2018

Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia

Statistical Science Theses and Dissertations

This research contains two topics: (1) PBNPA: a permutation-based non-parametric analysis of CRISPR screen data; (2) RCRnorm: an integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data from FFPE samples.

Clustered regularly-interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single spe- cific algorithm has gained popularity. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms. We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level …


The Impact Of Truncating Data On The Predictive Ability For Single-Step Genomic Best Linear Unbiased Prediction, Jeremy T. Howard, Thomas A. Rathje, Caitlyn E. Bruns, Danielle F. Wilson-Wells, Stephen D. Kachman, Matthew L. Spangler Jan 2018

The Impact Of Truncating Data On The Predictive Ability For Single-Step Genomic Best Linear Unbiased Prediction, Jeremy T. Howard, Thomas A. Rathje, Caitlyn E. Bruns, Danielle F. Wilson-Wells, Stephen D. Kachman, Matthew L. Spangler

Department of Animal Science: Faculty Publications

Simulated and swine industry data sets were utilized to assess the impact of removing older data on the predictive ability of selection candidate estimated breeding values (EBV) when using single-step genomic best linear unbiased prediction (ssGBLUP). Simulated data included thirty replicates designed to mimic the structure of swine data sets. For the simulated data, varying amounts of data were truncated based on the number of ancestral generations back from the selection candidates. The swine data sets consisted of phenotypic and genotypic records for three traits across two breeds on animals born from 2003 to 2017. Phenotypes and genotypes were iteratively …


Improved Methods And Selecting Classification Types For Time-Dependent Covariates In The Marginal Analysis Of Longitudinal Data, I-Chen Chen Jan 2018

Improved Methods And Selecting Classification Types For Time-Dependent Covariates In The Marginal Analysis Of Longitudinal Data, I-Chen Chen

Theses and Dissertations--Epidemiology and Biostatistics

Generalized estimating equations (GEE) are popularly utilized for the marginal analysis of longitudinal data. In order to obtain consistent regression parameter estimates, these estimating equations must be unbiased. However, when certain types of time-dependent covariates are presented, these equations can be biased unless an independence working correlation structure is employed. Moreover, in this case regression parameter estimation can be very inefficient because not all valid moment conditions are incorporated within the corresponding estimating equations. Therefore, approaches using the generalized method of moments or quadratic inference functions have been proposed for utilizing all valid moment conditions. However, we have found that …


Step-Selection Functions For Modeling Animal Movement -- Case Study: African Buffalo, Maia Adar Jan 2018

Step-Selection Functions For Modeling Animal Movement -- Case Study: African Buffalo, Maia Adar

CMC Senior Theses

Understanding what factors influence wildlife movement allows landscape planners to make informed decisions that benefit both animals and humans. New quantitative methods, such as step-selection functions, provide valuable objective analyses of wildlife connectivity. This paper provides a framework for creating a step-selection function and demonstrates its use in a case study. The first section provides a general introduction about wildlife connectivity research. The second section explains the math behind the step-selection function using a simple example. The last section gives the results of a step-selection model for African buffalo in the Kavango Zambezi Transfrontier Conservation Area. Buffalo were found to …


Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry Jan 2018

Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry

Theses and Dissertations

The Brisbane Longitudinal Twin Study (BLTS) was being conducted in Australia and was funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as a part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed for the purpose of analyzing a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects.) The primary goal was to determine what combination of SNPs and …


Joint Analysis Of Multiple Phenotypes In Association Studies, Xiaoyu Liang Jan 2018

Joint Analysis Of Multiple Phenotypes In Association Studies, Xiaoyu Liang

Dissertations, Master's Theses and Master's Reports

Genome-wide association studies (GWAS) have become a very effective research tool to identify genetic variants of underlying various complex diseases. In spite of the success of GWAS in identifying thousands of reproducible associations between genetic variants and complex disease, in general, the association between genetic variants and a single phenotype is usually weak. It is increasingly recognized that joint analysis of multiple phenotypes can be potentially more powerful than the univariate analysis, and can shed new light on underlying biological mechanisms of complex diseases. Therefore, developing statistical methods to test for genetic association with multiple phenotypes has become increasingly important. …


Improved Standard Error Estimation For Maintaining The Validities Of Inference In Small-Sample Cluster Randomized Trials And Longitudinal Studies, Whitney Ford Tanner Jan 2018

Improved Standard Error Estimation For Maintaining The Validities Of Inference In Small-Sample Cluster Randomized Trials And Longitudinal Studies, Whitney Ford Tanner

Theses and Dissertations--Epidemiology and Biostatistics

Data arising from Cluster Randomized Trials (CRTs) and longitudinal studies are correlated and generalized estimating equations (GEE) are a popular analysis method for correlated data. Previous research has shown that analyses using GEE could result in liberal inference due to the use of the empirical sandwich covariance matrix estimator, which can yield negatively biased standard error estimates when the number of clusters or subjects is not large. Many techniques have been presented to correct this negative bias; However, use of these corrections can still result in biased standard error estimates and thus test sizes that are not consistently at their …