Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Keyword
- Kriging (2)
- AUC (1)
- Antimicrobial Resistance (1)
- Bayesian (1)
- Bayesian Maximum Entropy (1)
- Beta-binomial (1)
- Bi-dimensional Regression (1)
- Binary response (1)
- Causal structure modeling (1)
- Design of Experiments (1)
- Diagnostic tests (1)
- EM algorithm (1)
- Fake news (1)
- Full Information Maximum Likelihood (1)
- GWAS (1)
- Genomic Prediction (1)
- Group testing (1)
- H measure (1)
- HMM (1)
- Hidden Markov Model (1)
- Infectious disease (1)
- Lasso (1)
- Level of agreement (1)
- Microbiome (1)
- Multidimensional Scaling (1)
- Natural language processing (1)
- Network (1)
- Optimization (1)
- Orthogonal (1)
- Overidentification (1)
Articles 1 - 14 of 14
Full-Text Articles in Physical Sciences and Mathematics
Exploring Experimental Design And Multivariate Analysis Techniques For Evaluating Community Structure Of Bacteria In Microbiome Data, Kelsey Karnik
Department of Statistics: Dissertations, Theses, and Student Work
The gut microbiome plays a crucial role in human health, and by working collaboratively with microbiologists, we aim to further our understanding of the human gut and its impact on human health. Promoting a diverse microbiome is emphasized throughout microbiology literature, and involving a statistician in designing experiments to relate gut bacteria and some measured health outcome is crucial for ensuring valid and accurate results. By adopting new experimental design and analysis methods, researchers can begin to gain a deeper understanding of how the genetics of our food affect the composition of taxa within the gut microbiome. This dissertation is …
Examining The Effect Of Word Embeddings And Preprocessing Methods On Fake News Detection, Jessica Hauschild
Department of Statistics: Dissertations, Theses, and Student Work
The words people choose to use hold a lot of power, whether that be in spreading truth or deception. As listeners and readers, we do our best to understand how words are being used. There are many current methods in computer science literature attempting to embed words into numerical information for statistical analyses. Some of these embedding methods, such as Bag of Words, treat words as independent, while others, such as Word2Vec, attempt to gain information about the context of words. It is of interest to compare how well these various methods of translating text into numerical data work specifically …
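As a minimal sketch of the Bag of Words idea mentioned above (each word is counted independently, with no context), the following assumes whitespace tokenization and lowercasing. The function name `bag_of_words` and the two sample sentences are illustrative assumptions, not taken from the dissertation.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count_matrix) for a list of text documents.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption made for illustration only.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # The vocabulary is the sorted set of all distinct tokens.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        # One row per document, one column per vocabulary word.
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix


# Illustrative documents (made up for this example).
docs = ["the news is fake", "the news is real news"]
vocab, X = bag_of_words(docs)
print(vocab)
print(X)
```

Word order is discarded entirely in this representation, which is the sense in which the method treats words as independent; context-aware embeddings such as Word2Vec are not sketched here.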
Statistical Methodology To Establish A Benchmark For Evaluating Antimicrobial Resistance Genes Through Real Time PCR Assay, Enakshy Dutta
Department of Statistics: Dissertations, Theses, and Student Work
Novel diagnostic tests are usually compared with gold standard tests to evaluate diagnostic accuracy. For assessing antimicrobial resistance (AMR) in bovine respiratory disease (BRD) pathogens, the phenotypic broth microdilution method is used as the gold standard (GS). The objective of this thesis is to identify the optimal cycle threshold (Ct), generated by real-time polymerase chain reaction (rtPCR) for genes that confer resistance, that translates to the phenotypic classification of AMR. Data from two different methodologies are assessed to identify the Ct that discriminates between resistance (R) and susceptibility (S). First, the receiver operating characteristic (ROC) curve was used to determine the …
Using Stability To Select A Shrinkage Method, Dean Dustin
Department of Statistics: Dissertations, Theses, and Student Work
Shrinkage methods are estimation techniques based on optimizing expressions to find which variables to include in an analysis, typically a linear regression. The general form of these expressions is the sum of an empirical risk and a complexity penalty based on the number of parameters. Many shrinkage methods are known to satisfy an ‘oracle’ property, meaning that asymptotically they select the correct variables and estimate their coefficients efficiently. In Section 1.2, we show oracle properties in two general settings. The first uses a log likelihood in place of the empirical risk and allows a general class of penalties. The second …
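The "empirical risk plus complexity penalty" form described above can be written generically as follows. The notation (loss $\ell$, penalty $P$, tuning parameter $\lambda$) is an assumption chosen for illustration and is not necessarily the dissertation's; the lasso is shown only as one familiar special case.

```latex
% Generic penalized objective (assumed notation):
%   \ell    = per-observation loss (e.g., squared error)
%   P       = complexity penalty on the coefficients
%   \lambda = tuning parameter, \lambda \ge 0
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \left\{ \frac{1}{n} \sum_{i=1}^{n} \ell\!\left(y_i, x_i^{\top}\beta\right)
            + \lambda \, P(\beta) \right\}

% Special case, the lasso: squared-error loss with an L1 penalty.
\hat{\beta}_{\text{lasso}}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \left\{ \frac{1}{n} \sum_{i=1}^{n} \left(y_i - x_i^{\top}\beta\right)^{2}
            + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
```

Replacing the empirical risk with a negative log likelihood gives the likelihood-based variant that the abstract mentions as its first setting.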
Group Testing Identification: Objective Functions, Implementation, And Multiplex Assays, Brianna D. Hitt
Department of Statistics: Dissertations, Theses, and Student Work
Group testing is the process of combining items into groups to test for a binary characteristic. One of its most widely used applications is infectious disease testing. In this context, specimens (e.g., blood, urine) are amalgamated into groups and tested. For groups that test positive, there are many algorithmic retesting procedures available to identify positive individuals. The appeal of group testing is that the overall number of tests needed is significantly less than for individual testing when disease prevalence is small and an appropriate algorithm is chosen. Group testing has a number of applications beyond infectious disease testing, such as …
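To illustrate why pooling saves tests at low prevalence, here is a small sketch of classic two-stage (Dorfman) testing: each group is tested once, and every member of a positive group is retested individually. This particular procedure, the perfect-assay assumption, and the function names are assumptions made for illustration; the dissertation considers a broader range of algorithms.

```python
def dorfman_tests_per_person(p, k):
    """Expected number of tests per individual under two-stage pooling.

    Assumes independent individuals with prevalence p, a group size of k,
    and a perfectly accurate assay (illustrative assumptions).
    """
    if k == 1:
        return 1.0  # group size 1 is just individual testing
    # One pooled test shared by k people, plus k retests whenever the
    # pool is positive, which happens with probability 1 - (1 - p)**k.
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_group_size(p, max_k=50):
    """Group size in 1..max_k that minimizes expected tests per person."""
    return min(range(1, max_k + 1),
               key=lambda k: dorfman_tests_per_person(p, k))


for prevalence in (0.01, 0.05, 0.5):
    k = best_group_size(prevalence)
    print(prevalence, k, round(dorfman_tests_per_person(prevalence, k), 4))
```

Under these assumptions, a prevalence of 1% is best served by groups of 11, needing roughly 0.2 tests per person, while at 50% prevalence pooling offers no saving and individual testing (group size 1) is selected.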
Optimal Design For A Causal Structure, Zaher Kmail
Department of Statistics: Dissertations, Theses, and Student Work
Linear models and mixed models are important statistical tools. But in many natural phenomena, there is more than one endogenous variable involved and these variables are related in a sophisticated way. Structural Equation Modeling (SEM) is often used to model the complex relationships between the endogenous and exogenous variables. It was first implemented in research to estimate the strength and direction of direct and indirect effects among variables and to measure the relative magnitude of each causal factor.
Historically, traditional optimal design theory focuses on univariate linear, nonlinear, and mixed models. There is no current literature on the subject of …
Methods To Account For Breed Composition In A Bayesian GWAS Method Which Utilizes Haplotype Clusters, Danielle F. Wilson-Wells
Department of Statistics: Dissertations, Theses, and Student Work
In livestock, prediction of an animal’s genetic merit using genomic information is becoming increasingly common. The models used to make these predictions typically assume that we are sampling from a homogeneous population. However, in both commercial and experimental populations the sire and dam of an individual may be a mixture of different breeds. Haplotype models can capture this population structure.
Two models based on breed-specific haplotype clusters were developed to account for differences across multiple breeds. The first model utilizes the breed composition of the individual, while the second utilizes the breed composition from the sire and dam. Haplotype …
Beta-Binomial Kriging: A New Approach To Modeling Spatially Correlated Proportions, Aimee Schwab
Department of Statistics: Dissertations, Theses, and Student Work
Spatially correlated count data sets appear often in applied data analysis problems, but there is little consensus in the literature about how best to analyze the data. The two prevailing approaches provide accurate parameter estimates and predictions, at the cost of model interpretability and simplicity. This dissertation will present a new approach to modeling spatially correlated binomial observations: beta-binomial kriging. The model proposed here is a modified form of spatial kriging which assumes the data are generated from a correlated beta-binomial distribution. Given this assumption, the spatial parameters and predicted values can be estimated using simple matrix algebra. Beta-binomial kriging …
A New Approach To Modeling Multivariate Time Series On Multiple Temporal Scales, Tucker Zeleny
Department of Statistics: Dissertations, Theses, and Student Work
In certain situations, observations are collected on a multivariate time series at a certain temporal scale. However, there may also exist underlying time series behavior on a larger temporal scale that is of interest. Often, identifying the behavior of the data over the course of the larger scale is the key objective. Because this large-scale trend is not directly observed, describing the trends of the data on this scale can be more difficult. To further complicate matters, the observed data on the smaller time scale may be unevenly spaced from one larger scale time point to the …
New Statistical Methods For Analysis Of Historical Data From Wildlife Populations, Trevor Hefley
Department of Statistics: Dissertations, Theses, and Student Work
Wildlife biologists, often with the help of ordinary citizens, have developed and maintained long-term datasets for monitoring the status of wildlife populations. These datasets can range from a collection of citizen-reported sightings of a rare species to datasets collected by biologists using standardized methods. The commonality is that these datasets span a temporal and spatial scale that is beyond the scope of most scientific studies. Ensuring the continued persistence of wildlife populations requires predictions of the impact of human actions. Regardless of whether the predictions are quantitative or qualitative, the best we can do is use the past data to …
A Test For Detecting Changes In Closed Networks Based On The Number Of Communications Between Nodes, Christopher S. Wichman
Department of Statistics: Dissertations, Theses, and Student Work
This dissertation presents a formal method for detecting changes in a closed communications network based on an “abnormal” shift in the number of communications between some of the nodes. The method relies on the analyst’s ability to define the network of interest, capture the number of communications between nodes, and establish a history of normal communications flow between nodes over fixed intervals of time. A metric multi-dimensional scaling technique is then used to represent the network at each time interval with a k-dimensional (k = 1, 2, …) configuration. The affine bi-dimensional regression coefficient of determination (aR²) …
Informative Retesting For Hierarchical Group Testing, Michael S. Black
Department of Statistics: Dissertations, Theses, and Student Work
Group testing is the process of pooling samples (e.g., blood, chemical compounds) from multiple sources and testing the pooled material for some binary characteristic. It is used in pathogen screening for humans and animals, drug discovery studies, electrical systems testing, and many other applications. Group testing has traditionally been used for two main types of investigations: 1) the identification of positive specimens and 2) the estimation of a characteristic’s prevalence in a population. This dissertation focuses on the identification process. We propose new identification procedures that exploit the heterogeneity among samples in order to reduce the number of tests needed …
A Comparison Of Spatial Prediction Techniques Using Both Hard And Soft Data, Megan L. Liedtke Tesar
Department of Statistics: Dissertations, Theses, and Student Work
The overall goal of this research, which is common to most spatial studies, is to predict a value of interest at an unsampled location based on measured values at nearby sampled locations. To accomplish this goal, ordinary kriging can be used to obtain the best linear unbiased predictor. However, there is often a large amount of variability surrounding the measurements of environmental variables, and traditional prediction methods, such as ordinary kriging, do not account for an attribute with more than one level of uncertainty. This dissertation addresses this limitation by introducing a new methodology called weighted kriging. This prediction technique …
Sequence Comparison And Stochastic Model Based On Multi-Order Markov Models, Xiang Fang
Department of Statistics: Dissertations, Theses, and Student Work
This dissertation presents two statistical methodologies developed on multi-order Markov models. First, we introduce an alignment-free sequence comparison method, which represents a sequence using a multi-order transition matrix (MTM). The MTM contains information on multi-order dependencies and provides a comprehensive representation of the heterogeneous composition within a sequence. Based on the MTM, a distance measure is developed for pairwise comparison of sequences. The new method is compared with the traditional maximum likelihood (ML) method, the complete composition vector (CCV) method, and the improved version of the complete composition vector (ICCV) method using simulated sequences. We further illustrate the application of …
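As a rough illustration of the building block behind Markov-based sequence representations, the sketch below estimates transition probabilities of a chosen order from a single sequence by simple counting. It is not the dissertation's MTM, whose exact construction is not given in the abstract; the function name `transition_probs` and the example sequence are assumptions.

```python
from collections import Counter, defaultdict


def transition_probs(seq, order=1):
    """Estimate P(next symbol | previous `order` symbols) by counting.

    Hypothetical helper for illustration: plain maximum likelihood
    estimates with no smoothing for unseen contexts.
    """
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        context = seq[i:i + order]   # the preceding `order` symbols
        nxt = seq[i + order]         # the symbol that follows them
        counts[context][nxt] += 1
    # Normalize each context's counts into probabilities.
    return {
        context: {sym: c / sum(nexts.values()) for sym, c in nexts.items()}
        for context, nexts in counts.items()
    }


# Illustrative DNA-like sequence (made up for this example).
seq = "ACGTACGA"
first_order = transition_probs(seq, order=1)
second_order = transition_probs(seq, order=2)
print(first_order)
print(second_order)
```

Estimating such tables for several orders and combining them is one plausible way to capture multi-order dependencies, though how the MTM actually combines them is not stated in the abstract.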