Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Keyword
- Missing data (2)
- Model Selection (2)
- Recurrent Neural Networks (2)
- AIC (1)
- Average Causal Effect (1)
- Batch Normalization (1)
- Bayesian Adjustment for Confounding (1)
- Bayesian modeling (1)
- Bayesian nonparametric model (1)
- Biomarker (1)
- Bootstrap Calibration (1)
- Central expectile subspace (1)
- Central mean subspace (1)
- Central quantile subspace (1)
- Central subspace (1)
- Clustered data (1)
- Confidence Interval (1)
- Confidence sets (1)
- Convolutional Neural Networks (1)
- Covariate Adjustment (1)
- Deep Learning (1)
- Degrees of Freedom (1)
- Differential abundance analysis (1)
- Empirical distribution (1)
- Equivariant Hilbert series (1)
- Exploding Gradients (1)
- Gene expression (1)
- Generalized Fiducial Inference (1)
- Hierarchical models (1)
- Lack-of-fit (1)
Articles 1 - 14 of 14
Full-Text Articles in Physical Sciences and Mathematics
Measuring Variability In Model Performance Measures, Matthew Rutledge
Theses and Dissertations--Statistics
As data become increasingly available, statisticians are confronted with both larger sample sizes and larger numbers of predictors. While both of these factors are beneficial in building better predictive models and allowing for better inference, models can become difficult to interpret and often include variables of little practical significance. This dissertation provides methods that help model builders better understand and select from a collection of candidate models. We study the asymptotic distribution of AIC and propose a graphical tool to assist practitioners in comparing and contrasting candidate models. Real-world examples show how this graphic might be used and a …
Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu
Theses and Dissertations--Statistics
A common problem in regression analysis (linear or nonlinear) is assessing the lack-of-fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem to a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and a constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …
Statistical Intervals For Various Distributions Based On Different Inference Methods, Yixuan Zou
Theses and Dissertations--Statistics
Statistical intervals (e.g., confidence, prediction, or tolerance) are widely used to quantify uncertainty, but complex settings can create challenges to obtain such intervals that possess the desired properties. My thesis will address diverse data settings and approaches that are shown empirically to have good performance. We first introduce a focused treatment on using a single-layer bootstrap calibration to improve the coverage probabilities of two-sided parametric tolerance intervals for non-normal distributions. We then turn to zero-inflated data, which are commonly found in, among other areas, pharmaceutical and quality control applications. However, the inference problem often becomes difficult in the presence of …
Algebraic And Geometric Properties Of Hierarchical Models, Aida Maraj
Theses and Dissertations--Mathematics
In this dissertation, filtrations of ideals arising from hierarchical models in statistics related by a group action are studied. These filtrations lead to ideals in polynomial rings in infinitely many variables, which require innovative tools. Regular languages and finite automata are used to prove and explicitly compute the rationality of some multivariate power series that record important quantitative information about the ideals. Some work regarding Markov bases for non-reducible models is shown, together with advances in the polyhedral geometry of binary hierarchical models.
Bayesian Kinetic Modeling For Tracer-Based Metabolomic Data, Xu Zhang
Theses and Dissertations--Statistics
Kinetic modeling of the time dependence of metabolite concentrations including the unstable isotope labeled species is an important approach to simulate metabolic pathway dynamics. It is also essential for quantitative metabolic flux analysis using tracer data. However, as the metabolic networks are complex, including extensive compartmentation and interconnections, the parameter estimation for enzymes that catalyze individual reactions needed for kinetic modeling is challenging. As the parameter space is large and multi-dimensional while kinetic data are comparatively sparse, the estimation procedure (especially the point estimation methods) often encounters multiple local maxima such that standard maximum likelihood methods may yield …
Moment Kernels For T-Central Subspace, Weihang Ren
Theses and Dissertations--Statistics
The T-central subspace allows one to perform sufficient dimension reduction for any statistical functional of interest. We propose a general estimator using a third moment kernel to estimate the T-central subspace. In particular, in this dissertation we develop sufficient dimension reduction methods for the central mean subspace via the regression mean function, the central subspace via the Fourier transform, the central quantile subspace via a quantile estimator, and the central expectile subspace via an expectile estimator. Theoretical results are established and simulation studies show the advantages of our proposed methods.
Simultaneous Tolerance Intervals For Response Surface And Mixture Designs Using The Adjusted Product Set Method, Aisaku Nakamura
Theses and Dissertations--Statistics
Various methods for constructing simultaneous tolerance intervals for regression models have been developed over the years, but all of them can be shown to be conservative. In this thesis, extensive simulations are conducted to evaluate the degree of conservatism with respect to their coverage probabilities. A new strategy to fit simultaneous tolerance intervals on linear models is proposed by modifying an existing method, which we call the adjusted product set (APS) method. The APS method will also be used to construct simultaneous tolerance bands on response surface and mixture designs.
Unitary And Symmetric Structure In Deep Neural Networks, Kehelwala Dewage Gayan Maduranga
Theses and Dissertations--Mathematics
Recurrent neural networks (RNNs) have been successfully used on a wide range of sequential data problems. A well-known difficulty in using RNNs is the vanishing or exploding gradient problem. Recently, several RNN architectures have been proposed that try to mitigate this issue by maintaining an orthogonal or unitary recurrent weight matrix. One such architecture is the scaled Cayley orthogonal recurrent neural network (scoRNN), which parameterizes the orthogonal recurrent weight matrix through a scaled Cayley transform. This parametrization contains a diagonal scaling matrix consisting of positive or negative one entries that cannot be optimized by gradient descent. Thus the …
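To make the scaled Cayley parametrization mentioned in this abstract concrete, the following is a minimal NumPy sketch, assuming the commonly stated form W = (I + A)^{-1}(I - A)D with A skew-symmetric and D diagonal with entries of plus or minus one. The function name, matrix size, and the particular signs in D are hypothetical choices for illustration and are not taken from the dissertation.

```python
import numpy as np

def scaled_cayley(A, D):
    """Return W = (I + A)^{-1} (I - A) D.

    A is assumed skew-symmetric and D diagonal with +1/-1 entries,
    in which case W is orthogonal.
    """
    n = A.shape[0]
    I = np.eye(n)
    # Solve (I + A) X = (I - A) instead of forming an explicit inverse.
    return np.linalg.solve(I + A, I - A) @ D

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M - M.T                            # skew-symmetric by construction
D = np.diag([1.0, 1.0, -1.0, -1.0])    # hypothetical fixed sign pattern
W = scaled_cayley(A, D)

# Distance of W^T W from the identity; should be near machine precision.
orth_error = np.linalg.norm(W.T @ W - np.eye(4))
```

In this sketch only A would be a trainable quantity; the signs in D are discrete, which is consistent with the abstract's remark that they cannot be optimized by gradient descent.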
Orthogonal Recurrent Neural Networks And Batch Normalization In Deep Neural Networks, Kyle Eric Helfrich
Theses and Dissertations--Mathematics
Despite the recent success of various machine learning techniques, there are still numerous obstacles that must be overcome. One obstacle is known as the vanishing/exploding gradient problem. This problem refers to gradients that either shrink toward zero or grow without bound. This is a well-known problem that commonly occurs in Recurrent Neural Networks (RNNs). In this work we describe how this problem can be mitigated, establish three different architectures that are designed to avoid this issue, and derive update schemes for each architecture. Another portion of this work focuses on the often-used technique of batch normalization. Although found to be successful …
Nonparametric Analysis Of Clustered And Multivariate Data, Yue Cui
Theses and Dissertations--Statistics
In this dissertation, we investigate three distinct but interrelated problems for nonparametric analysis of clustered data and multivariate data in pre-post factorial design.
In the first project, we propose a nonparametric approach for one-sample clustered data in pre-post intervention design. In particular, we consider the situation where for some clusters all members are only observed at either pre or post intervention but not both. This type of clustered data is referred to as partially complete clustered data. Unlike most of its parametric counterparts, we do not assume specific models for data distributions, intra-cluster dependence structure or variability, in effect …
Cancer Phylogenetic Analysis Based On RNA-Seq Data, Tingting Zhai
Theses and Dissertations--Statistics
Studying tumor evolution is a major task to understand the biological mechanism of carcinogenesis, develop new cancer therapies, and prevent drug resistance. We focus on two important questions in tumor evolution. The first question is to quantify intra-tumor heterogeneity, in which multiple subclones of tumor cells with distinct transcriptomic profiles coexist. Another question is to estimate the temporal order of alteration of key cancer pathways during tumor evolution. We present a new statistical method to 1) reconstruct the evolutionary history and population frequency of the subclonal lineages of tumor cells and 2) infer temporal order of pathway alterations in tumor evolution for …
Semiparametric And Nonparametric Methods For Comparing Biomarker Levels Between Groups, Yuntong Li
Theses and Dissertations--Statistics
Comparing the distribution of biomarker measurements between two groups under either an unpaired or paired design is a common goal in many biomarker studies. However, analyzing biomarker data is sometimes challenging because the data may not be normally distributed and may contain a large fraction of zero values or missing values. Although several statistical methods have been proposed, they either require a normality assumption or are inefficient. We propose a novel two-part semiparametric method for data under an unpaired setting and a nonparametric method for data under a paired setting. The semiparametric method considers a two-part model, a logistic regression for …
Estimation Of The Treatment Effect With Bayesian Adjustment For Covariates, Li Xu
Theses and Dissertations--Statistics
The Bayesian adjustment for confounding (BAC) is a Bayesian model averaging method to select and adjust for confounding factors when evaluating the average causal effect of an exposure on a certain outcome. We extend the BAC method to time-to-event outcomes. Specifically, the posterior distribution of the exposure effect on a time-to-event outcome is calculated as a weighted average of posterior distributions from a number of candidate proportional hazards models, weighting each model by its ability to adjust for confounding factors. The Bayesian Information Criterion based on the partial likelihood is used to compare different models and approximate the Bayes factor. …
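As a rough illustration of the model-averaging idea in this abstract, the sketch below converts BIC values into approximate posterior model weights, assuming equal prior model probabilities, and then averages point estimates of the exposure effect. The function names and all numbers are hypothetical. The sketch averages point estimates rather than full posterior distributions, and it is plain BIC-based model averaging, not the dissertation's BAC procedure.

```python
import math

def bic_weights(bics):
    """Approximate posterior model weights from BIC values.

    Assumes equal prior probabilities across models, so that
    weight_k is proportional to exp(-BIC_k / 2).
    """
    best = min(bics)
    # Subtract the smallest BIC first for numerical stability.
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]

def averaged_effect(estimates, bics):
    """Weighted average of per-model effect estimates using BIC weights."""
    weights = bic_weights(bics)
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical log hazard ratio estimates from three candidate models,
# with hypothetical BIC values for each model.
estimates = [0.40, 0.55, 0.62]
bics = [210.3, 208.1, 209.0]
effect = averaged_effect(estimates, bics)
```

Models with lower BIC receive larger weight, so the averaged estimate is pulled toward the better-supported models while still reflecting model uncertainty.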
Measuring Change: Prediction Of Early Onset Sepsis, Aric Schadler
Theses and Dissertations--Statistics
Sepsis occurs in a patient when an infection enters the bloodstream and spreads throughout the body, causing a cascading response from the immune system. Sepsis is one of the leading causes of morbidity and mortality in today’s hospitals. This is despite published and accepted guidelines for timely and appropriate interventions for septic patients. The largest barrier to applying these interventions is the early identification of septic patients. Early identification and treatment lead to better outcomes, shorter lengths of stay, and financial savings for healthcare institutions. In order to increase the lead time in recognizing patients trending towards septicemia …