Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 20 of 20

Full-Text Articles in Physical Sciences and Mathematics

Nonparametric Methods For Analysis And Sizing Of Cluster Randomization Trials With Baseline Measurements, Chengchun Yu Sep 2023

Electronic Thesis and Dissertation Repository

Cluster randomization trials are popular when the intervention must be implemented at the cluster level, when logistical, financial, and/or ethical reasons dictate randomization at the cluster level, or when contamination must be minimized. It is very common for cluster trials to take measurements before randomization and again at follow-up, resulting in a clustered pretest-posttest design. For continuous outcomes, the cluster-adjusted analysis of covariance approach can be used to adjust for accidental bias and improve efficiency. However, a direct application of this method is nonsensical if the measures are incompatible with an interval scale, yet …
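As a point of reference for the continuous-outcome case mentioned above, here is a minimal sketch of a cluster-adjusted analysis of covariance: collapse to cluster means and adjust the treatment comparison for the baseline mean. The data, column names, and use of statsmodels are illustrative assumptions; this is not the nonparametric methodology developed in the thesis.

```python
# Sketch: a cluster-mean ANCOVA for a clustered pretest-posttest design.
# Illustrates the standard continuous-outcome baseline the abstract refers to,
# not the thesis's nonparametric methods. Data and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# One row per subject: cluster id, treatment arm (0/1), pretest, posttest (toy data).
df = pd.DataFrame({
    "cluster": [1, 1, 2, 2, 3, 3, 4, 4],
    "arm":     [1, 1, 1, 1, 0, 0, 0, 0],
    "pre":     [10.2, 9.8, 11.0, 10.5, 9.9, 10.1, 10.8, 10.3],
    "post":    [12.1, 11.7, 12.9, 12.3, 10.2, 10.6, 11.0, 10.7],
})

# Collapse to cluster means, then regress the posttest mean on arm, adjusting for the pretest mean.
means = df.groupby(["cluster", "arm"], as_index=False)[["pre", "post"]].mean()
fit = smf.ols("post ~ arm + pre", data=means).fit()
print(fit.summary())
```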


An Interval-Valued Random Forests, Paul Gaona Partida Aug 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

There is a growing demand for the development of new statistical models and the refinement of established methods to accommodate different data structures. This need arises from the recognition that traditional statistics often assume the value of each observation to be precise, which may not hold true in many real-world scenarios. Factors such as the collection process and technological advancements can introduce imprecision and uncertainty into the data.

For example, consider data collected over a long period of time, where newer measurement tools may offer greater accuracy and provide more information than previous methods. In such cases, it becomes crucial …


Nonparametric Estimation Of Elliptical Copulas, Panfeng Liang May 2023

Open Access Theses & Dissertations

Elliptical copulas provide flexibility in modeling the dependence structure of a random vector. They are often parameterized by a correlation matrix and a scalar function called the generator. Estimating the generator can be challenging because it is a functional parameter. In this dissertation, we provide a rigorous approach to estimating the generator in a Bayesian framework, which is simpler, more robust, and outperforms existing estimation methods in the literature. Building on this framework, other researchers may modify the model for other types of generators in their own research.
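For the correlation-matrix part of the parameterization, one standard rank-based step is to invert Kendall's tau using the relation rho = sin(pi * tau / 2), which holds for elliptical copulas. The sketch below, on simulated data, illustrates only that step; it does not implement the Bayesian generator estimate proposed in the dissertation.

```python
# Sketch: rank-based estimation of the correlation matrix of an elliptical copula
# via rho = sin(pi * tau / 2). Covers only the correlation matrix, not the generator.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean=[0, 0, 0],
                            cov=[[1, .5, .2], [.5, 1, .4], [.2, .4, 1]],
                            size=500)

d = x.shape[1]
rho = np.eye(d)
for i in range(d):
    for j in range(i + 1, d):
        tau, _ = kendalltau(x[:, i], x[:, j])
        rho[i, j] = rho[j, i] = np.sin(np.pi * tau / 2.0)
print(np.round(rho, 2))
```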


Parametric, Nonparametric, And Semiparametric Linear Regression In Classical And Bayesian Statistical Quality Control, Chelsea L. Jones Jan 2021

Theses and Dissertations

Statistical process control (SPC) is used in many fields to understand and monitor desired processes, such as manufacturing, public health, and network traffic. SPC is divided into two phases: in Phase I, historical data are used to inform parameter estimates for a statistical model, and in Phase II that model is used to monitor a live, ongoing process. Within both phases, profile monitoring is a method for understanding the functional relationship between response and explanatory variables by estimating and tracking its parameters. In profile monitoring, control charts are often used as graphical tools to visually observe process behavior. We construct a …
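A minimal sketch of the two phases on a plain individuals (Shewhart-style) chart, assuming simulated data; the dissertation's profile-monitoring charts track regression parameters rather than raw observations, so this shows only the basic Phase I/Phase II mechanic.

```python
# Sketch: the two SPC phases for a simple individuals control chart.
# Phase I estimates limits from historical data; Phase II monitors new observations.
import numpy as np

rng = np.random.default_rng(1)
phase1 = rng.normal(10.0, 0.5, size=200)           # historical, in-control data
center = phase1.mean()
sigma = phase1.std(ddof=1)
ucl, lcl = center + 3 * sigma, center - 3 * sigma  # 3-sigma control limits

phase2 = rng.normal(10.0, 0.5, size=50)
phase2[30:] += 1.5                                 # simulate a process shift
out_of_control = np.where((phase2 > ucl) | (phase2 < lcl))[0]
print("signals at Phase II observations:", out_of_control)
```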


An Evaluation Of Knot Placement Strategies For Spline Regression, William Klein Jan 2021

CMC Senior Theses

Regression splines have an established value for producing a quality fit with relatively low-degree polynomials. This paper explores the implications of adopting new methods for knot selection in tandem with established methodology from the current literature. Structural features of generated datasets, as well as residuals collected from sequential iterative models, are used to augment the equidistant knot selection process. From analyzing a simulated dataset and an application to the Racial Animus dataset, I find that a B-spline basis paired with equally spaced knots remains the best choice when data are evenly distributed, even when structural features of a dataset are known …
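A minimal sketch of spline regression with equally spaced interior knots, using a truncated-power cubic basis as a simple stand-in for the B-spline basis discussed in the paper; the data and knot count are hypothetical.

```python
# Sketch: least-squares spline regression with equally spaced interior knots,
# using a truncated-power cubic basis (a stand-in for a B-spline basis).
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, size=300))
y = np.sin(x) + rng.normal(0, 0.3, size=x.size)

knots = np.linspace(x.min(), x.max(), 8)[1:-1]    # 6 equally spaced interior knots

# Design matrix: 1, x, x^2, x^3, plus (x - knot)_+^3 for each interior knot.
X = np.column_stack([x**p for p in range(4)] +
                    [np.clip(x - k, 0, None)**3 for k in knots])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
print("residual SD:", np.std(y - fitted))
```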


Nonparametric Tests Of Lack Of Fit For Multivariate Data, Yan Xu Jan 2020

Theses and Dissertations--Statistics

A common problem in regression analysis (linear or nonlinear) is assessing lack of fit. Existing methods make parametric or semi-parametric assumptions to model the conditional mean or covariance matrices. In this dissertation, we propose fully nonparametric methods that make only additive error assumptions. Our nonparametric approach relies on ideas from nonparametric smoothing to reduce the test of association (lack-of-fit) problem to a nonparametric multivariate analysis of variance. A major problem that arises in this approach is that the key assumptions of independence and constant covariance matrix among the groups will be violated. As a result, the standard asymptotic theory is not …
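A rough illustration of the reduction idea, assuming simulated data: slice the covariate into windows and compare the multivariate responses across windows with a one-way MANOVA. The naive version below ignores the violated independence and equal-covariance assumptions that the dissertation's theory addresses.

```python
# Sketch: reducing a lack-of-fit check to a one-way MANOVA by grouping observations
# into covariate windows. Group definitions and data are hypothetical.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(3)
n = 300
x = rng.uniform(0, 1, size=n)
Y = np.column_stack([2 * x + rng.normal(0, 1, n),            # responses depend on x
                     np.sin(2 * np.pi * x) + rng.normal(0, 1, n)])

df = pd.DataFrame(Y, columns=["y1", "y2"])
df["group"] = pd.cut(x, bins=6, labels=False).astype(str)    # covariate windows as groups

# If the responses were unrelated to x, the group means would coincide.
res = MANOVA.from_formula("y1 + y2 ~ group", data=df).mv_test()
print(res)
```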


Surviving A Civil War: Expanding The Scope Of Survival Analysis In Political Science, Andrew B. Whetten Dec 2018

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Survival Analysis in the context of Political Science is frequently used to study the duration of agreements, political party influence, wars, senator term lengths, etc. This paper surveys a collection of methods implemented on a modified version of the Power-Sharing Event Dataset (which documents civil war peace agreement durations in the Post-Cold War era) in order to identify the research questions that are optimally addressed by each method. A primary comparison will be made between a Cox Proportional Hazards Model using some advanced capabilities in the glmnet package, a Survival Random Forest Model, and a Survival SVM. En route to …
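A minimal sketch of the first of the compared models, a Cox proportional hazards fit, using the Python lifelines package rather than the paper's R/glmnet workflow; the columns are hypothetical stand-ins for fields in the Power-Sharing Event Dataset.

```python
# Sketch: a Cox proportional hazards fit of the kind compared in the paper.
# Uses lifelines; the data and covariates below are invented for illustration.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "duration":   [12, 48, 7, 60, 25, 33, 5, 90],   # months until agreement failure
    "failed":     [1, 0, 1, 0, 1, 1, 1, 0],          # 1 = failure observed, 0 = censored
    "powershare": [1, 1, 0, 1, 0, 1, 0, 1],          # example covariates
    "gdp_pc":     [1.2, 3.4, 0.8, 2.9, 1.1, 2.2, 0.7, 3.8],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="failed")
cph.print_summary()
```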


Evaluation Of Using The Bootstrap Procedure To Estimate The Population Variance, Nghia Trong Nguyen May 2018

Electronic Theses and Dissertations

The bootstrap procedure is widely used in nonparametric statistics to generate an empirical sampling distribution from a given sample data set for a statistic of interest. Generally, the results are good for location parameters such as the population mean and median, and even for estimating a population correlation. However, the results for a population variance, which is a spread parameter, are not as good, due to the resampling nature of the bootstrap method. Bootstrap samples are constructed using sampling with replacement; consequently, groups of identical observations, which contribute zero variance, appear in these samples. As a result, a bootstrap variance estimator will carry a …
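A short simulation of the downward bias described above, assuming standard normal data; on average the bootstrap variance is roughly (n-1)/n times the sample variance.

```python
# Sketch: the downward bias of the plain bootstrap variance estimator.
# Resampling with replacement repeats observations, so bootstrap samples tend to be
# less spread out than the original sample.
import numpy as np

rng = np.random.default_rng(4)
n = 20
sample = rng.normal(0, 1, size=n)            # true variance = 1

boot_vars = [np.var(rng.choice(sample, size=n, replace=True), ddof=1)
             for _ in range(10_000)]
print("sample variance:        ", np.var(sample, ddof=1))
print("mean bootstrap variance:", np.mean(boot_vars))   # on average about (n-1)/n of it
```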


Offline And Online Density Estimation For Large High-Dimensional Data, Aref Majdara Jan 2018

Dissertations, Master's Theses and Master's Reports

Density estimation has wide applications in machine learning and data analysis, including clustering, classification, multimodality analysis, bump hunting, and anomaly detection. In high-dimensional spaces, the sparsity of data in local neighborhoods makes many parametric and nonparametric density estimation methods inefficient.

This work presents the development of computationally efficient algorithms for high-dimensional density estimation based on Bayesian sequential partitioning (BSP). A copula transform is used to separate the estimation of the marginal and joint densities, with the purpose of reducing the computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is …
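A minimal sketch of the copula-transform step, assuming simulated bivariate normal data: each marginal is mapped to (0, 1) with its empirical CDF so the copula density can be estimated separately from the marginals. The BSP partitioning itself is not shown; a coarse histogram stands in for it.

```python
# Sketch: the copula-transform step that separates marginal and joint density estimation.
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(5)
x = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=1000)

# Empirical-CDF (probability integral) transform, column by column.
u = np.column_stack([rankdata(x[:, j]) / (x.shape[0] + 1) for j in range(x.shape[1])])

# Any density estimator can now be applied to u on the unit square; a coarse
# histogram stands in here for the BSP step described in the dissertation.
copula_hist, _, _ = np.histogram2d(u[:, 0], u[:, 1], bins=10, density=True)
print(copula_hist.round(2))
```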


Examination And Comparison Of The Performance Of Common Non-Parametric And Robust Regression Models, Gregory F. Malek Aug 2017

Electronic Theses and Dissertations

This work investigated common alternatives to the least-squares regression method in the presence of non-normally distributed errors. An initial literature review identified a variety of alternative methods, including Theil Regression, Wilcoxon Regression, Iteratively Re-Weighted Least Squares, Bounded-Influence Regression, and Bootstrapping methods. These methods were evaluated using a simple simulated example data set, as well as various real data sets, including math proficiency data, Belgian telephone call data, and faculty …
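A minimal sketch of one of the alternatives named above, the Theil (Theil-Sen) slope estimator, on illustrative data with heavy-tailed errors and an outlier; scipy's theilslopes is used rather than whatever implementation the thesis evaluated.

```python
# Sketch: the Theil (Theil-Sen) slope estimator, resistant to non-normal errors and outliers.
import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(6)
x = np.arange(50, dtype=float)
y = 2.0 + 0.5 * x + rng.standard_t(df=2, size=x.size)   # heavy-tailed errors
y[45] += 40                                             # a gross outlier

slope, intercept, lo, hi = theilslopes(y, x)
print(f"Theil slope: {slope:.3f}  (95% CI {lo:.3f} to {hi:.3f}), intercept {intercept:.3f}")
```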


Tree-Based Regression For Interval-Valued Data, Chih-Ching Yeh Aug 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Regression methods for interval-valued data have been increasingly studied in recent years. Most existing work focuses on linear models, but many problems in practice are nonlinear in nature, so the development of nonlinear regression tools for interval-valued data is crucial. In this project, we propose a tree-based regression method for interval-valued data that applies to both linear and nonlinear problems. Unlike linear regression models, which usually require additional constraints to ensure positivity of the predicted interval length, the proposed method estimates the regression function in a nonparametric way, so the …
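A simplified sketch of the general idea under illustrative assumptions: fit separate regression trees to the interval centers and to the log half-ranges, so the reconstructed intervals always have positive length. This is a stand-in, not necessarily the exact estimator proposed in the report.

```python
# Sketch: tree-based regression for interval-valued responses via (center, log half-range).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(7)
n = 400
x = rng.uniform(0, 5, size=(n, 1))
center = np.sin(x[:, 0]) + rng.normal(0, 0.1, n)            # interval midpoints
half_range = 0.2 + 0.1 * x[:, 0] + rng.normal(0, 0.02, n)   # interval half-widths (> 0)

tree_c = DecisionTreeRegressor(max_depth=4).fit(x, center)
tree_r = DecisionTreeRegressor(max_depth=4).fit(x, np.log(half_range))

x_new = np.array([[1.0], [4.0]])
c_hat = tree_c.predict(x_new)
r_hat = np.exp(tree_r.predict(x_new))                       # back-transform keeps r_hat > 0
print(np.column_stack([c_hat - r_hat, c_hat + r_hat]))      # predicted [lower, upper]
```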


The Nonparametric Estimation Of Elliptical Distributions, Panfeng Liang Jan 2017

Open Access Theses & Dissertations

In practice, many multivariate datasets have identical marginal distributions, and elliptical distributions can be used to model many of them. In this thesis, we propose a Bayesian method using Markov chain Monte Carlo (MCMC) to estimate the density function underlying a multivariate dataset, under the assumption that it is an elliptical distribution.


Bayesian Nonparametric Approaches To Multiple Testing, Density Estimation, And Supervised Learning, William Cipolli Iii Jun 2016

Theses and Dissertations

This dissertation presents methods for several applications of Polya tree models. These novel nonparametric approaches to the problems of multiple testing, density estimation and supervised learning provide an alternative to other parametric and nonparametric models. In Chapter 2, the proposed approximate finite Polya tree multiple testing procedure is very successful in correctly classifying the observations with non-zero mean in a computationally efficient manner; this holds even when the non-zero means are simulated from a mean-zero distribution. Further, the model is capable of this for “interestingly different” observations in the cases where that is of interest. Chapter 3 proposes discrete, and …


Disk Diffusion Breakpoint Determination Using A Bayesian Nonparametric Variation Of The Errors-In-Variables Model, Glen Richard Depalma Oct 2013

Open Access Dissertations

Drug dilution (MIC) and disk diffusion (DIA) are the two most common antimicrobial susceptibility tests used by hospitals and clinics to determine an unknown pathogen's susceptibility to various antibiotics. Both tests use breakpoints to classify the pathogen as susceptible, intermediate, or resistant to each drug under consideration. While the determination of these drug-specific MIC classification breakpoints is straightforward, determination of comparable DIA breakpoints is not. It is this issue that motivates this research.

Traditionally, the error-rate bounded (ERB) method has been used to calibrate the two tests. This procedure involves determining DIA breakpoints which minimize the observed discrepancies between …
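A toy version of the discrepancy-counting idea behind the ERB calibration, with made-up data and breakpoints: for fixed MIC breakpoints, each candidate pair of DIA breakpoints is scored by how often the two tests disagree.

```python
# Sketch: scoring candidate DIA breakpoints by their discrepancy rate with MIC classifications.
# Breakpoint values, data, and the simple error metric are purely illustrative.
import numpy as np

rng = np.random.default_rng(8)
n = 500
log2_mic = rng.normal(1.0, 1.5, size=n)                  # log2 MIC values
dia = 30 - 4 * log2_mic + rng.normal(0, 2.5, size=n)     # zone diameters, inversely related

def classify(values, low, high, reverse=False):
    """Return 0 = susceptible, 1 = intermediate, 2 = resistant."""
    cls = np.digitize(values, [low, high])
    return 2 - cls if reverse else cls

mic_class = classify(log2_mic, 0.0, 2.0)                 # fixed MIC breakpoints (assumed)

best = None
for d_low in range(15, 25):                              # candidate DIA breakpoints
    for d_high in range(d_low + 1, 30):
        dia_class = classify(dia, d_low, d_high, reverse=True)  # large zones = susceptible
        discrepancies = np.mean(mic_class != dia_class)
        if best is None or discrepancies < best[0]:
            best = (discrepancies, d_low, d_high)
print("lowest discrepancy rate %.3f at DIA breakpoints (%d, %d)" % best)
```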


Comparison Of Two Samples By A Nonparametric Likelihood-Ratio Test, William H. Barton Jan 2010

University of Kentucky Doctoral Dissertations

In this dissertation we present a novel computational method, as well as its software implementation, to compare two samples by a nonparametric likelihood-ratio test. The basis of the comparison is a mean-type hypothesis. The software is written in the R language [4]. The two samples are assumed to be independent. Their distributions, which are assumed to be unknown, may be discrete or continuous. The samples may be uncensored, right-censored, left-censored, or doubly censored. Two software programs are offered. The first program covers the case of a single mean-type hypothesis. The second program covers the case of multiple mean-type hypotheses. For the first …


Nonparametric Confidence Intervals For The Reliability Of Real Systems Calculated From Component Data, Jean Spooner May 1987

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A methodology that calculates a point estimate and confidence intervals for system reliability directly from component failure data is proposed and evaluated. This is a nonparametric approach that does not require the component times to failure to follow a known reliability distribution.

The proposed methods have accuracy similar to the traditional parametric approaches, can be used when the distribution of component reliability is unknown or only a limited amount of sample component data is available, are simpler to compute, and use fewer computer resources. Depuy et al. (1982) studied several parametric approaches to calculating confidence intervals on system reliability. The test …
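One simple nonparametric route to such an interval, sketched for a two-component series system with hypothetical failure data: bootstrap each component's failure times and recompute the plug-in system reliability. This is a generic illustration, not necessarily the interval construction proposed in the thesis.

```python
# Sketch: a bootstrap confidence interval for series-system reliability at mission time t,
# computed directly from component failure data. Data and mission time are invented.
import numpy as np

rng = np.random.default_rng(9)
t = 100.0
comp1 = rng.exponential(scale=300.0, size=40)   # observed failure times, component 1
comp2 = rng.exponential(scale=500.0, size=40)   # observed failure times, component 2

def system_reliability(c1, c2):
    # Empirical reliability of each component at t, multiplied for a series system.
    return np.mean(c1 > t) * np.mean(c2 > t)

point = system_reliability(comp1, comp2)
boot = [system_reliability(rng.choice(comp1, comp1.size, replace=True),
                           rng.choice(comp2, comp2.size, replace=True))
        for _ in range(5000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"point estimate {point:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```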


A Monte Carlo Comparison Of Nonparametric Reliability Estimators, Jia-Jinn Yueh Jan 1973

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

It is very difficult to construct a reliability model for a complex system. However, the reliability model for a series configuration is relatively simple. In the simplest case in which the components are mutually independent, the system reliability can be represented as follows:

R_s(x) = ∏_{i=1}^{n} R_i(x),

where R_i(x) is the reliability of the i-th component. It is also known that, to achieve even a moderate level of system reliability in a large system, the component reliabilities must be high.
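A quick numeric illustration of that point using the series formula above:

```python
# Even highly reliable components give a mediocre system when many are placed in series.
n_components = 100
r_component = 0.99
print(r_component ** n_components)   # ~0.366: reliability of a 100-component series system
```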

Extreme Value Theory indicates that under very general conditions, the initial form of the distribution function …


A Nonparametric Solution For Finding The Optimum Useful Life Of Equipment, Barry T. Stoll Jan 1973

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

It is often the case that equipment used by industry must be replaced with new equipment from time to time, either because frequent malfunctions make it too costly to repair or because the equipment has simply worn out. New equipment tends either to malfunction soon after installation because of manufacturing defects, or to function for an extended period of time because it is free of these defects. For this reason, equipment is often given a preliminary run, called burn-in, which produces no useful output but merely screens for manufacturing defects. Also, after a given amount of …


A Monte Carlo Evaluation Of A Nonparametric Technique For Estimating The Hazard Function, Sheng Jia Lin May 1971

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

This research is primarily concerned with estimation of the hazard function. The hazard function is the failure rate at time t, defined as h(t) = -R'(t)/R(t), where R(t) is the reliability function, and so it plays an important role in reliability.

In order to compare and evaluate the estimation methods, it is convenient to restrict attention to a single distribution. Since the Weibull distribution is widely used in reliability, it is the distribution adopted in this paper.
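A minimal sketch, assuming Weibull-generated failure times as in the report, of a standard nonparametric hazard summary: the Nelson-Aalen estimator of the cumulative hazard (via the lifelines package). The specific estimator evaluated in the report may differ.

```python
# Sketch: Weibull failure times with a nonparametric cumulative-hazard estimate.
import numpy as np
from lifelines import NelsonAalenFitter

rng = np.random.default_rng(10)
shape, scale = 1.5, 100.0
times = scale * rng.weibull(shape, size=300)     # Weibull failure times
observed = np.ones_like(times)                   # no censoring in this toy example

naf = NelsonAalenFitter()
naf.fit(times, event_observed=observed)
print(naf.cumulative_hazard_.head())

# For reference, the true Weibull hazard is h(t) = (shape/scale) * (t/scale)**(shape - 1).
```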


Nonparametric Test Of Fit, Frena Nawabi May 1970

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Most statistical methods require assumptions about the populations from which samples are taken. Usually these methods estimate parameters of the respective populations, such as the variance, standard deviation, or mean. One example is the assumption that a given population can be closely approximated by a normal curve. Since these assumptions are not always valid, statisticians have developed several alternative techniques known as nonparametric tests. The models of such tests do not specify conditions about population parameters.

Certain assumptions, such as (1) observations are independent and (2) the variable being studied has underlying continuity, are associated with most nonparametric tests. However, …
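A minimal sketch of one familiar nonparametric test of fit, the Kolmogorov-Smirnov test, which compares a sample's empirical CDF with a hypothesized distribution without relying on assumptions about population parameters; the data here are illustrative.

```python
# Sketch: a Kolmogorov-Smirnov test of fit against the standard normal distribution.
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(11)
sample = rng.standard_t(df=3, size=200)          # heavier-tailed than normal

stat, p_value = kstest(sample, "norm")           # H0: sample comes from N(0, 1)
print(f"KS statistic {stat:.3f}, p-value {p_value:.4f}")
```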