Open Access. Powered by Scholars. Published by Universities.®

Applied Statistics Commons

Articles 1 - 30 of 89

Full-Text Articles in Applied Statistics

Quantum Computing Simulation Of The Hydrogen Molecule System With Rigorous Quantum Circuit Derivations, Yili Zhang Aug 2022

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Quantum computing has been an emerging technology in the past few decades. It utilizes the power of programmable quantum devices to perform computation, which can solve, in feasible time, complex problems that are intractable for classical computers. Simulating quantum chemical systems using quantum computers is one of the most active research fields in quantum computing. However, due to the novelty of the technology and concept, most materials in the literature are not accessible to newcomers to the field and can sometimes cause ambiguity for practitioners because of missing details.

This report provides a rigorous derivation of simulating quantum chemistry …


Comparing Performance Of Gene Set Test Methods Using Biologically Relevant Simulated Data, Richard M. Lambert Dec 2018

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Today we know that there are many genetically driven diseases and health conditions. These problems often manifest only when a set of genes is either active or inactive. Recent technology allows us to measure the activity level of genes in cells, which we call gene expression. It is of great interest to society to be able to statistically compare the gene expression of a large number of genes between two or more groups. For example, we may want to compare the gene expression of a group of cancer patients with that of a group of non-cancer patients to better understand the genetic …


Examining Quadratic Relationships Between Traits And Methods In Two Multitrait-Multimethod Models, Fredric A. Hintz May 2018

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Psychological researchers are interested in the validity of the measures they use, and the multitrait-multimethod design is one of the most frequently employed methods to examine validity. Confirmatory factor analysis is now a commonly used analytic tool for examining multitrait-multimethod data, where an underlying mathematical model is fit to data and the amount of variance due to the trait and method factors is estimated. While most contemporary confirmatory factor analysis methods for examining multitrait-multimethod data do not allow relationships between the trait and method factors, a few recently proposed models allow for the examination of linear relationships between traits …


Prediction Of Stress Increase In Unbonded Tendons Using Sparse Principal Component Analysis, Eric Mckinney Aug 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

While internal and external unbonded tendons are widely utilized in concrete structures, the analytic solution for the increase in unbonded tendon stress, Δfps, is challenging due to the lack of bond between strand and concrete. Moreover, most analysis methods do not provide high correlation due to the limited available test data. In this thesis, Principal Component Analysis (PCA) and Sparse Principal Component Analysis (SPCA) are employed on different sets of candidate variables, amongst the material and sectional properties from the database compiled by Maguire et al. [18]. Predictions of Δfps are made via Principal Component Regression models, and the method …


Statistical Methods For Assessing Individual Oocyte Viability Through Gene Expression Profiles, Michael O. Bishop May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Utah State University, 2017. Major Professor: Dr. John R. Stevens. Department: Mathematics and Statistics.

Oocytes are the precursor cells to the female gamete, or egg. While reproduction may vary from species to species, within humans and most domesticated animals, the oocyte maturation process is fairly similar. As an oocyte matures, there are various processes that take place, all of which have an effect on the viability of the individual oocyte. Barring outside damage that may come to the oocyte, one of the primary reasons …


Comparison Of Survival Curves Between Cox Proportional Hazards, Random Forests, And Conditional Inference Forests In Survival Analysis, Brandon Weathers May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Survival analysis methods are a mainstay of the biomedical fields but are finding increasing use in other disciplines, including finance and engineering. A widely used tool in survival analysis is the Cox proportional hazards regression model. For this model, all the predicted survivor curves have the same basic shape, which may not be a good approximation to reality. In contrast, the Random Survival Forests method does not make the proportional hazards assumption and has the flexibility to model survivor curves that are of quite different shapes for different groups of subjects. We applied both techniques to a number of publicly available …


A Comparison Of Statistical Methods Relating Pairwise Distance To A Binary Subject-Level Covariate, Rachael Stone May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

A community ecologist provided a motivating data set involving a certain animal species with two behavior groups, along with a pairwise genetic distance matrix among individuals. Many community ecologists have analyzed similar data sets with a method known as the Hopkins method, testing for an association between the subject-level covariate (behavior group) and the pairwise distance. This community ecologist wanted to know whether results obtained with the Hopkins method would be meaningful. This question inspired this thesis work, where a different data set was used for confidentiality reasons. Multiple methods (Hopkins method, ADONIS, ANOSIM, and Distance Regression) were used …


Telephone Polls And Pps Sampling: A Potential Boon To The Polling Industry, Jade Mckay Burt May 2017

Undergraduate Honors Capstone Projects

In the wake of the 2016 election, the polling industry has no shortage of critics. While these are difficult times for the industry as a whole, there are exciting innovations happening that will serve to benefit and revitalize the industry for years. One of these innovations is Probability Proportional to Size (PPS) sampling. I will elaborate on what PPS sampling is and provide a mathematical foundation for its use in polling. I also discuss some of the myriad issues plaguing the polling industry and then show how PPS sampling can be used to remedy many of …


Computational Topics In Lie Theory And Representation Theory, Thomas J. Apedaile May 2014

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The computer algebra system Maple contains a basic set of commands for working with Lie algebras and matrices. The purpose of this thesis was to extend the functionality of these Maple packages in a number of important areas. First, programs for defining multiplication in several different types of algebras were created to allow users to perform a wider variety of calculations. Second, commands were created for calculating some basic properties of matrix representations of semisimple Lie algebras. This allows a user to identify a given matrix representation by a collection of integers which do not change when the basis of …


Implementation And Application Of The Curds And Whey Algorithm To Regression Problems, John Kidd May 2014

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A common statistical problem is trying to predict two or more variables using a set of predictor variables. The simplest model for this situation is called multivariate linear regression. This method uses the same set of predictor variables to predict each of the response variables separately. This approach seems counterintuitive because it ignores any possible relationship among the variables being predicted.

Breiman and Friedman found a way to take advantage of relationships among the response variables to increase the accuracy of the predictions for each of the predicted variables with an algorithm they called Curds and Whey. It uses other statistical …


Family-Wise Error Rate Control In Quantitative Trait Loci (Qtl) Mapping And Gene Ontology Graphs With Remarks On Family Selection, Garrett Saunders May 2014

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

One of the great aims of statistics, the science of collecting, analyzing, and interpreting data, is to protect against the probability of falsely rejecting an accepted claim, or hypothesis, given observed data stemming from some experiment. This is generally known as protecting against a Type I Error, or controlling the Type I Error rate. The extension of this protection against Type I Errors to the situation where thousands upon thousands of hypotheses are examined simultaneously is known as multiple hypothesis testing. This dissertation presents an improvement to an existing multiple hypothesis testing approach, the Focus Level method, specific to gene …


Beetles, Fungi And Trees: A Story For The Ages? Modeling And Projecting The Multipartite Symbiosis Between The Mountain Pine Beetle, Dendroctonus Ponderosae, And Its Fungal Symbionts, Grosmannia Clavigera And Ophiostoma Montium, Audrey L. Addison May 2014

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

As data collection and modeling improve, ecologists increasingly discover that interspecies dynamics greatly affect the success of individual species. Models accounting for the dynamics of multiple species are becoming more important. In this work, we explore the relationship between mountain pine beetle (MPB, Dendroctonus ponderosae Hopkins) and two mutualistic fungi, Grosmannia clavigera and Ophiostoma montium. These species are involved in a multipartite symbiosis, critical to the survival of MPB, in which each species benefits.

Extensive phenological modeling has been done to determine how temperature affects the timing of life events and cold-weather mortality of MPB. The fungi have also …


Physically Based Preconditioning Techniques Applied To The First Order Particle Transport And To Fluid Transport In Porous Media, Michael Rigley May 2014

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Solving linear systems is at the heart of many scientific applications, from the pre-algebra student solving for x and y in basic geometry problems to the computational scientist solving billions of equations with billions of variables for weather forecasting, modeling fusion reactions, or web search algorithms. In this study we look at improving the efficiency of solving large linear systems that result from two applications. The first includes linear systems that result from solving differential equations for the movement of atomic particles in particle emitting, void, and absorbing regions. The second includes solving linear systems that result from solving differential …


Collecting, Analyzing And Interpreting Bivariate Data From Leaky Buckets: A Project-Based Learning Unit, Florence Funmilayo Obielodan May 2011

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Despite the significance and the emphasis placed on mathematics as a subject and field of study, achieving the right attitude to improve students' understanding and performance is still a challenge. Previous studies have shown that the problem cuts across nations around the world, both developing countries and developed alike. Teachers and educators of the subject have responsibilities to continuously develop innovative pedagogical approaches that will enhance students' interests and performance. Teaching approaches that emphasize real life applications of the subject have become imperative. It is believed that this will stimulate learners' interest in the subject as they will be able …


Probe-Level Statistical Models For Differential Expression Of Genes In Bovine Nt Studies, Jason L. Bell Jan 2009

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

A brief introduction to microarray technology and its uses is given. This technology is commonly used in agricultural research, including research in nuclear transfer, which motivated this study. Three classes of statistical models are compared: probeset-level, weighted probeset-level and probe-level.

Different statistical models are compared on 3 spike-in experiments to assess the relative performance in identifying differentially expressed genes. A novel nested factorial model was found to outperform all other models compared in this study in one spike-in experiment, and was found to be competitive in its performance relative to the other models on the other spike-in …


Comparison Of Random Forests And Cforest: Variable Importance Measures And Prediction Accuracies, Rong Xia Jan 2009

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Random forests are ensembles of trees that give accurate predictions for regression, classification and clustering problems. The CART tree, the base learner employed by random forests, has been criticized because of bias in the selection of splitting variables. The performance of random forests is suspect due to this criticism. A new implementation of random forests, Cforest, which is claimed to outperform random forests in both predictive power and variable importance measures, was developed based on Ctree, an implementation of conditional inference trees.

We address the underlying mechanism of random forests and Cforest in this report. Comparison of random …


Probability Of Discrete Failures, Weibull Distribution, Mary Jo Hansen May 1989

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The intent of this research and thesis is to describe the development of a series of charts and tables that provide the individual and cumulative probabilities of failure for the Weibull statistical distribution. The mathematical relationships are developed and the computer programs are described for deterministic and Monte Carlo models that compute and verify the results. Charts and tables reflecting the probabilities of failure for a selected set of parameters of the Weibull distribution functions are provided.
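
As a rough illustration of the kind of table described above, the following sketch computes individual (per-period) and cumulative failure probabilities from the two-parameter Weibull distribution function F(t) = 1 - exp(-(t/scale)^shape). The parametrization, the unit-width periods, and the parameter values are assumptions chosen for illustration; they are not taken from the thesis.

```python
import math


def weibull_cdf(t, shape, scale):
    """Cumulative probability of failure by time t (two-parameter Weibull)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def discrete_failure_table(shape, scale, n_periods, width=1.0):
    """Return rows of (period, individual probability, cumulative probability).

    The individual probability for period k is the chance of failing in the
    interval ((k - 1) * width, k * width].
    """
    rows = []
    previous = 0.0
    for k in range(1, n_periods + 1):
        cumulative = weibull_cdf(k * width, shape, scale)
        rows.append((k, cumulative - previous, cumulative))
        previous = cumulative
    return rows


# Hypothetical parameters, for illustration only.
for period, individual, cumulative in discrete_failure_table(1.5, 4.0, 5):
    print(f"{period:>2}  {individual:.4f}  {cumulative:.4f}")
```

The individual probabilities sum to the last cumulative value, which gives a simple consistency check on any such table.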


Overall Life Satisfaction Of Ileostomates: Conventional Brooke Ileostomy Versus Modified Kock Pouch, Sandra Sisson Briscoe May 1988

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The purpose of this thesis is to analyze various aspects of quality of life and to determine if there is a difference in quality of life offered by a conventional ileostomy versus a continent ileostomy.

An instrument was developed to measure several factors thought to influence quality of life as well as several structural/demographic variables. This instrument was designed for persons with a conventional ileostomy and was modified for persons who had undergone conversion surgery from conventional to continent ileostomy.

Analysis of variance was performed to determine differences in quality of life for persons with a conventional, conversion, or original …


Parameter Estimation For Generalized Pareto Distribution, Der-Chen Lin May 1988

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The generalized Pareto distribution was introduced by Pickands (1975). Three methods of estimating the parameters of the generalized Pareto distribution were compared by Hosking and Wallis (1987). The methods are maximum likelihood, method of moments and probability-weighted moments.

An alternative method of estimation for the generalized Pareto distribution, based on least squares regression of expected order statistics (REOS), is developed and evaluated in this thesis. A Monte Carlo comparison is made between this method and the estimating methods considered by Hosking and Wallis (1987). This method is shown to be generally superior to the maximum likelihood, method of moments and …


Comparison Of Bootstrap With Other Tests For Several Distributions, Yu-Yu Wong May 1988

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

This paper discusses results of a computer simulation to investigate several different tests when sampling several distributions. The hypothesis H0: μ=0 was tested against H1: μ≠0, using the usual t-test, the trimmed t-test, the Jackknife, the Bootstrap and the signed-rank test. The p-values and empirical power show that the Bootstrap is as good as the t-test. The Jackknife procedure is too liberal, always obtaining small p-values. The signed-rank test is a fairly good test if the data follow the Cauchy distribution.
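
For readers unfamiliar with a bootstrap test of a mean, here is a minimal sketch of one common variant: the sample is centered so that the null hypothesis μ=0 holds, resampled with replacement, and the observed t statistic is compared with the resampled ones. The abstract does not say which bootstrap variant was used, so the function names, the number of resamples, and the centering approach are all assumptions.

```python
import math
import random
import statistics


def t_statistic(sample, mu0=0.0):
    """Usual one-sample t statistic for H0: mu = mu0."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))


def bootstrap_p_value(sample, n_resamples=2000, seed=0):
    """Two-sided bootstrap p-value for H0: mu = 0 (illustrative variant)."""
    rng = random.Random(seed)
    observed = abs(t_statistic(sample))
    mean = statistics.fmean(sample)
    centered = [x - mean for x in sample]  # impose the null hypothesis
    n = len(sample)
    extreme = 0
    for _ in range(n_resamples):
        resample = [rng.choice(centered) for _ in range(n)]
        if statistics.stdev(resample) == 0.0:
            extreme += 1  # degenerate resample: count conservatively
        elif abs(t_statistic(resample)) >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical data: 20 normal observations with true mean 0.5.
data_rng = random.Random(42)
data = [data_rng.gauss(0.5, 1.0) for _ in range(20)]
print("bootstrap p-value:", bootstrap_p_value(data))
```

Repeating this over many simulated samples, and recording how often the p-value falls below a chosen level, gives the empirical power that a study of this kind compares across tests.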


A Test For Determining An Appropriate Model For Accelerated Life Data, Yuan-Who Chen May 1987

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The purpose of this thesis was to evaluate a method for testing the appropriateness of an accelerated life model. This method is based upon a polynomial approximation. The parameters are estimated and used for testing the appropriateness of the model.

An example illustrates the polynomial method, which is applied to real data. Comparison with another method demonstrates that the polynomial method is much simpler and has comparable accuracy.


Nonparametric Confidence Intervals For The Reliability Of Real Systems Calculated From Component Data, Jean Spooner May 1987

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

A methodology that calculates a point estimate and confidence intervals for system reliability directly from component failure data is proposed and evaluated. This is a nonparametric approach that does not require the component times to failure to follow a known reliability distribution.

The proposed methods have similar accuracy to the traditional parametric approaches, can be used when the distribution of component reliability is unknown or there is a limited amount of sample component data, are simpler to compute, and use less computer resources. Depuy et al. (1982) studied several parametric approaches to calculating confidence intervals on system reliability. The test …


A Comparison Of Rank And Bootstrap Procedures For Completely Randomized Designs With Jittering, Feng-Ling Lee May 1987

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

This paper discusses results of a computer simulation to investigate the effect of jittering to simulate measurement error. In addition, the classical F ratio, the bootstrap F and the F for ranked data are compared. Empirical powers and p-values suggest the bootstrap is a good and robust procedure and the rank procedure seems to be too liberal when compared to the classical F ratio.


Computer Program Generation Of Extreme Value Distribution Data, Stephen (Wan-Tsing) Lei Jan 1986

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

The application of the Monte Carlo method to estimation for the Gumbel extreme value distribution was studied. The Gumbel extreme value distribution is used to estimate the flood flow for a specific return period in the design of flood mitigation projects. This paper is a programming effort (1) to estimate the parameters of the Gumbel distribution using the observed data and (2) to provide a random variate generating subroutine to generate random samples and order statistics of a Gumbel distributed random variable. The mean squared error is used to measure the accuracy of the estimation method. Finally, an example of the use …
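
The two programming tasks named above can be sketched as follows. Variates are generated by inverting the Gumbel distribution function F(x) = exp(-exp(-(x - location)/scale)), and the parameters are estimated by the method of moments. The abstract does not state which estimator the report uses, so the method of moments, the function names, and the parameter values here are assumptions for illustration only.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_variate(rng, location, scale):
    """Draw one Gumbel (maximum) variate by inverse transform sampling."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return location - scale * math.log(-math.log(u))


def gumbel_sample(n, location, scale, seed=0):
    """Return n variates sorted ascending, i.e. the sample order statistics."""
    rng = random.Random(seed)
    return sorted(gumbel_variate(rng, location, scale) for _ in range(n))


def moment_estimates(sample):
    """Method-of-moments estimates (assumed estimator, not the report's)."""
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    location = statistics.fmean(sample) - EULER_GAMMA * scale
    return location, scale


def return_level(location, scale, return_period):
    """Flow exceeded on average once every `return_period` years."""
    return location - scale * math.log(-math.log(1.0 - 1.0 / return_period))


# Hypothetical annual peak flows with location 100 and scale 25.
flows = gumbel_sample(50, location=100.0, scale=25.0, seed=3)
loc_hat, scale_hat = moment_estimates(flows)
print("estimates:", loc_hat, scale_hat)
print("100-year flow:", return_level(loc_hat, scale_hat, 100.0))
```

Repeating the generate-and-estimate step many times and averaging the squared estimation errors gives the mean squared error used as the accuracy measure.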


Data Analysis Using Experimental Design Model Factorial Analysis Of Variance/Covariance (Dmaovc.Bas), Wesley E. Newton May 1985

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

DMAOVC.BAS is a computer program written in the compiler version of Microsoft BASIC that performs factorial analysis of variance/covariance with expected mean squares. The program accommodates factorial and other hierarchical experimental designs with balanced sets of data. The program is written for use on most modest-sized microprocessors on which the compiler is available. The program is parameter-file driven, where the parameter file consists of the response variable structure, the experimental design model expressed in a structure similar to that seen in most textbooks, information concerning the factors (i.e., fixed or random, and the number of levels), and necessary …


Monte Carlo Simulation Of The Game Of Twenty-One, Douglas E. Loer Jan 1985

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

The purpose of this paper is to demonstrate the application of computer simulation to the game of Twenty-One to predict a player's expected return from the game. Twenty-One has traditionally been one of the most popular casino games and has attracted much effort to accurately estimate the house's true advantage. Probability theory has been tried, but the thousands of different combinations of cards possible in all hands throughout the entire pack make it practically impossible to apply probability theory without overlooking some possibilities. For this reason, Twenty-One is a perfect candidate for simulation. By blocking several simulations, normal theory can …


Unbalanced Analysis Of Variance Comparing Standard And Proposed Approximation Techniques For Estimating The Variance Components, James P. Pugsley May 1984

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

This paper considers the estimation of the components of variation for a two-factor unbalanced nested design and compares standard techniques with proposed approximation procedures. Current procedures are complicated and assume the unbalanced sample size to be fixed. This paper tests some simpler techniques, assuming sample sizes are random variables. Monte Carlo techniques were used to generate data for testing of these new procedures.


Correction Of Bias In Estimating Autocovariance Function, Len-Hong Wu May 1983

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The purpose of this thesis was to evaluate a method for reducing the bias of autocovariance estimators. Two methods are compared: the standard method and an adjustment method. The Monte Carlo method is used for the comparison.

The bias and the mean squared error of the estimated autocovariance are computed for several time series models and two variations of the adjustment method of estimation. The results indicate some improvement in bias and mean squared error for the new method.
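
A Monte Carlo comparison of this kind can be sketched as below. The abstract does not describe the adjustment method, so this sketch substitutes a familiar stand-in: the sample autocovariance with divisor n (the standard estimator) versus divisor n - lag. The AR(1) model, its coefficient, and all function names are likewise assumptions for illustration.

```python
import random
import statistics


def autocovariance(series, lag, divisor="n"):
    """Sample autocovariance at `lag`, dividing by n or by n - lag."""
    n = len(series)
    mean = statistics.fmean(series)
    total = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return total / (n if divisor == "n" else n - lag)


def simulate_ar1(n, phi, rng, burn_in=200):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal e_t."""
    x = 0.0
    out = []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x)
    return out


def monte_carlo_bias_mse(n=50, phi=0.5, lag=1, n_reps=500, seed=0):
    """Return {divisor: (bias, mse)} estimated over n_reps simulated series."""
    rng = random.Random(seed)
    true_value = phi ** lag / (1.0 - phi ** 2)  # AR(1) autocovariance
    estimates = {"n": [], "n-lag": []}
    for _ in range(n_reps):
        series = simulate_ar1(n, phi, rng)
        for divisor in estimates:
            estimates[divisor].append(autocovariance(series, lag, divisor))
    results = {}
    for divisor, values in estimates.items():
        bias = statistics.fmean(values) - true_value
        mse = statistics.fmean([(v - true_value) ** 2 for v in values])
        results[divisor] = (bias, mse)
    return results


for divisor, (bias, mse) in monte_carlo_bias_mse().items():
    print(f"divisor {divisor}: bias={bias:.4f}, mse={mse:.4f}")
```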


The Use Of Contingency Table Analysis As A Robust Technique For Analysis Of Variance, Mei-Eing Chiu May 1982

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The purpose of this paper is to compare Analysis of Variance with Contingency Table Analysis when the data being analyzed do not satisfy Analysis of Variance assumptions. The criteria for comparison are the powers of the standard variance-ratio test and the Chi-square test.

The test statistics and powers were obtained by Monte Carlo simulation:

1. The test statistic was calculated for each of 100 trials. This process was repeated 12 times, each time with a different combination of means and variances.

2. Powers were obtained for each of the 12 combinations of means and variances.

Whether Analysis of Variance or Contingency Table Analysis is a …


Least Squares Estimation Of The Pareto Type I And Ii Distribution, Ching-Hua Chien May 1982

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Estimating the parameters of the Pareto distribution can be computationally expensive, and the resulting estimates can be badly biased. In this work, an improved Least Squares derivation is used, and the estimation is less biased. Numerical examples and figures are provided so that one may observe the solution more clearly. Furthermore, by varying the method of estimation, a comparison of the estimators of the parameters is given. The improved Least Squares derivation can be employed with confidence because it is economical and efficient.
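
To make the idea of least squares estimation concrete, the following sketch fits a Pareto Type I distribution by ordinary least squares on the log-log survival plot. For Type I, ln S(x) = alpha * ln(x_min) - alpha * ln(x), so regressing the log empirical survival on ln(x) gives alpha from the slope and x_min from the intercept. This is a textbook baseline, not the improved derivation of the thesis, which the abstract does not spell out; the plotting position i/(n+1), the function names, and the parameter values are assumptions, and Type II is not covered.

```python
import math
import random


def pareto_sample(n, alpha, x_min, seed=0):
    """Draw n Pareto Type I variates by inverse transform sampling."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def least_squares_pareto(sample):
    """Estimate (alpha, x_min) by regressing log survival on log x."""
    xs = sorted(sample)
    n = len(xs)
    log_x = [math.log(x) for x in xs]
    # Empirical survival at the i-th order statistic (assumed plotting position).
    log_s = [math.log(1.0 - i / (n + 1)) for i in range(1, n + 1)]
    mean_x = sum(log_x) / n
    mean_s = sum(log_s) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_s) for u, v in zip(log_x, log_s))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_x
    alpha = -slope
    x_min = math.exp(intercept / alpha)
    return alpha, x_min


# Hypothetical parameters, for illustration only.
alpha_hat, x_min_hat = least_squares_pareto(pareto_sample(1000, alpha=2.5, x_min=1.0))
print("alpha:", alpha_hat, "x_min:", x_min_hat)
```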