Open Access. Powered by Scholars. Published by Universities.®

Other Statistics and Probability Commons

173 Full-Text Articles, 353 Authors, 91,064 Downloads, 43 Institutions

All Articles in Other Statistics and Probability

Imputation For Random Forests, Joshua Young 2017 Utah State University

All Graduate Plan B and other Reports

This project introduces two new methods for imputation of missing data in random forests. The new methods are compared against other frequently used imputation methods, including those used in the randomForest package in R. To test the effectiveness of these methods, missing data are imputed into datasets that contain two missing data mechanisms including missing at random and missing completely at random. After imputation, random forests are run on the data and accuracies for the predictions are obtained. Speed is an important aspect in computing; the speeds for all the tested methods are also compared.

One of the new methods ...


Balanced Incomplete Block And Lattice Square Designs For Testing Yield Differences Among Large Numbers Of Soybean Varieties, Martin G. Weiss, Gertrude M. Cox 2017 Iowa State College

Research Bulletin (Iowa Agriculture and Home Economics Experiment Station)

Two quasi-factorial arrangements which are especially well adapted to the testing of differences between large numbers of varieties are described and treated in detail as to their analysis and value.

The arrangements described are balanced incomplete block and lattice square designs. Soybean variety trials are used to illustrate the analysis and the relative precision on soils of varying homogeneity.

Although the efficiency factor of these designs, because of the confounding of variety differences with block effects, is lower than that of randomized complete block designs, yet on soil of normal variability the designs permit the elimination of sufficient variability due ...


Tests Of Significance In Reversal Or Switchback Trials, A. E. Brandt 2017 Iowa State College

Research Bulletin (Iowa Agriculture and Home Economics Experiment Station)

By extending "Student's" t-test to differences higher than the first, a method is provided for analyzing the results of reversal tests employing as many periods as are practical with the organism and test used. It is shown in Part II that identical results can be obtained by the methods of analysis of variance so that the investigator may at will use either the methods presented in Part I or those in Part II if but one attribute of the experimental units is measured. But, if the investigator has one or more other measures relevant to his experimental results and ...


Design Of Agronomic Experiments For Plots Differentiated In Fertility By Past Treatments, H. C. Forester 2017 Iowa State College

Research Bulletin (Iowa Agriculture and Home Economics Experiment Station)

With the increase in agricultural field trials in recent years, the question of the use of former sites of concluded experiments is becoming of ever increasing importance. It was the aim of the present study to define the possible methods by which areas which have been differentially treated in past experiments may be effectively utilized in new experiments and to determine the efficiency of the various methods.

Methods of design were tested using the yields from a field which formed a portion of a rotation and fertilizer experiment conducted at the Iowa Agricultural Experiment Station since 1915.

The possible methods ...


Methods For Estimating Usual Intake Distributions, Alicia L. Carriquiry, Helen H. Jensen, Wayne A. Fuller, P. Guenther 2017 Iowa State University

Alicia L. Carriquiry

Assessments of dietary adequacy should rely on estimating usual nutrient intake distributions. Such estimates for a population may be obtained from data collected in dietary surveys. We propose a semiparametric approach to transform the observed intake data, which are not normally distributed, into normality and to remove the dependence between individual intake means and variances.


Disproportionate Subclass Numbers In Tables Of Multiple Classification, George W. Snedecor, Gertrude M. Cox 2017 Iowa State College

Research Bulletin (Iowa Agriculture and Home Economics Experiment Station)

Under the stimulus of some of the newer methods of experimentation there is a decided tendency toward the grouping of classes of data into smaller and more homogeneous sub-classes. The weights of swine, for example, may be simultaneously classified according to the sex as well as the litter of the individual animals. Corn yields may be entered in a three-way table by applying the criteria of variety, treatment and soil type. From the resulting tables of multiple classification can be derived information not only of the main effects, such as sex and litter, but also of the interactions between them ...


On The Analysis Of The Sir Epidemic Model For Small Networks: An Application In Hospital Settings, Martin Lopez-Garcia 2017 University of Leeds

Biology and Medicine Through Mathematics Conference

No abstract provided.


Can Cone Signals In The Wild Be Predicted From The Past?, David H. Foster, Iván Marín-Franch 2017 University of Manchester, UK

MODVIS Workshop

In the natural world, the past is usually a good guide to the future. If light from the sun and sky is blue earlier in the day and yellow now, then it is likely to be more yellow later, as the sun's elevation decreases. But is the light reflected from a scene into the eye as predictable as the light incident upon the scene, especially when lighting changes are not just spectral but include changes in local shadows and mutual reflections? The aim of this work was to test the predictability of cone photoreceptor signals in the wild over ...


Gilmore Girls And Instagram: A Statistical Look At The Popularity Of The Television Show Through The Lens Of An Instagram Page, Brittany Simmons 2017 Chapman University

Student Research Day Abstracts and Posters

After going on the Warner Brothers Tour in December of 2015, I created a Gilmore Girls Instagram account. This account, which started off as a way for me to create edits of the show and post my photos from the tour, turned into something bigger than I ever could have imagined. In just over a year I have over 55,000 followers. I post content including revival news, merchandise, and edits of the show that have been featured in Entertainment Weekly, Bustle, E! News, People Magazine, Yahoo News, & GilmoreNews.

I created a dataset of qualitative and quantitative outcomes from my ...


The Value Of A Win: Analysis Of Playoff Structures, Matthew Orsi 2017 Bryant University

Honors Projects in Mathematics

The purpose of this Senior Capstone project is to analyze the distinctions between existing playoff systems. In particular, we are looking to analyze the differences between the standard single-elimination tournament (which the NCAA has used since the inception of the tournament) and other potential options: double-elimination and multiple game series. Popular sports such as Major League Baseball and the National Basketball Association all use multiple game series for their playoffs. This project will use probability theory and simulation to determine the likelihood of different seeds winning a championship as well as the expected number of victories by seed in each ...
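The comparison between a single-elimination game and a multiple-game series can be illustrated with a short probability sketch. It is not taken from the project itself: it assumes every game is independent and that a team wins each game with the same fixed probability p, and the function names are hypothetical.

```python
import math
import random


def series_win_probability(p: float, games: int) -> float:
    """Exact chance of winning a best-of-`games` series (games must be odd).

    Assumption: games are independent and each is won with probability p.
    """
    if games < 1 or games % 2 == 0:
        raise ValueError("games must be a positive odd number")
    wins_needed = (games + 1) // 2
    # The series ends on the game that delivers the last required win,
    # after k losses along the way (k = 0 .. wins_needed - 1).
    return sum(
        math.comb(wins_needed - 1 + k, k) * p**wins_needed * (1 - p) ** k
        for k in range(wins_needed)
    )


def simulate_series(p: float, games: int, trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a cross-check."""
    rng = random.Random(seed)
    wins_needed = (games + 1) // 2
    series_won = 0
    for _ in range(trials):
        wins = losses = 0
        while wins < wins_needed and losses < wins_needed:
            if rng.random() < p:
                wins += 1
            else:
                losses += 1
        series_won += wins == wins_needed
    return series_won / trials


if __name__ == "__main__":
    for games in (1, 3, 5, 7):
        exact = series_win_probability(0.6, games)
        estimate = simulate_series(0.6, games)
        print(f"best of {games}: exact={exact:.4f}, simulated={estimate:.4f}")
```

Under these assumptions, a team that wins 60% of individual games wins a best-of-seven series about 71% of the time, so longer series favor the stronger side more than a single game does.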


Statistically Analyzing Assembly Line Processing Times Through Incorporation Of Product Variation, Kyle Rehr, Matthew Farr 2017 Murray State University

Scholars Week

Timing methods and performance metrics are important in the heavily industrialized world we live in. Industrial plants use metrics to measure quality of production, help make decisions, and drive the strategy of the organization. However, many factors must be considered when measuring performance based on a metric; of these, we will be analyzing the importance of product variation. We will be analyzing assembly line timings, while controlling for product variance, to show how much the differences between products matter to one’s ability to predict performance. In addition, we will be analyzing the current “statistical” methods used by an ...


2014 Reporting Of Sexual Assault: Institutional Comparisons, M. E. Karns 2017 Cornell University

Research Studies and Reports

Institutions of higher education are required to submit annual reports of sexual assault crimes to the Department of Education under the Clery Act. The Department of Education makes this data publicly available. Two primary measures are used to assess reporting of assault on campus: the Assault Reporting Ratio (ARR) and the Reporting Rate per 10,000 students (R10K). These measures are easily calculated and can be used to assess practices and policies that impact the reporting of sexual assault on campus.

The ARR and R10K are rate comparisons, a method widely used in public health. These rate comparisons measure how ...
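As a rough illustration of why a per-10,000 rate is used for institutional comparisons, the sketch below computes R10K from a report count and an enrollment figure. The function name and the campus numbers are made up for illustration and are not from the report; the ARR is not computed because its definition is not given in this excerpt.

```python
def reporting_rate_per_10k(reports: int, enrollment: int) -> float:
    """Reports per 10,000 enrolled students (hypothetical helper)."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return 10_000 * reports / enrollment


if __name__ == "__main__":
    # Invented example campuses: (number of reports, enrolled students).
    campuses = {"Campus A": (12, 24_000), "Campus B": (9, 6_000)}
    for name, (reports, enrollment) in campuses.items():
        rate = reporting_rate_per_10k(reports, enrollment)
        print(f"{name}: {reports} reports, R10K = {rate:.1f}")
```

In this invented example the campus with fewer raw reports has the higher rate once enrollment is taken into account, which is the kind of comparison a rate measure makes possible.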


Perennial Warm-Season Grasses For Producing Biofuel And Enhancing Soil Properties: An Alternative To Corn Residue Removal, Humberto Blanco-Canqui, Robert B. Mitchell, Virginia L. Jin, Marty R. Schmer, Kent M. Eskridge 2017 University of Nebraska-Lincoln

Faculty Publications, Department of Statistics

Removal of corn (Zea mays L.) residues at high rates for biofuel and other off-farm uses may negatively impact soil and the environment in the long term. Biomass removal from perennial warm-season grasses (WSGs) grown in marginally-productive lands could be an alternative to corn residue removal as biofuel feedstocks while controlling water and wind erosion, sequestering carbon (C), cycling water and nutrients, and enhancing other soil ecosystem services. We compared wind and water erosion potential, soil compaction, soil hydraulic properties, soil organic C (SOC), and soil fertility between biomass removal from WSGs and corn residue removal from rainfed no-till continuous ...


Impact Of Menthol Smoking On Nicotine Dependence For Diverse Racial/Ethnic Groups Of Daily Smokers, Julia N. Soulakova, Ryan R. Danczak 2017 University of Central Florida

Faculty Publications, Department of Statistics

Introduction: The aims of this study were to evaluate whether menthol smoking and race/ethnicity are associated with nicotine dependence in daily smokers. Methods: The study used two subsamples of U.S. daily smokers who responded to the 2010–2011 Tobacco Use Supplement to the Current Population Survey. The larger subsample consisted of 18,849 non-Hispanic White (NHW), non-Hispanic Black (NHB), and Hispanic (HISP) smokers. The smaller subsample consisted of 1112 non-Hispanic American Indian/Alaska Native (AIAN), non-Hispanic Asian (ASIAN), non-Hispanic Hawaiian/Pacific Islander (HPI), and non-Hispanic Multiracial (MULT) smokers. Results: For larger (smaller) groups the rates were 45% (33 ...


Detecting Discordance Enrichment Among A Series Of Two-Sample Genome-Wide Expression Data Sets, Yinglei Lai, Fanni Zhang, Tapan Nayak, Reza Modarres, Norman H. Lee, Timothy A. McCaffrey 2017 George Washington University

Epidemiology and Biostatistics Faculty Publications

Background

With the current microarray and RNA-seq technologies, two-sample genome-wide expression data have been widely collected in biological and medical studies. The related differential expression analysis and gene set enrichment analysis have been frequently conducted. Integrative analysis can be conducted when multiple data sets are available. In practice, discordant molecular behaviors among a series of data sets can be of biological and clinical interest.

Methods

In this study, a statistical method is proposed for detecting discordance gene set enrichment. Our method is based on a two-level multivariate normal mixture model. It is statistically efficient with linearly increased parameter space when ...


Selection Portfolio: Applying Modern Portfolio Theory To Personnel Selection, Eric Leingang 2017 Minnesota State University, Mankato

All Theses, Dissertations, and Other Capstone Projects

Modern Portfolio Theory (MPT) is a framework for building a portfolio of risky assets such that the ratio of risk to return is minimized. While this theory has been used in the field of financial economics for over sixty years, the method has not yet been applied to compensatory personnel selection. A common method for personnel selection is multiple regression to maximize the predicted performance of the selected group given a cut-off score on the predictor(s). Recognizing that maximizing the performance of the selected group is not the only consideration, and that, for many jobs and organizations, the outcomes ...


Tutorial For Using The Center For High Performance Computing At The University Of Utah And An Example Using Random Forest, Stephen Barton 2016 Utah State University

All Graduate Plan B and other Reports

Random Forests are very memory-intensive machine learning algorithms, and most computers would fail at building models from datasets with millions of observations. Using the Center for High Performance Computing (CHPC) at the University of Utah and an airline on-time arrival dataset with 7 million observations from the U.S. Department of Transportation Bureau of Transportation Statistics, we built 316 models by adjusting the depth of the trees and randomness of each forest and compared the accuracy and time each took. Using this dataset, we discovered that substantial restrictions to the size of trees, observations allowed for each tree, and ...


A Traders Guide To The Predictive Universe- A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz 2016 Washington University in St. Louis

Doctor of Business Administration Dissertations

At heart every trader loves volatility; this is where return on investment comes from, this is what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key; however, if you could also predict future prices with reliability, the world would be your oyster. To this end, I have achieved three goals with this dissertation: to develop a model to predict future short term prices (direction and magnitude), to effectively test this by generating consistent profits utilizing a trading model developed for this purpose, and to write a paper that anyone with ...


Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted Prep Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. van der Laan 2016 Division of Biostatistics, School of Public Health, University of California, Berkeley

Laura B. Balzer

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the ...


Methods For Estimating Usual Intake Distributions, Alicia L. Carriquiry, Helen H. Jensen, Wayne A. Fuller, P. Guenther 2016 Iowa State University

Helen Jensen

Assessments of dietary adequacy should rely on estimating usual nutrient intake distributions. Such estimates for a population may be obtained from data collected in dietary surveys. We propose a semiparametric approach to transform the observed intake data, which are not normally distributed, into normality and to remove the dependence between individual intake means and variances.

