# Other Statistics and Probability Commons™

## All Articles in Other Statistics and Probability

231 full-text articles.

2017 Bryant University

#### The Value Of A Win: Analysis Of Playoff Structures, Matthew Orsi

##### Honors Projects in Mathematics

The purpose of this Senior Capstone project is to analyze the distinctions between existing playoff systems. In particular, we look to analyze the differences between the standard single-elimination tournament (which the NCAA has used since the inception of its tournament) and other potential options: double-elimination tournaments and multiple-game series. Popular leagues such as Major League Baseball and the National Basketball Association all use multiple-game series for their playoffs. This project will use probability theory and simulation to determine the likelihood of different seeds winning a championship as well as the expected number of victories by seed in each ...
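
The seed-by-seed championship probabilities the project describes can be estimated with a small Monte Carlo sketch. Everything here is an assumption for illustration: the four-team bracket, the `strength` values, and the Bradley-Terry-style win model are not taken from the project itself.

```python
import random

# Hypothetical seed strengths -- an assumption, not the project's data.
strength = {1: 4.0, 2: 3.0, 3: 2.0, 4: 1.0}

def play(a, b, rng):
    """Winner of one game; win probability is proportional to relative
    strength (a Bradley-Terry-style assumption)."""
    p_a = strength[a] / (strength[a] + strength[b])
    return a if rng.random() < p_a else b

def single_elimination(rng):
    """Standard 4-team bracket: 1 vs 4 and 2 vs 3, winners meet."""
    return play(play(1, 4, rng), play(2, 3, rng), rng)

def title_odds(n_sims=100_000, seed=0):
    """Estimated championship probability for each seed."""
    rng = random.Random(seed)
    wins = {s: 0 for s in strength}
    for _ in range(n_sims):
        wins[single_elimination(rng)] += 1
    return {s: w / n_sims for s, w in wins.items()}

odds = title_odds()
```

Under these made-up strengths, the top seed wins far more often than the bottom seed, which is the kind of seed-level summary the project tabulates.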

2017 Murray State University

#### Statistically Analyzing Assembly Line Processing Times Through Incorporation Of Product Variation, Kyle Rehr, Matthew Farr

##### Scholars Week

Timing methods and performance metrics are important in the heavily industrialized world we live in. Industrial plants use metrics to measure the quality of production, help make decisions, and drive the strategy of the organization. However, many factors must be considered when measuring performance against a metric; among these, we analyze the importance of product variation. We analyze assembly-line timings, while controlling for product variance, to show the difference that product variation makes in one’s ability to predict performance. In addition, we analyze the current “statistical” methods used by an ...

2017 Florida International University

#### Maximum Likelihood Estimation Of Parameters In Exponential Power Distribution With Upper Record Values, Tianchen Zhi

##### FIU Electronic Theses and Dissertations

The exponential power (EP) distribution is an important distribution used in survival analysis and related to the asymmetrical EP distribution. Many researchers have discussed statistical inference about the parameters of the EP distribution using i.i.d. random samples. However, the available data sometimes contain only record values, or it is more convenient for researchers to collect record values. We aim to address this problem. We estimate the two parameters of the EP distribution by maximum likelihood estimation (MLE) using upper record values. In a simulation study, we used the bias and MSE of the estimators to study the efficiency of the proposed ...

2017 Cornell University

#### 2014 Reporting Of Sexual Assault: Institutional Comparisons, M. E. Karns

##### Research Studies and Reports

Institutions of higher education are required to submit annual reports of sexual assault crimes to the Department of Education under the Clery Act. The Department of Education makes this data publicly available. Two primary measures are used to assess reporting of assault on campus: the Assault Reporting Ratio (ARR) and the Reporting Rate per 10,000 students (R10K). These measures are easily calculated and can be used to assess practices and policies that impact the reporting of sexual assault on campus.

The ARR and R10K are rate comparisons, a method widely used in public health. These rate comparisons measure how ...
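
The R10K measure reduces to a one-line rate calculation. The figures below are illustrative only, not actual Clery Act data; the exact ARR formula is not given in this excerpt, so only R10K is sketched:

```python
def reports_per_10k(reported_assaults, enrollment):
    """R10K: reported assaults per 10,000 enrolled students."""
    return reported_assaults * 10_000 / enrollment

# Illustrative numbers only -- not actual Clery Act figures.
r10k = reports_per_10k(12, 15_000)
```

With 12 reports at an institution of 15,000 students, R10K works out to 8 reports per 10,000 students, a figure that can then be compared across institutions.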

2017 CUNY Guttman Community College

#### What's Brewing? A Statistics Education Discovery Project, Marla A. Sole, Sharon L. Weinberg

##### Publications and Research

We believe that students learn best, are actively engaged, and are genuinely interested when working on real-world problems. This can be done by giving students the opportunity to work collaboratively on projects that investigate authentic, familiar problems. This article shares one such project that was used in an introductory statistics course. We describe the steps taken to investigate why customers are charged more for iced coffee than hot coffee, which included collecting data and using descriptive and inferential statistical analysis. Interspersed throughout the article, we describe strategies that can help teachers implement the project and scaffold material to assist students ...
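
The inferential step of comparing iced and hot coffee prices can be sketched as a two-sample comparison. The prices below are hypothetical, and Welch's t statistic is used here as one plausible choice; the abstract does not say which test the authors ran:

```python
from statistics import mean, stdev

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic (unequal variances assumed)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = stdev(sample_a) ** 2, stdev(sample_b) ** 2
    return (mean(sample_a) - mean(sample_b)) / (va / na + vb / nb) ** 0.5

# Hypothetical prices in dollars -- the article's real data is not shown here.
iced = [3.25, 3.50, 3.45, 3.75, 3.60]
hot  = [2.50, 2.75, 2.60, 2.95, 2.80]
t = welch_t(iced, hot)  # a large positive t suggests iced costs more
```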

2017 George Washington University

#### Detecting Discordance Enrichment Among A Series Of Two-Sample Genome-Wide Expression Data Sets

### Background

With the current microarray and RNA-seq technologies, two-sample genome-wide expression data have been widely collected in biological and medical studies. The related differential expression analysis and gene set enrichment analysis have been frequently conducted. Integrative analysis can be conducted when multiple data sets are available. In practice, discordant molecular behaviors among a series of data sets can be of biological and clinical interest.

### Methods

In this study, a statistical method is proposed for detecting discordance enrichment of gene sets. Our method is based on a two-level multivariate normal mixture model. It is statistically efficient, with a linearly increasing parameter space when ...
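
The two-level multivariate model is beyond a short sketch, but its core likelihood computation can be illustrated with a univariate two-component normal mixture. This is a toy stand-in: the components, weights, and data below are all hypothetical, not the paper's model:

```python
from math import exp, log, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def mixture_loglik(data, weights, mus, sigmas):
    """Log-likelihood of a univariate normal mixture."""
    total = 0.0
    for x in data:
        total += log(sum(w * normal_pdf(x, m, s)
                         for w, m, s in zip(weights, mus, sigmas)))
    return total

# Toy data: a 'concordant' component near 0 plus a shifted 'discordant' one.
data = [-0.1, 0.2, 0.0, 2.1, 1.8]
ll_mix = mixture_loglik(data, [0.6, 0.4], [0.0, 2.0], [1.0, 1.0])
ll_one = mixture_loglik(data, [1.0], [0.0], [1.0])
```

For data of this shape, the two-component mixture attains a higher log-likelihood than a single normal, which is the basic signal a mixture-model enrichment test exploits.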

2017 University of Nebraska-Lincoln

#### Perennial Warm-Season Grasses For Producing Biofuel And Enhancing Soil Properties: An Alternative To Corn Residue Removal, Humberto Blanco-Canqui, Robert B. Mitchell, Virginia L. Jin, Marty R. Schmer, Kent M. Eskridge

##### Faculty Publications, Department of Statistics

Removal of corn (Zea mays L.) residues at high rates for biofuel and other off-farm uses may negatively impact soil and the environment in the long term. Biomass removal from perennial warm-season grasses (WSGs) grown in marginally-productive lands could be an alternative to corn residue removal as biofuel feedstocks while controlling water and wind erosion, sequestering carbon (C), cycling water and nutrients, and enhancing other soil ecosystem services. We compared wind and water erosion potential, soil compaction, soil hydraulic properties, soil organic C (SOC), and soil fertility between biomass removal from WSGs and corn residue removal from rainfed no-till continuous ...

2017 University of Central Florida

#### Impact Of Menthol Smoking On Nicotine Dependence For Diverse Racial/Ethnic Groups Of Daily Smokers, Julia N. Soulakova, Ryan R. Danczak

##### Faculty Publications, Department of Statistics

Introduction: The aims of this study were to evaluate whether menthol smoking and race/ethnicity are associated with nicotine dependence in daily smokers. Methods: The study used two subsamples of U.S. daily smokers who responded to the 2010–2011 Tobacco Use Supplement to the Current Population Survey. The larger subsample consisted of 18,849 non-Hispanic White (NHW), non-Hispanic Black (NHB), and Hispanic (HISP) smokers. The smaller subsample consisted of 1112 non-Hispanic American Indian/Alaska Native (AIAN), non-Hispanic Asian (ASIAN), non-Hispanic Hawaiian/Pacific Islander (HPI), and non-Hispanic Multiracial (MULT) smokers. Results: For larger (smaller) groups the rates were 45% (33 ...

2017 Merck & Co. Inc.

#### Evaluating Current Practices In Shelf Life Estimation, Robert Capen, David Christopher, Patrick Forenzo, Kim Huynh-Ba, David Leblond, Oscar Liu, John O'Neill, Nate Patterson, Michelle Quinlan, Radhika Rajagopalan, James Schwenke, Walter W. Stroup

##### Faculty Publications, Department of Statistics

The current International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) methods for determining the supported shelf life of a drug product, described in ICH guidance documents Q1A and Q1E, are evaluated in this paper. To support this evaluation, an industry data set is used that comprises 26 individual stability batches of a common drug product, where most batches are measured over a 24-month storage period. Using randomly sampled sets of 3 or 6 batches from the industry data set, the current ICH methods are assessed from three perspectives. First, the distributional properties ...
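
The shelf-life idea behind ICH Q1E, regressing a stability attribute on storage time and finding where it crosses the acceptance criterion, can be sketched as follows. The stability data are hypothetical, and this sketch projects the fitted line only, whereas Q1E bases the shelf life on the one-sided 95% confidence bound of the mean, not the line itself:

```python
def ols(times, values):
    """Ordinary least-squares slope and intercept."""
    n = len(times)
    tbar = sum(times) / n
    ybar = sum(values) / n
    sxx = sum((t - tbar) ** 2 for t in times)
    sxy = sum((t - tbar) * (y - ybar) for t, y in zip(times, values))
    slope = sxy / sxx
    return slope, ybar - slope * tbar

def shelf_life(times, values, spec_limit):
    """Months until the fitted mean first reaches spec_limit.
    Sketch only: Q1E uses the 95% confidence bound, not the fitted line."""
    slope, intercept = ols(times, values)
    if slope >= 0:
        return None  # no degradation trend to extrapolate
    return (spec_limit - intercept) / slope

# Hypothetical stability data: % label claim over 24 months of storage.
months  = [0, 3, 6, 9, 12, 18, 24]
potency = [100.1, 99.6, 99.2, 98.7, 98.3, 97.4, 96.6]
life = shelf_life(months, potency, spec_limit=95.0)
```

Because the confidence bound lies below the fitted line, the Q1E shelf life would be shorter than the crossing time this sketch reports.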

#### Generalized Confidence Intervals Compatible With The Min Test For Simultaneous Comparisons Of One Subpopulation To Several Other Subpopulations, Julia N. Soulakova

##### Faculty Publications, Department of Statistics

A problem in which one subpopulation is compared to several other subpopulations in terms of means, with the goal of estimating the smallest difference between the means, commonly arises in biology, medicine, and many other scientific fields. A generalization of the Strassburger, Bretz and Hochberg (2004) approach for two comparisons is presented for cases with three or more comparisons. The method allows constructing an interval estimator for the smallest mean difference that is compatible with the Min test. An application to a fluency-disorder study is presented. Simulations confirmed adequate coverage probability for normally distributed outcomes across a number of designs.

2017 University of Nebraska-Lincoln

#### Increasing Genomic-Enabled Prediction Accuracy By Modeling Genotype X Environment Interactions In Kansas Wheat, Diego Jarquin, Cristiano Lemas Da Silva, R. Chris Gaynor, Jesse Poland, Allan Fritz, Reka Howard, Sarah Battenfield, José Crossa

##### Faculty Publications, Department of Statistics

Wheat (Triticum aestivum L.) breeding programs test experimental lines in multiple locations over multiple years to get an accurate assessment of grain yield and yield stability. Selections in early generations of the breeding pipeline are based on information from only one or few locations and thus materials are advanced with little knowledge of the genotype × environment interaction (G × E) effects. Later, large trials are conducted in several locations to assess the performance of more advanced lines across environments. Genomic selection (GS) models that include G × E covariates allow us to borrow information not only from related materials, but also from ...

2017 University of Nebraska-Lincoln

#### Application Of Response Surface Methods To Determine Conditions For Optimal Genomic Prediction, Reka Howard, Alicia L. Carriquiry, William D. Beavis

##### Faculty Publications, Department of Statistics

An epistatic genetic architecture can have a significant impact on prediction accuracies of genomic prediction (GP) methods. Machine learning methods predict traits comprised of epistatic genetic architectures more accurately than statistical methods based on additive mixed linear models. The differences between these types of GP methods suggest a diagnostic for revealing genetic architectures underlying traits of interest. In addition to genetic architecture, the performance of GP methods may be influenced by the sample size of the training population, the number of QTL, and the proportion of phenotypic variability due to genotypic variability (heritability). Possible values for these factors and the ...

2017 Minnesota State University, Mankato

#### Selection Portfolio: Applying Modern Portfolio Theory To Personnel Selection, Eric Leingang

##### All Theses, Dissertations, and Other Capstone Projects

Modern Portfolio Theory (MPT) is a framework for building a portfolio of risky assets such that risk is minimized for a given level of expected return. While this theory has been used in the field of financial economics for over sixty years, the method has not yet been applied to compensatory personnel selection. A common method for personnel selection uses multiple regression to maximize the predicted performance of the selected group given a cut-off score on the predictor(s). Recognizing that maximizing the performance of the selected group is not the only consideration, and that, for many jobs and organizations, the outcomes ...
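
The MPT machinery the thesis borrows can be illustrated with the closed-form two-asset minimum-variance portfolio. Treating two performance predictors as the "assets" follows the thesis's analogy, but the variances and covariance below are invented purely for illustration:

```python
def min_variance_weight(var_a, var_b, cov_ab):
    """Closed-form weight on asset A in the two-asset minimum-variance
    portfolio: w = (var_b - cov) / (var_a + var_b - 2*cov)."""
    return (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)

def portfolio_variance(w, var_a, var_b, cov_ab):
    """Variance of the portfolio holding weight w in A and 1 - w in B."""
    return w ** 2 * var_a + (1 - w) ** 2 * var_b + 2 * w * (1 - w) * cov_ab

# Hypothetical 'assets': two performance predictors (invented numbers).
VAR_A, VAR_B, COV = 0.04, 0.09, 0.01
w = min_variance_weight(VAR_A, VAR_B, COV)
```

The resulting blend has lower variance than holding either "asset" alone, which is the diversification effect the thesis proposes to exploit in selection decisions.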

2016 University of Massachusetts

#### Greek New Testament For Data Analysis, Keith L. Yoder

##### Keith L. Yoder

Updated: 18 December 2017
This Excel data file (compatible with Excel 2007 and later versions) is an extract of my working Greek New Testament database, which I use for statistical and data analysis. It originated in the early 2000s from UBS3 data files in beta code that I obtained from CCAT, and it has since evolved through countless changes and corrections. A flat-file table of the kind Excel 2007+ displays is the format best suited to Autofilter and VBA applications, without involving a more complex XML format. The file itself may be opened with Excel 2007 or later versions, or with ...

2016 Washington University in St. Louis

#### A Trader's Guide To The Predictive Universe: A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz

At heart, every trader loves volatility; this is where return on investment comes from, and this is what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key; however, if you could also predict future prices reliably, the world would be your oyster. To this end, I have achieved three goals with this dissertation: to develop a model to predict future short-term prices (direction and magnitude), to test it effectively by generating consistent profits using a trading model developed for this purpose, and to write a paper that anyone with ...

2016 Utah State University

#### Tutorial For Using The Center For High Performance Computing At The University Of Utah And An Example Using Random Forest, Stephen Barton

##### All Graduate Plan B and other Reports

Random forests are very memory-intensive machine learning algorithms, and most computers would fail to build models from datasets with millions of observations. Using the Center for High Performance Computing (CHPC) at the University of Utah and an airline on-time arrival dataset with 7 million observations from the U.S. Department of Transportation Bureau of Transportation Statistics, we built 316 models by adjusting the depth of the trees and the randomness of each forest, and compared the accuracy and time each took. Using this dataset, we discovered that substantial restrictions to the size of trees, the observations allowed for each tree, and ...
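
Comparing hundreds of forest configurations amounts to sweeping a grid of tuning values and timing each fit. A minimal harness for that kind of experiment might look like the following; the grid values and the `train_forest` placeholder are assumptions for illustration, not the report's actual 316 configurations:

```python
import time
from itertools import product

# Illustrative tuning grid -- not the report's actual settings.
max_depths    = [4, 8, 16, 32]
feature_fracs = [0.25, 0.5, 0.75, 1.0]
sample_fracs  = [0.1, 0.5, 1.0]
grid = list(product(max_depths, feature_fracs, sample_fracs))

def train_forest(max_depth, feature_frac, sample_frac):
    """Placeholder for the real model fit submitted to the cluster."""
    ...

def run_grid(configs, fit=train_forest):
    """Fit every configuration and record wall-clock time per fit."""
    timings = {}
    for cfg in configs:
        start = time.perf_counter()
        fit(*cfg)
        timings[cfg] = time.perf_counter() - start
    return timings

timings = run_grid(grid)  # 48 illustrative configurations
```

On a shared cluster each configuration would typically be its own batch job; the dictionary of timings then supports exactly the accuracy-versus-time comparison the report describes.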

2016 Division of Biostatistics, School of Public Health, University of California, Berkeley

#### Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted Prep Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan

##### Laura B. Balzer

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) against the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the ...
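
A drastically simplified version of this trade-off: with a fixed budget of PrEP regimens, flag only the highest-risk individuals and evaluate sensitivity at the resulting cutoff. The scores, labels, and top-k rule below are illustrative stand-ins, not the ensemble-learning method of the paper:

```python
def budgeted_threshold(scores, budget):
    """Lowest cutoff that flags at most `budget` individuals (top-k rule)."""
    if budget >= len(scores):
        return min(scores)
    return sorted(scores, reverse=True)[budget - 1]

def sensitivity(scores, labels, cutoff):
    """Fraction of true positives whose score reaches the cutoff."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= cutoff for s in positives) / len(positives)

# Hypothetical risk scores and later seroconversion labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   1,   0,   0]
cut = budgeted_threshold(scores, budget=3)   # only 3 regimens available
sens = sensitivity(scores, labels, cut)
```

Here the budget constraint caps the number flagged at three, and sensitivity is whatever that cutoff achieves; the paper's contribution is optimizing the classifier itself subject to such a constraint.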

2016 Cal State University-San Bernardino

#### Probabilistic Methods In Information Theory, Erik W. Pachas

##### Electronic Theses, Projects, and Dissertations

Given a probability space, we analyze the uncertainty, that is, the amount of information of a finite system, by studying the entropy of the system. We also extend the concept of entropy to a dynamical system by introducing a measure preserving transformation on a probability space. After showing some theorems and applications of entropy theory, we study the concept of ergodicity, which helps us to further analyze the information of the system.
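
The entropy of a finite system mentioned above is the familiar Shannon entropy; a minimal computation:

```python
from math import log2

def entropy(probs):
    """Shannon entropy, in bits, of a finite probability distribution."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -sum(p * log2(p) for p in probs if p > 0)

# A fair coin carries one bit of uncertainty; a certain event carries none.
h_fair = entropy([0.5, 0.5])
h_sure = entropy([1.0])
```

The uniform distribution maximizes entropy for a given number of outcomes, matching the intuition that uncertainty is greatest when no outcome is favored.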

2016 GeneSeek (a Neogen Company)

#### Optimal Design Of Low-Density Snp Arrays For Genomic Prediction: Algorithm And Applications, Xiao-Lin Wu, Jiaqi Xu, Guofei Feng, George R. Wiggans, Jeremy F. Taylor, Jun He, Changsong Qian, Jiansheng Qiu, Barry Simpson, Jeremy Walker, Stewart Bauck

##### Faculty Publications, Department of Statistics

Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution ...
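
Locus-averaged Shannon entropy can be sketched by averaging per-locus allele entropies over a candidate panel. This assumes a biallelic, allele-frequency-based definition; the paper's exact LASE formulation (and the haplotype-averaged HASE variant) may differ, and the minor allele frequencies below are hypothetical:

```python
from math import log2

def locus_entropy(maf):
    """Shannon entropy (bits) of a biallelic locus with minor allele
    frequency `maf`, treating the two alleles as the outcomes."""
    if maf in (0.0, 1.0):
        return 0.0
    return -(maf * log2(maf) + (1 - maf) * log2(1 - maf))

def lase(mafs):
    """Locus-averaged Shannon entropy over a candidate SNP panel."""
    return sum(locus_entropy(m) for m in mafs) / len(mafs)

# Hypothetical minor allele frequencies for a 4-SNP candidate panel.
panel = [0.5, 0.4, 0.2, 0.05]
score = lase(panel)
```

Loci near a 0.5 minor allele frequency contribute the most information per marker, which is why an entropy term in the chip-design objective favors common, evenly distributed SNPs.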