Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons



Full-Text Articles in Physical Sciences and Mathematics

Bayes Multiple Binary Classifier - How To Make Decisions Like A Bayesian, Wensong Wu Nov 2015


Mathematics Colloquium Series

This presentation will start with a general introduction to Bayesian statistics, which has become popular in the era of big data. Then we consider a two-class classification problem, where the goal is to predict the class membership of M units based on the values of high-dimensional categorical predictor variables as well as both the values of predictor variables and the class membership of other N independent units. We focus on applying generalized linear regression models with Boolean expressions of categorical predictors. We consider a Bayesian and decision-theoretic framework, and develop a general form of Bayes multiple binary classification functions with …


Mixed Methods Research Designs, Kevin P. Gosselin PhD Oct 2015


Research Day

No abstract provided.


Sinkhole Vulnerability Mapping: Results From A Pilot Study In North Central Florida, Clint Kromhout, Alan E. Baker Oct 2015


Sinkhole Conference 2015

At the end of June 2012, Tropical Storm Debby dropped a record amount of rainfall across Florida, triggering hundreds, if not thousands, of sinkholes that caused tremendous damage to property. The Florida Division of Emergency Management contracted with the Florida Department of Environmental Protection’s Florida Geological Survey to produce a map depicting the state’s vulnerability to sinkhole formation. The three-year project began with a pilot study in three northern Florida counties: Columbia, Hamilton and Suwannee. Using the Weights of Evidence statistical modeling method, the pilot study yielded a 93 percent success rate of …


Life As An NFL Statistician, Dennis Lock Sep 2015


Mathematics Colloquium Series

Over the last few years, the fields of statistics and mathematics have become more prevalent and popular in professional sports (with the help of mainstream books and movies like Moneyball). The use of advanced (and non-advanced) statistical methods is growing across the sporting landscape from the front office to the media, and even into business and ticket sales. This talk will discuss Lock’s experiences building an analytics department with the Miami Dolphins as well as the general role of statistics in sports today. It will also include the recent analytics boom in the front office framework, the coinciding need for …


A Nonlinear Filter For Markov Chains And Its Effect On Diffusion Maps, Stefan Steinerberger Sep 2015


Yale Day of Data

Diffusion maps are a modern mathematical tool that helps to find structure in large data sets. We present a new filtering technique, based on the assumption that errors in the data are intrinsically random, that isolates and filters errors and thus boosts the efficiency of diffusion maps. Applications include data sets from medicine (the Cleveland Heart Disease data set and the Wisconsin Breast Cancer data set) and engineering (the Ionosphere data set).


K-Mer Analysis On Developmental And Housekeeping Enhancer Peaks, Yunsi Yang, Anurag Sethi, Mark Gerstein Sep 2015


Yale Day of Data

The regulation of gene expression involves interaction between transcriptional enhancers and core promoters. However, the separation between developmental and housekeeping gene regulation remains unknown. Here, we present a method to detect whether different core promoters exhibit specificity to certain enhancers within massively parallel assays for enhancer detection. We use k-mers of various lengths (3-8 bp) as sequence features and compare k-mer frequencies between developmental and housekeeping enhancers. This method shows promoter specificity of enhancers in D. melanogaster.
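
The k-mer comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the pseudocount, the log2-ratio score, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
import math


def kmer_frequencies(sequences, k):
    """Return relative frequencies of all k-mers found in a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: c / total for kmer, c in counts.items()}


def log2_enrichment(freq_a, freq_b, pseudo=1e-6):
    """Log2 ratio of k-mer frequencies in set A versus set B.

    The pseudocount (an assumed value) avoids division by zero for
    k-mers seen in only one of the two sets.
    """
    kmers = set(freq_a) | set(freq_b)
    return {
        m: math.log2((freq_a.get(m, 0.0) + pseudo) / (freq_b.get(m, 0.0) + pseudo))
        for m in kmers
    }


# Hypothetical toy sequences, not real enhancer data.
developmental = ["ACGTACGT", "TTACGTAA"]
housekeeping = ["GGGCGGGC", "CGGGCGGA"]

for k in range(3, 9):  # k-mer lengths 3-8 bp, as in the abstract
    dev = kmer_frequencies(developmental, k)
    hk = kmer_frequencies(housekeeping, k)
    scores = log2_enrichment(dev, hk)
    top = max(scores, key=scores.get)
    print(k, top, round(scores[top], 2))
```

A positive score marks a k-mer that is more frequent in the first set. A real analysis would also need a significance test, which this sketch omits.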


Model Selection For Gaussian Mixture Models For Uncertainty Qualification, Yiyi Chen, Guang Lin, Xuan Liu Aug 2015


The Summer Undergraduate Research Fellowship (SURF) Symposium

Clustering is the task of assigning objects to groups so that objects in the same group are more similar to each other than to objects in other groups. The Gaussian mixture model fitted with the Expectation Maximization method is one of the most general ways to cluster a large data set. However, this method needs the number of Gaussian modes (one per cluster) as input so that it can approximate the original data set. Developing a method to automatically determine the number of single-distribution models will help to apply this method in a larger context. In the original algorithm, there is a variable representing the weight of …
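
To make the dependence on the number of components concrete, here is a minimal one-dimensional sketch of Expectation Maximization for a Gaussian mixture. It is plain textbook EM, not the authors' selection method. The function name, the deterministic initialization, the variance floor, and the toy data are assumptions for illustration. Note that `k` must be supplied by the caller.

```python
import math


def em_gmm_1d(data, k, iters=50):
    """Fit a 1-D Gaussian mixture with k components by EM.

    Returns (weights, means, variances). k is an input, which is the
    limitation the abstract describes.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    # Assumed deterministic start: means spread evenly over the data range.
    means = [lo + (hi - lo) * j / max(k - 1, 1) for j in range(k)]
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    variances = [max(var_all, 1e-6)] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component has effectively no points; leave it as is
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor keeps the density finite
    return weights, means, variances


# Hypothetical toy data with two well-separated clusters.
toy = [0.0, 0.1, -0.1, 0.2, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8]
weights, means, variances = em_gmm_1d(toy, k=2)
print(weights, means, variances)
```

The fitted `weights` are the mixing proportions of the components. The abstract's final, truncated sentence appears to refer to such a weight variable, but how the authors use it is not stated here.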


Image Segmentation Using Fuzzy-Spatial Taxon Cut, Lauren Barghout May 2015


MODVIS Workshop

Images convey multiple meanings that depend on the context in which the viewer perceptually organizes the scene. This presents a problem for automated image segmentation, because it adds uncertainty to the process of selecting which objects to include or not include within a segment. I’ll discuss the implementation of a fuzzy-logic-natural-vision-processing engine that solves this problem by assuming the scene architecture prior to processing. The scene architecture is a standardized natural-scene-perception-taxonomy comprising a hierarchy of nested spatial-taxons. Spatial-taxons are regions (pixel-sets) that are figure-like, in that they are perceived as having a contour, are either 'thing-like', or a 'group of …


Video Event Understanding With Pattern Theory, Fillipe Souza, Sudeep Sarkar, Anuj Srivastava, Jingyong Su May 2015


MODVIS Workshop

We propose a combinatorial approach built on Grenander’s pattern theory to generate semantic interpretations of video events of human activities. The basic units of representations, termed generators, are linked with each other using pairwise connections, termed bonds, that satisfy predefined relations. Different generators are specified for different levels, from (image) features at the bottom level to (human) actions at the highest, providing a rich representation of items in a scene. The resulting configurations of connected generators provide scene interpretations; the inference goal is to parse given video data and generate high-probability configurations. The probabilistic structures are imposed using energies that …


Metacognition: Using Confidence Ratings For Type 2 And Type 1 ROC Curves, S A. Klein May 2015


MODVIS Workshop

In the past five years there has been a surge of renewed interest in metacognition ("thinking about thinking"). The typical experiment involves a binary judgment followed by a multilevel confidence rating. It is a confusing topic because the rating could be made either on one's confidence in the binary response (standard rating Type 1 ROC) or on one's confidence sorted by whether the response was correct (Type 2 ROC). Both are metacognition. After a few remarks on challenging aspects of the Type 2 approach, I will present some interesting results for Type 1 ROC for both memory and vision research. …
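
The distinction between the two ROC types can be sketched in a few lines. This is a generic illustration of the standard constructions, not the speaker's analysis. The trial format `(signal_present, said_yes, confidence)`, the function names, and the toy trials are assumptions.

```python
def roc_points(scores_pos, scores_neg):
    """Sweep a criterion from strict to lax and return (false-alarm rate, hit rate) pairs."""
    criteria = sorted(set(scores_pos) | set(scores_neg), reverse=True)
    points = []
    for c in criteria:
        hit = sum(s >= c for s in scores_pos) / len(scores_pos)
        fa = sum(s >= c for s in scores_neg) / len(scores_neg)
        points.append((fa, hit))
    return points


def type1_roc(trials):
    """Type 1: the rating separates signal trials from noise trials.

    The binary response and its confidence are folded into one signed
    scale, from "sure no" (negative) to "sure yes" (positive).
    """
    def signed(said_yes, conf):
        return conf if said_yes else -conf

    pos = [signed(y, c) for s, y, c in trials if s]
    neg = [signed(y, c) for s, y, c in trials if not s]
    return roc_points(pos, neg)


def type2_roc(trials):
    """Type 2: confidence separates correct responses from incorrect ones."""
    correct = [c for s, y, c in trials if s == y]
    wrong = [c for s, y, c in trials if s != y]
    return roc_points(correct, wrong)


# Hypothetical toy trials: (signal_present, said_yes, confidence 1-3).
trials = [
    (True, True, 3), (True, True, 2), (True, False, 1), (True, True, 1),
    (False, False, 3), (False, False, 2), (False, True, 1), (False, False, 1),
]
print(type1_roc(trials))
print(type2_roc(trials))
```

Both curves use the same sweep. The only difference is which two groups of trials the rating is asked to tell apart.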


Binocular 3D Motion Perception As Bayesian Inference, Martin Lages, Suzanne Heron May 2015


MODVIS Workshop

The human visual system encodes monocular motion and binocular disparity input before it is integrated into a single 3D percept. Here we propose a geometric-statistical model of human 3D motion perception that solves the aperture problem in 3D by assuming that (i) velocity constraints arise from inverse projection of local 2D velocity constraints in a binocular viewing geometry, (ii) noise from monocular motion and binocular disparity processing is independent, and (iii) slower motions are more likely to occur than faster ones. In two experiments we found that instantiation of this Bayesian model can explain perceived 3D line motion direction under …


Model Of Cost-Effectiveness Of MRI For Women Of Average Lifetime Risk Of Breast Cancer, Mckenna L. Kimball Apr 2015


Scholarly and Creative Works Conference (2015 - 2021)

Background: Mammography is the current standard for breast cancer detection; however, magnetic resonance imaging (MRI) is a more sensitive method of breast imaging. Despite its increased sensitivity, MRI has more false positives and higher costs. The purpose of this study was to determine whether MRI, alone or in conjunction with mammography, was a cost-effective solution for breast cancer detection in women with average lifetime risk of breast cancer.

Methods: A mathematical model was used to compare annual mammography, annual MRI, and mammography and MRI on alternate years. The model included the natural history of breast cancer, screening by mammography …


Using A Generalized Linear Mixed Model Framework To Account For Spatial Variability In A Comparison Of Orchard Sprayer Efficacy, William J. Price, Bahman Shafii, Don Morishita Apr 2015


Conference on Applied Statistics in Agriculture

Uniform application of pesticides in vineyard and orchard systems can be difficult to achieve due to variability in the density and structure of the crop canopy. Depending on the equipment used and environmental conditions, applications can result in poor spray coverage, spray drift, and wasted spray, which, in turn, are manifested as a combination of poor pesticide efficacy, economic losses and potential environmental problems for the grower. A study was therefore designed and carried out to test new sprayer equipment aimed at addressing these issues. Statistically, the study presented a unique replicated three-dimensional spatial design which captured response variability …


Session B-2: The “Roll” Of Statistics In Modeling - It All Adds Up, Richard Stalmack, Janice Krouse Feb 2015


Professional Learning Day

The common core practice standards ask us to teach students to propose mathematical models and test their viability. Participants will do an experiment, collect data and use technological tools to combine modeling, analysis and basic statistics. Participants should bring a laptop, if possible; otherwise, bring a graphing calculator.


Statistical Methods In Topological Data Analysis For Complex, High-Dimensional Data, Patrick S. Medina, R W. Doerge Jan 2015


Conference on Applied Statistics in Agriculture

The utilization of statistical methods and their applications within the new field of study known as Topological Data Analysis has tremendous potential for broadening our exploration and understanding of complex, high-dimensional data spaces. This paper provides an introductory overview of the mathematical underpinnings of Topological Data Analysis, the workflow to convert samples of data to topological summary statistics, and some of the statistical methods developed for performing inference on these topological summary statistics. The intention of this non-technical overview is to motivate statisticians who are interested in learning more about the subject.


Best Linear Unbiased Prediction: An Illustration Based On, But Not Limited To, Shelf Life Estimation, Maryna Ptukhina, Walter Stroup Jan 2015


Conference on Applied Statistics in Agriculture

Shelf life estimation procedures, following ICH guidelines, use multiple batch regression with fixed batch effects. This guidance specifically mandates estimates based on at least 3 batches. Technically, the fixed-batch model limits inference to the batches actually observed, whereas ICH requires resulting estimates to apply to all future batches stored under similar conditions. This creates a conflict between the model used and the inference space the model is intended to address. Quinlan, et al. (2013) and Schwenke (2010) studied the small sample behavior of this procedure. Both studies revealed large sampling variation associated with the ICH procedure, producing a substantial proportion …


Shiga Toxin-Producing Escherichia Coli In Meat: A Preliminary Simulation Study On Detection Capabilities For Three Sampling Methods, Julie Couton, David Marx, John Luchaansky, Randall Phebus, Anna Porto-Fett, Nicholas Sevart, Manpreet Singh, Harshavardhan Thippareddi Jan 2015


Conference on Applied Statistics in Agriculture

Contamination by Shiga Toxin-producing Escherichia coli (STEC) is a continuing concern for meat production facility management throughout the United States. Several methods have been used to detect STEC during meat processing; however, determining the optimal method experimentally is rarely feasible because of the excessive cost. The objective of this preliminary simulation study is to determine which sampling method (Cozzini core sampler, core drill shaving, or N-60 surface excision) will better detect STEC at varying levels of contamination present in the meat. One thousand simulated experiments were studied using a binary model for rare occurrences to find the optimal method. We found that …


Differential Methylation Methods In Multi-Context Organisms, Douglas Baumann, Yuqing Su, Iranga Mendis, Gayla R. Olbricht Jan 2015


Conference on Applied Statistics in Agriculture

DNA methylation is an epigenetic modification that has the ability to alter gene expression without any change in the DNA sequence. DNA methylation occurs when a methyl chemical group attaches to cytosine bases on the DNA sequence. In mammals, DNA methylation primarily occurs at CG sites, when a cytosine is followed by a guanine in the DNA sequence. In plants, DNA methylation can also occur in other cytosine sequences, such as when a cytosine is not followed directly by a guanine. Many of the statistical methods that have been developed to estimate methylation levels and test differential methylation in whole-genome …


On Fixed Effects Estimation In Spline-Based Semiparametric Regression For Spatial Data, Guilherme Ludwig, Jun Zhu, Chun-Shu Chen Jan 2015


Conference on Applied Statistics in Agriculture

Spline surfaces are often used to capture spatial variability sources in linear mixed-effects models, without imposing a parametric covariance structure on the random effects. However, including a spline component in a semiparametric model may change the estimated regression coefficients, a problem analogous to spatial confounding in spatially correlated random effects. Our research aims to investigate such effects in spline-based semiparametric regression for spatial data. We discuss estimators' behavior under the traditional spatial linear regression, how the estimates change in spatial confounding-like situations, and how selecting a proper tuning parameter for the spline can help reduce bias.


Small Sample Properties Of The Two Independent Sample Test For Means From Beta Distributions, Edward E. Gbur, Kevin Thompson Jan 2015


Conference on Applied Statistics in Agriculture

Researchers often collect proportion data that cannot be interpreted as arising from a set of Bernoulli trials. Analyses based on the beta distribution may be appropriate for such data. The SAS® GLIMMIX procedure provides a tool for these analyses using a likelihood-based approach within the larger context of generalized linear mixed models (GLMM). The small sample behavior of likelihood-based tests to compare the means from two independently sampled beta distributions was studied via simulation when the null hypothesis of equal means holds. Two simulation scenarios were defined by equal and unequal sample sizes and equal scale parameters. A …


Modeling The Occurrence Of Four Cereal Crop Aphid Species In Idaho, John W. Merickel, Bahman Shafii, Sanford D. Eigenbrode, Christopher J. Williams, William J. Price Jan 2015


Conference on Applied Statistics in Agriculture

Idaho is ranked 5th in the United States in overall wheat production and makes over $500 million in profit annually from wheat. Many pests have detrimental effects on wheat; some of the most predominant are aphids. Four species of aphids have economic effects on wheat crops in Idaho: Diuraphis noxia, Metopolophium dirhodum, Rhopalosiphum padi, and Sitobion avenae. Predictive regression models could be useful for better understanding the occurrence of these aphid species. Count data for the four species were collected over 17 years via suction traps at 12 locations in wheat fields throughout …


Editor's Preface And Table Of Contents, Perla E. Reyes Jan 2015


Conference on Applied Statistics in Agriculture

These proceedings contain papers presented at the twenty-seventh annual Kansas State University Conference on Applied Statistics in Agriculture, held in Manhattan, Kansas, April 26 - April 28, 2015.