Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

University of Nebraska - Lincoln

Articles 1 - 30 of 233

Full-Text Articles in Statistics and Probability

Statistical And Machine Learning Approaches To Describe Factors Affecting Preweaning Mortality Of Piglets, Md Towfiqur Rahman, Tami M. Brown-Brandl, Gary A. Rohrer, Sudhendu R. Sharma, Vamsi Manthena, Yeyin Shi Oct 2023

Biological Systems Engineering: Papers and Publications

High preweaning mortality (PWM) rates for piglets are a significant concern for the worldwide pork industries, causing economic loss and well-being issues. This study focused on identifying the factors affecting PWM, overlays, and predicting PWM using historical production data with statistical and machine learning models. Data were collected from 1,982 litters from the United States Meat Animal Research Center, Nebraska, over the years 2016 to 2021. Sows were housed in a farrowing building with three rooms, each with 20 farrowing crates, and taken care of by well-trained animal caretakers. A generalized linear model was used to analyze the various sow, …


A Classical Fall Statistics Problem, Timothy L. Meyer Oct 2023

Cornhusker Economics

An evaluation of traditional baseball measures and suggestions for alternatives, centering on statistics related to the offensive quality of a player.


Exploring Experimental Design And Multivariate Analysis Techniques For Evaluating Community Structure Of Bacteria In Microbiome Data, Kelsey Karnik Aug 2023

Department of Statistics: Dissertations, Theses, and Student Work

The gut microbiome plays a crucial role in human health, and by working collaboratively with microbiologists, we aim to further our understanding of the human gut and its impact on human health. Promoting a diverse microbiome is emphasized throughout microbiology literature, and involving a statistician in designing experiments to relate gut bacteria and some measured health outcome is crucial for ensuring valid and accurate results. By adopting new experimental design and analysis methods, researchers can begin to gain a deeper understanding of how the genetics of our food affect the composition of taxa within the gut microbiome. This dissertation is …


Pre-Sleep Feeding, Sleep Quality, And Markers Of Recovery In Division I NCAA Female Soccer Players, Casey E. Greenwalt, Elisa Angeles, Matthew D. Vukovich, Abbie E. Smith-Ryan, Chris W. Bach, Stacy T. Sims, Tucker Zeleny, Kristen E. Holmes, David M. Presby, Katie J. Schiltz, Marine Dupuit, Liliana I. Renteria, Michael J. Ormsbee Jun 2023

Department of Statistics: Faculty Publications

Pre-sleep nutrition habits in elite female athletes have yet to be evaluated. A retrospective analysis was performed with 14 NCAA Division I female soccer players who wore a WHOOP, Inc. band – a wearable device that quantifies recovery by measuring sleep, activity, and heart rate metrics through actigraphy and photoplethysmography, respectively – 24 h a day for an entire competitive season to measure sleep and recovery. Pre-sleep food consumption data were collected via surveys every 3 days. Average pre-sleep nutritional intake (mean ± sd: kcals 330 ± 284; cho 46.2 ± 40.5 g; pro 7.6 ± 7.3 g; fat 12 …


Increasing Racial Diversity In The North American Plant Phenotyping Network Through Conference Participation Support, David LeBauer, Alexander Bucksch, Jennifer Clarke, Jesse Potts, Sonali Roy May 2023

Department of Statistics: Faculty Publications

A key goal of the North American Plant Phenotyping Network (NAPPN) annual conference is to cultivate a new generation of scientists from diverse backgrounds. As part of their effort to diversify the plant phenomics research community, NAPPN acquired funding to cover all attendance costs for participants from historically black colleges and universities (HBCU) for the 2022 annual meeting. Seven award recipients represented the first attendees from HBCUs in the conference’s 6-year history. In this commentary, we report on the impact of the conference awards, including lessons learned, and the future of the award.


Near-Term Effects Of Perennial Grasses On Soil Carbon And Nitrogen In Eastern Nebraska, Salvador Ramirez II, Marty R. Schmer, Virginia L. Jin, Robert B. Mitchell, Kent M. Eskridge May 2023

Department of Statistics: Faculty Publications

Incorporating native perennial grasses adjacent to annual row crop systems managed on marginal lands can increase system resiliency by diversifying food and energy production. This study evaluated (1) soil organic C (SOC) and total N stocks (TN) under warm-season grass (WSG) monocultures and a low diversity mixture compared to an adjacent no-till continuous-corn system, and (2) WSG total above-ground biomass (AGB) in response to two levels of N fertilization from 2012 to 2017 in eastern Nebraska, USA. The WSG treatments consisted of (1) switchgrass (SWG), (2) big bluestem (BGB), and (3) low-diversity grass mixture (LDM; big bluestem, Indiangrass, and sideoat …


The Last Drought Frontier: Building A Drought Index For The State Of Alaska, Olivia Campbell May 2023

School of Natural Resources: Dissertations, Theses, and Student Research

Drought is characterized by periods of below average precipitation. There are five major types of drought recognized in the literature: meteorological, hydrological, agricultural, socioeconomic, and ecological. A relatively new concept in the drought literature is “snow drought.” A key part of the definition of drought is that it is not always accompanied by extreme heat. This means drought can occur even in cold climates, cold seasons, and higher latitudes and altitudes, like Alaska. Drought is a natural part of climate variability, but Alaska’s climate is changing faster than any other state in the United States. Alaska is no stranger to …


Examining The Effect Of Word Embeddings And Preprocessing Methods On Fake News Detection, Jessica Hauschild May 2023

Department of Statistics: Dissertations, Theses, and Student Work

The words people choose to use hold a lot of power, whether that be in spreading truth or deception. As listeners and readers, we do our best to understand how words are being used. There are many current methods in computer science literature attempting to embed words into numerical information for statistical analyses. Some of these embedding methods, such as Bag of Words, treat words as independent, while others, such as Word2Vec, attempt to gain information about the context of words. It is of interest to compare how well these various methods of translating text into numerical data work specifically …
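As a minimal sketch of what "treating words as independent" means for a Bag of Words embedding, the Python snippet below turns each document into a vector of word counts. The function name `bag_of_words` and the whitespace tokenizer are illustrative assumptions, not the method used in the thesis.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of text documents.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Vocabulary: every distinct token, in sorted order for reproducibility.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # One count per vocabulary word; word order is discarded.
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors


vocab, vectors = bag_of_words(["the dog bit the man", "the man bit the dog"])
```

Both example sentences map to the same count vector even though their meanings differ, which is the loss of context that methods such as Word2Vec try to address.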


Integrating And Optimizing Genomic, Weather, And Secondary Trait Data For Multiclass Classification, Vamsi Manthena, Diego Jarquín, Reka Howard Mar 2023

Department of Statistics: Faculty Publications

Modern plant breeding programs collect several data types such as weather, images, and secondary or associated traits besides the main trait (e.g., grain yield). Genomic data is high-dimensional and often over-crowds smaller data types when naively combined to explain the response variable. There is a need to develop methods able to effectively combine different data types of differing sizes to improve predictions. Additionally, in the face of changing climate conditions, there is a need to develop methods able to effectively combine weather information with genotype data to predict the performance of lines better. In this work, we develop a novel …


Federated Learning Framework Integrating Refined CNN And Deep Regression Forests, Daniel Nolte, Omid Bazgir, Souparno Ghosh, Ranadip Pal Mar 2023

Department of Statistics: Faculty Publications

Predictive learning from medical data poses additional challenges due to concerns over the privacy and security of personal data. Federated learning, intentionally structured to preserve a high level of privacy, is emerging as an attractive way to generate cross-silo predictions in medical scenarios. However, the impact of severe population-level heterogeneity on federated learners is not well explored. In this article, we propose a methodology to detect the presence of population heterogeneity in federated settings and propose a solution to handle such heterogeneity by developing a federated version of Deep Regression Forests. Additionally, we demonstrate that the recently conceptualized REpresentation of Features as …


Socioeconomic Factors In The Diagnosis And Treatment Of Malignant Melanoma In Hispanic Vs. Non-Hispanic Patients: A National Cancer Database (NCDB) Study, Julia Griffin, Sarah J. Aurit, Timothy Malouff, Peter Silberstein Mar 2023

Department of Statistics: Faculty Publications

Background: The incidence of melanoma is rapidly increasing in the United States. There is a paucity of research on how melanoma affects the Hispanic population, the fastest-growing population.

Objective: To identify and understand how socioeconomic factors affect Hispanic patients' health outcomes and treatment of malignant melanoma, with comparisons to white, non-Hispanic (WNH) patients.

Methods: A retrospective study utilizing the National Cancer Database (NCDB) was completed investigating Hispanic patients (n=2282) and WNH patients (n=190,469) with Stage I-IV malignant melanoma. Outcome and socioeconomic variables were identified and compared across groups. Data was analyzed with SPSS and SAS …


Estimating The Prevalence Of Two Or More Diseases Using Outcomes From Multiplex Group Testing, Md S. Warasi, Joshua M. Tebbs, Christopher S. McMahan, Christopher R. Bilder Mar 2023

Department of Statistics: Faculty Publications

When screening a population for infectious diseases, pooling individual specimens (e.g., blood, swabs, urine, etc.) can provide enormous cost savings when compared to testing specimens individually. In the biostatistics literature, testing pools of specimens is commonly known as group testing or pooled testing. Although estimating a population-level prevalence with group testing data has received a large amount of attention, most of this work has focused on applications involving a single disease, such as human immunodeficiency virus. Modern methods of screening now involve testing pools and individuals for multiple diseases simultaneously through the use of multiplex assays. Hou et al. (2017, …
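For orientation, the single-disease case mentioned above has a well-known closed-form estimator when pools have equal size and the assay is assumed perfect: if a fraction of pools tests positive, the prevalence estimate is 1 - (1 - fraction)^(1/pool size). The Python sketch below illustrates only that simplified setting; the function name `pooled_prevalence` is hypothetical, and the multiplex, multi-disease estimators discussed in the article are not reproduced here.

```python
def pooled_prevalence(num_positive_pools, num_pools, pool_size):
    """Estimate individual-level prevalence from equal-sized pooled tests.

    Assumes a perfect assay (no false positives or false negatives) and a
    single disease; both are simplifying assumptions for illustration.
    """
    if num_pools <= 0 or pool_size <= 0:
        raise ValueError("num_pools and pool_size must be positive")
    if not 0 <= num_positive_pools <= num_pools:
        raise ValueError("num_positive_pools must lie in [0, num_pools]")
    # A pool is negative only if all of its members are negative, so
    # P(pool negative) = (1 - p) ** pool_size. Solve for p.
    fraction_negative = 1 - num_positive_pools / num_pools
    return 1 - fraction_negative ** (1 / pool_size)


# Example: 12 positive pools out of 100 pools of 5 specimens each.
estimate = pooled_prevalence(12, 100, 5)
```

With a pool size of 1 the estimator reduces to the ordinary sample proportion, which is a quick sanity check on the formula.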


Penguins Go Parallel: A Grammar Of Graphics Framework For Generalized Parallel Coordinate Plots, Susan Vanderplas, Yawei Ge, Antony Unwin, Heike Hofmann Mar 2023

Department of Statistics: Faculty Publications

Parallel Coordinate Plots (PCP) are a valuable tool for exploratory data analysis of high-dimensional numerical data. The use of PCPs is limited when working with categorical variables or a mix of categorical and continuous variables. In this article, we propose Generalized Parallel Coordinate Plots (GPCP) to extend the ability of PCPs from just numeric variables to dealing seamlessly with a mix of categorical and numeric variables in a single plot. In this process we find that existing solutions for categorical values only, such as hammock plots or parsets, become edge cases in the new framework. By focusing on individual observations …


Viscoelastic Properties Of Human Facial Skin And Comparisons With Facial Prosthetic Elastomers, Mark W. Beatty, Alvin G. Wee, D. B. Marx, Lauren Ridgway, Bobby Simetich, Thiago Carvalho De Sousa, Kevin Vakilzadian, Joel Schulte Feb 2023

Department of Statistics: Faculty Publications

Prosthesis discomfort and a lack of skin-like quality are sources of patient dissatisfaction with facial prostheses. To engineer skin-like replacements, knowledge of the differences between facial skin properties and those for prosthetic materials is essential. This project measured six viscoelastic properties (percent laxity, stiffness, elastic deformation, creep, absorbed energy, and percent elasticity) at six facial locations with a suction device in a human adult population equally stratified for age, sex, and race. The same properties were measured for eight facial prosthetic elastomers currently available for clinical usage. The results showed that the prosthetic materials were 1.8 to 6.4 times …


Early Detection Of COVID-19 In Female Athletes Using Wearable Technology, Liliana I. Rentería, Casey E. Greenwalt, Sarah Johnson, Shiloah Shiloah Kviatkovsky, Marine Dupuit, Elisa Angeles, Sachin Narayanan, Tucker Zeleny, Michael J. Ormsbee Jan 2023

Department of Statistics: Faculty Publications

Background: Heart rate variability (HRV), respiratory rate (RR), and resting heart rate (RHR) are common variables measured by wrist-worn activity trackers to monitor health, fitness, and recovery in athletes. Variations in RR are observed in lower-respiratory infections, and preliminary data suggest changes in HRV and RR are linked to early detection of COVID-19 infection in nonathletes.

Hypothesis: Wearable technology measuring HRV, RR, RHR, and recovery will be successful for early detection of COVID-19 in NCAA Division I female athletes.

Study Design: Cohort study.

Level of Evidence: Level 2.

Methods: Female athletes wore WHOOP, Inc. bands …


Utilizing Markov Chains To Estimate Allele Progression Through Generations, Ronit Gandhi Jan 2023

Honors Theses

All populations display patterns in allele frequencies over time. Some alleles cease to exist, while some grow to become the norm. These frequencies can shift or stay constant based on the conditions the population lives in. If in Hardy-Weinberg equilibrium, the allele frequencies stay constant. Most populations, however, have bias from environmental factors, sexual preferences, other organisms, etc. We propose a stochastic Markov chain model to study allele progression across generations. In such a model, the allele frequencies in the next generation depend only on the frequencies in the current one.

We use this model to track a recessive allele …
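To make the Markov property concrete, the Python sketch below builds a transition matrix in which the state is the number of copies of an allele in a fixed-size gene pool, and the next generation is drawn binomially from the current frequency. This is the textbook neutral Wright-Fisher model, used here as an assumed stand-in; the abstract does not state the thesis's actual transition rule, and the names `wright_fisher_matrix` and `step` are hypothetical.

```python
from math import comb


def wright_fisher_matrix(n_copies):
    """Transition matrix P where P[i][j] is the probability of j allele
    copies in the next generation given i copies now (neutral drift only,
    an assumption; no selection, mutation, or mating bias)."""
    matrix = []
    for i in range(n_copies + 1):
        p = i / n_copies  # current allele frequency
        row = [
            comb(n_copies, j) * p**j * (1 - p) ** (n_copies - j)
            for j in range(n_copies + 1)
        ]
        matrix.append(row)
    return matrix


def step(dist, matrix):
    """Advance a probability distribution over states by one generation."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


# Start with exactly 2 of 4 gene copies carrying the allele, then
# follow the distribution over allele counts for three generations.
P = wright_fisher_matrix(4)
dist = [0.0, 0.0, 1.0, 0.0, 0.0]
for _ in range(3):
    dist = step(dist, P)
```

In this sketch the states with zero copies and with all copies are absorbing, which matches the observation that some alleles cease to exist while others become the norm.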


Socio‑Economic Inequalities In Minimum Dietary Diversity Among Bangladeshi Children Aged 6–23 Months: A Decomposition Analysis, Satyajit Kundu, Pranta Das, Ashfikur Rahman, Hasan Al Banna, Kaniz Fatema, Akhtarul Islam, Shobhit Srivastava, T. Muhammad, Rakhi Dey, Ahmed Hossain Dec 2022

Department of Statistics: Faculty Publications

This study aimed to measure the socio-economic inequalities in having minimum dietary diversity (MDD) among Bangladeshi children aged 6–23 months as well as to determine the factors that potentially contribute to the inequity. The Bangladesh Demographic and Health Survey (BDHS) 2017–2018 data were used in this study. A sample of 2405 (weighted) children aged 6–23 months was included. The overall weighted prevalence of MDD was 37.47%. The concentration index (CIX) value for inequalities in MDD due to wealth status was positive and the concentration curve lay below the line of equality (CIX: 0.1211, p < 0.001), where 49.47% inequality was contributed by wealth status, 25.06% contributed by the education level of mother, and 20.41% contributed by the number of ante-natal care (ANC) visits. Similarly, the CIX value due to the education level of mothers was also positive and the concentration curve lay below the line of equality (CIX: 0.1341, p < 0.001), where 52.68% inequality was contributed by the education level of mother, 18.07% contributed by wealth status, and 14.69% contributed by the number of ANC visits. MDD was higher among higher socioeconomic status (SES) groups. Appropriate intervention design should prioritize minimizing socioeconomic inequities in MDD, especially targeting the contributing factors of these inequities.
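The concentration index reported above can be illustrated with the standard covariance formula, CIX = 2 · cov(h, r) / mean(h), where h is the health outcome and r is each individual's fractional rank in the socioeconomic ordering. The Python sketch below is unweighted and ignores ties in wealth, whereas the study used weighted survey data; the function name `concentration_index` and the toy data are assumptions for illustration.

```python
def concentration_index(outcome, wealth):
    """Unweighted concentration index of `outcome` over a `wealth` ranking.

    Simplifying assumptions: no survey weights, and ties in wealth are
    broken by input order rather than averaged.
    """
    n = len(outcome)
    if n == 0 or n != len(wealth):
        raise ValueError("outcome and wealth must be non-empty and equal length")
    # Fractional rank: poorest gets 0.5/n, richest gets (n - 0.5)/n.
    order = sorted(range(n), key=lambda i: wealth[i])
    ranks = [0.0] * n
    for position, i in enumerate(order):
        ranks[i] = (position + 0.5) / n
    mean_outcome = sum(outcome) / n
    mean_rank = sum(ranks) / n
    covariance = sum(
        (outcome[i] - mean_outcome) * (ranks[i] - mean_rank) for i in range(n)
    ) / n
    return 2 * covariance / mean_outcome


# Toy data: the outcome (1 = meets MDD) occurs only in the two richest children.
cix = concentration_index([0, 0, 1, 1], [1, 2, 3, 4])
```

A positive value, as in the toy data, means the outcome is concentrated among wealthier individuals, consistent with the direction reported in the abstract.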


Identification Of Disease Resistance Parents And Genome-Wide Association Mapping Of Resistance In Spring Wheat, Muhammad Iqbal, Kassa Semagn, Diego Jarquin, Harpinder Randhawa, Brent D. McCallum, Reka Howard, Reem Aboukhaddour, Izabela Ciechanowska, Klaus Strenzke, José Crossa, J. Jesus Céron-Rojas, Amidou N’Diaye, Curtis Pozniak, Dean Spaner Oct 2022

Department of Statistics: Faculty Publications

The likelihood of success in developing modern cultivars depends on multiple factors, including the identification of suitable parents to initiate new crosses, and characterizations of genomic regions associated with target traits. The objectives of the present study were to (a) determine the best economic weights of four major wheat diseases (leaf spot, common bunt, leaf rust, and stripe rust) and grain yield for multi-trait restrictive linear phenotypic selection index (RLPSI), (b) select the top 10% cultivars and lines (hereafter referred to as genotypes) with better resistance to combinations of the four diseases and acceptable grain yield as potential parents, and (c) …


Evaluating Dimensionality Reduction For Genomic Prediction, Vamsi Manthena, Diego Jarquín, Rajeev K. Varshney, Manish Roorkiwal, Girish Prasad Dixit, Chellapilla Bharadwaj, Reka Howard Oct 2022

Department of Statistics: Faculty Publications

The development of genomic selection (GS) methods has allowed plant breeding programs to select favorable lines using genomic data before performing field trials. Improvements in genotyping technology have yielded high-dimensional genomic marker data which can be difficult to incorporate into statistical models. In this paper, we investigated the utility of applying dimensionality reduction (DR) methods as a pre-processing step for GS methods. We compared five DR methods and studied the trend in the prediction accuracies of each method as a function of the number of features retained. The effect of DR methods was studied using three models that involved the …


Bayesian Analysis For The Lomax Model Using Noninformative Priors, Daojiang He, Dongchu Sun, Qing Zhu Oct 2022

Department of Statistics: Faculty Publications

The Lomax distribution is an important member in the distribution family. In this paper, we systematically develop an objective Bayesian analysis of data from a Lomax distribution. Noninformative priors, including probability matching priors, the maximal data information (MDI) prior, Jeffreys prior and reference priors, are derived. The propriety of the posterior under each prior is subsequently validated. It is revealed that the MDI prior and one of the reference priors yield improper posteriors, and the other reference prior is a second-order probability matching prior. A simulation study is conducted to assess the frequentist performance of the proposed Bayesian approach. Finally, …


Combining Phenotypic And Genomic Data To Improve Prediction Of Binary Traits, Diego Jarquin, Arkaprava Roy, Bertrand S. Clarke, Subhashis Ghosal Sep 2022

Department of Statistics: Faculty Publications

Plant breeders want to develop cultivars that outperform existing genotypes. Some characteristics (here ‘main traits’) of these cultivars are categorical and difficult to measure directly. It is important to predict the main trait of newly developed genotypes accurately. In addition to marker data, breeding programs often have information on secondary traits (or ‘phenotypes’) that are easy to measure. Our goal is to improve prediction of main traits with interpretable relations by combining the two data types using variable selection techniques. However, the genomic characteristics can overwhelm the set of secondary traits, so a standard technique may fail to select any …


Comparative Antiplatelet Effects Of Chlorthalidone And Hydrochlorothiazide, Khalid Bashir, Tammy Burns, Samuel J. Pirruccello, Sarah J. Aurit Aug 2022

Department of Statistics: Faculty Publications

Chlorthalidone (CTD) may be superior to hydrochlorothiazide (HCTZ) in the reduction of adverse cardiovascular events in hypertensive patients. The mechanism of the potential benefit of CTD could be related to antiplatelet effects. The objective of this study was to determine if CTD or HCTZ have antiplatelet effects. This study was a prospective, double-blind, randomized, three-way crossover comparison evaluating the antiplatelet effects of CTD, HCTZ, and aspirin (ASA) in healthy volunteers. The effects of these treatments on platelet activation and aggregation were assessed using a well-established method with five standard platelet agonists. Thirty-four patients completed the three-way crossover comparing pre- and …


Human Perception Of Exponentially Increasing Data Displayed On A Log Scale Evaluated Through Experimental Graphics Tasks, Emily Robinson Aug 2022

Department of Statistics: Dissertations, Theses, and Student Work

Log scales are often used to display data over several orders of magnitude within one graph. We conducted a series of three graphical studies to evaluate the impact displaying data on the log scale has on human perception of exponentially increasing trends compared to displaying data on the linear scale. Each study was related to a different graphical task, each requiring a different level of interaction and cognitive use of the data being presented. The first experiment evaluated whether our ability to perceptually notice differences in exponentially increasing trends is impacted by the choice of scale. Participants were shown a …


Spatio-Temporal Models Of Infectious Disease With High Rates Of Asymptomatic Transmission, Aminur Rahman, Angela Peace, Ramesh Kesawan, Souparno Ghosh Jul 2022

Department of Statistics: Faculty Publications

The surprisingly mercurial Covid-19 pandemic has highlighted the need not only to accelerate research on infectious diseases, but also to study them using novel techniques and perspectives. A major contributor to the difficulty of containing the current pandemic is the highly asymptomatic nature of the disease. In this investigation, we develop a modeling framework to study the spatio-temporal evolution of diseases with high rates of asymptomatic transmission, and we apply this framework to a hypothetical country with mathematically tractable geography; namely, square counties uniformly organized into a rectangle. We first derive a model for the temporal dynamics of …


Reply To Response By FBI Laboratory Filed In Illinois v. Winfield And Affidavit By Biederman Et Al. (2022) Filed In US v. Kaevon Sutton (2018 CF1 009709), Susan Vanderplas, Kori Khan, Heike Hofmann, Alicia Carriquiry Jul 2022

Department of Statistics: Faculty Publications

1 Preliminaries

1.1 Scope

The aim of this document is to respond to issues raised in Federal Bureau of Investigation1 and Alex Biedermann, Bruce Budowle & Christophe Champod.2

1.2 Conflict of Interest

We are statisticians employed at public institutions of higher education (Iowa State University and University of Nebraska, Lincoln) and have not been paid for our time or expertise when preparing either this response or the original affidavit.3 We provide this information as a public service and as scientists and researchers in this area.

1.3 Organization

The rest of the document proceeds as follows: we begin …


Genomic Prediction Accuracy Of Stripe Rust In Six Spring Wheat Populations By Modeling Genotype By Environment Interaction, Kassa Semagn, Muhammad Iqbal, Diego Jarquin, Harpinder Randhawa, Reem Aboukhaddour, Reka Howard, Izabela Ciechanowska, Momna Farzand, Raman Dhariwal, Colin W. Hiebert, Amidou N’Diaye, Curtis Pozniak, Dean Spaner Jun 2022

Department of Statistics: Faculty Publications

Some previous studies have assessed the predictive ability of genome-wide selection on stripe (yellow) rust resistance in wheat, but the effect of genotype by environment interaction (GEI) in prediction accuracies has not been well studied in diverse genetic backgrounds. Here, we compared the predictive ability of a model based on phenotypic data only (M1), the main effect of phenotype and molecular markers (M2), and a model that incorporated GEI (M3) using three cross-validations (CV1, CV2, and CV0) scenarios of interest to breeders in six spring wheat populations. Each population was evaluated at three to eight field nurseries and genotyped with …


Comparing Artificial-Intelligence Techniques With State-Of-The-Art Parametric Prediction Models For Predicting Soybean Traits, Susweta Ray, Diego Jarquin, Reka Howard May 2022

Department of Statistics: Faculty Publications

Soybean [Glycine max (L.) Merr.] is a significant source of protein and oil and is also widely used as animal feed. Thus, developing lines that are superior in terms of yield, protein, and oil content is important to feed the ever-growing population. As opposed to high-cost phenotyping, genotyping is both cost and time efficient for breeders because evaluating new lines in different environments (location–year combinations) can be costly. Several genomic prediction (GP) methods have been developed to use the marker and environment data effectively to predict the yield or other relevant phenotypic traits of crops. Our study compares a conventional …


Split Classification Model For Complex Clustered Data, Katherine Gerot Mar 2022

Honors Theses

Classification in high-dimensional data has generated tremendous interest in a multitude of fields. Data in higher dimensions often tend to reside in non-Euclidean metric space. This prevents Euclidean-based classification methodologies, such as regression, from reliably modeling the data. Many proposed models rely on computationally-complex embedding to convert the data to a more usable format. Others, namely the Support Vector Machine, rely on kernel manipulation to implicitly describe the "feature space" to arrive at a non-linear decision boundary. The proposed methodology in this paper seeks to classify complex data in a relatively computationally-simple and explainable manner.