Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Utah State University

2017

Articles 1 - 21 of 21

Full-Text Articles in Statistics and Probability

Seasonal Resource Selection And Habitat Treatment Use By A Fringe Population Of Greater Sage-Grouse, Rhett Boswell Dec 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Movement and habitat selection by Greater Sage-grouse (Centrocercus urophasianus) are of great interest to wildlife managers tasked with applying conservation measures for this iconic western species. Current technology has produced small, lightweight GPS (Global Positioning System) transmitters that can be attached to sage-grouse. Using GIS software and statistical programs such as R, land managers can analyze GPS location data to assess how sage-grouse interact geospatially with their habitats. Within the Panguitch Sage-Grouse Management Area (SGMA), thousands of acres of land have been restored or manipulated to enhance sage-grouse habitat; this usually involves removal of pinyon pine …


Novel Statistical Models For Quantitative Shape-Gene Association Selection, Xiaotian Dai Dec 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Prior research has reported that genetic mechanisms play a major role in the development of biological shapes. The primary goal of this dissertation is to develop novel statistical models to investigate the quantitative relationships between biological shapes and genetic variants. However, these problems can be extremely challenging for traditional statistical models for a number of reasons: 1) the biological phenotypes cannot be effectively represented by single-valued traits, while traditional regression only handles one dependent variable; 2) in real-life genetic data, the number of candidate genes to be investigated is extremely large, and the signal-to-noise ratio of candidate genes is expected …


Exact Approaches For Bias Detection And Avoidance With Small, Sparse, Or Correlated Categorical Data, Sarah E. Schwartz Dec 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Every day, traditional statistical methodologies are used worldwide to study a variety of topics and provide insight into countless subjects. Each technique is based on a distinct set of assumptions to ensure valid results. Additionally, many statistical approaches rely on large-sample behavior and may collapse or degenerate in the presence of small, sparse, or correlated data. This dissertation details several advancements to detect these conditions, avoid their consequences, and analyze data in a different way to yield trustworthy results.

One of the most commonly used modeling techniques for outcomes with only two possible categorical values (e.g., live/die, pass/fail, …


Extracting And Visualizing Data From Mobile And Static Eye Trackers In R And Matlab, Chunyang Li Dec 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Eye tracking is the process of measuring where people are looking with an eye tracker device. Eye tracking has been used in many scientific fields, such as education, usability research, sports, psychology, and marketing. Eye tracking data are often obtained from a static eye tracker or are manually extracted from a mobile eye tracker. Visualization usually plays an important role in the analysis of eye tracking data. Until now, no software package has contained a complete collection of eye tracking data processing and visualization tools. In this dissertation, we review the eye tracking technology, the eye tracking …


Application Of Machine Learning And Statistical Learning Methods For Prediction In A Large-Scale Vegetation Map, Carla M. Brookey Dec 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Original analyses of a large vegetation cover dataset from Roosevelt National Forest in northern Colorado were carried out by Blackard (1998) and Blackard and Dean (1998; 2000). They compared the classification accuracies of linear and quadratic discriminant analysis (LDA and QDA) with artificial neural networks (ANN) and obtained an overall classification accuracy of 70.58% for a tuned ANN compared to 58.38% for LDA and 52.76% for QDA.

Because there has been tremendous development of machine learning classification methods over the last 35 years in both computer science and statistics, as well as substantial improvements in the speed of computer hardware, …


Using Data To Improve Services For Infants With Hearing Loss: Linking Newborn Hearing Screening Records With Early Intervention Records, Maria Gonzalez, Lori Iarossi, Yan Wu, Ying Huang, Kirsten Siegenthaler Nov 2017

Journal of Early Hearing Detection and Intervention

The purpose of this study was to match records of infants with permanent hearing loss from the New York Early Hearing Detection and Intervention Information System (NYEHDI-IS) to records of infants with permanent hearing loss receiving early intervention services from the New York State Early Intervention Program (NYSEIP) to identify areas in the state where hearing screening, diagnostic evaluations and referrals to the NYSEIP were not being made or documented in a timely manner. Data from 2014-2016 NYEHDI-IS and NYEIS information systems were matched using The Link King. There were 274 infants documented in NYEIS Information System as receiving early …


A Bivariate Hypothesis Testing Approach For Mapping The Trait-Influential Gene, Garrett Saunders, Matthew D. Meng, John R. Stevens Oct 2017

Mathematics and Statistics Faculty Publications

The linkage disequilibrium (LD) based quantitative trait loci (QTL) model involves two indispensable hypothesis tests: the test of whether or not a QTL exists, and the test of the LD strength between the QTL and the observed marker. The advantage of this two-test framework is that it tests whether there is an influential QTL around the observed marker rather than a QTL arising by random chance. There are open statistical questions about the inaccurate asymptotic distributions of the test statistics. We propose a bivariate null kernel (BNK) hypothesis testing method, which characterizes the joint distribution of the two test …


Imputation For Random Forests, Joshua Young Aug 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

This project introduces two new methods for imputation of missing data in random forests. The new methods are compared against other frequently used imputation methods, including those used in the randomForest package in R. To test the effectiveness of these methods, missing values are imputed in datasets generated under two missing-data mechanisms: missing at random and missing completely at random. After imputation, random forests are run on the data and prediction accuracies are obtained. Because speed matters in practice, the run times of all tested methods are also compared.

One of the new methods …
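
The evaluation workflow described above (remove values at random, impute, then compare) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: it uses a plain column-median fill as a generic baseline, not either of the project's new methods, and the function names `make_mcar` and `median_impute` and the simulated data are hypothetical.

```python
import random
import statistics


def make_mcar(rows, frac, rng):
    """Copy `rows`, setting each cell to None with probability `frac`
    (missing completely at random: missingness ignores the data values)."""
    return [[None if rng.random() < frac else v for v in row] for row in rows]


def median_impute(rows):
    """Fill each missing cell with the median of the observed values in its column."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        medians.append(statistics.median(observed) if observed else None)
    return [[medians[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


# Hypothetical numeric data set with two columns.
rng = random.Random(0)
complete = [[rng.gauss(0, 1), rng.gauss(5, 2)] for _ in range(200)]
holey = make_mcar(complete, 0.2, rng)   # about 20% of cells removed
filled = median_impute(holey)           # ready to pass to any classifier
```

In a full comparison, the filled data would then be passed to a random forest and the resulting accuracy and run time recorded for each imputation method.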


Tree-Based Regression For Interval-Valued Data, Chih-Ching Yeh Aug 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Regression methods for interval-valued data have been increasingly studied in recent years. Most existing work focuses on linear models, yet many problems in practice are nonlinear in nature, so the development of nonlinear regression tools for interval-valued data is crucial. In this project, we propose a tree-based regression method for interval-valued data, which is applicable to both linear and nonlinear problems. Unlike linear regression models that usually require additional constraints to ensure positivity of the predicted interval length, the proposed method estimates the regression function in a nonparametric way, so the …


Prediction Of Stress Increase In Unbonded Tendons Using Sparse Principal Component Analysis, Eric Mckinney Aug 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

While internal and external unbonded tendons are widely utilized in concrete structures, the analytic solution for the increase in unbonded tendon stress, Δf_ps, is challenging due to the lack of bond between strand and concrete. Moreover, most analysis methods do not provide high correlation due to the limited available test data. In this thesis, Principal Component Analysis (PCA) and Sparse Principal Component Analysis (SPCA) are employed on different sets of candidate variables, amongst the material and sectional properties from the database compiled by Maguire et al. [18]. Predictions of Δf_ps are made via Principal Component Regression models, and the method …


A Comparison Of Five Statistical Methods For Predicting Stream Temperature Across Stream Networks, Maike F. Holthuijzen Aug 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

The health of freshwater aquatic systems, particularly stream networks, is mainly influenced by water temperature, which controls biological processes and influences species distributions and aquatic biodiversity. Thermal regimes of rivers are likely to change in the future, due to climate change and other anthropogenic impacts, and our ability to predict stream temperatures will be critical in understanding distribution shifts of aquatic biota. Spatial statistical network (SSN) models take into account spatial relationships but have drawbacks, including high computation times and data pre-processing requirements. Machine learning techniques and generalized additive models (GAM) are promising alternatives to the SSN model. Two machine learning …


Physiological Health Parameters Among College Students To Promote Chronic Disease Prevention And Health Promotion, David R. Black, Daniel C. Coster, Samantha R. Paige May 2017

Mathematics and Statistics Faculty Publications

This study aimed to provide physiologic health risk parameters by gender and age among college students enrolled in a U.S. Midwestern university to promote chronic disease prevention and improve health. A total of 2615 college students between 18 and 25 years old were recruited annually using a series of cross-sectional designs during the spring semester over an 8-year period. Physiologic parameters measured included body mass index (BMI), percentage body fat (%BF), blood serum cholesterol (BSC), and systolic (SBP) and diastolic (DBP) blood pressure. These measures were compared to data from NHANES to identify differences in physiologic parameters among 18-25 year …


Comparison Of Survival Curves Between Cox Proportional Hazards, Random Forests, And Conditional Inference Forests In Survival Analysis, Brandon Weathers May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Survival analysis methods are a mainstay of the biomedical fields but are finding increasing use in other disciplines, including finance and engineering. A widely used tool in survival analysis is the Cox proportional hazards regression model. For this model, all the predicted survivor curves have the same basic shape, which may not be a good approximation to reality. In contrast, the Random Survival Forests method does not make the proportional hazards assumption and has the flexibility to model survivor curves of quite different shapes for different groups of subjects. We applied both techniques to a number of publicly available …
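
The remark that all Cox-model survivor curves share the same basic shape follows from the proportional hazards form S(t | x) = S0(t)^exp(x'β): every subject's curve is the baseline curve raised to a constant power. The sketch below illustrates this under assumptions that are not from the report: the exponential baseline, its rate, and the two linear predictor values are all hypothetical.

```python
import math


def baseline_survival(t, rate=0.1):
    """Hypothetical baseline survivor function S0(t): exponential with `rate`."""
    return math.exp(-rate * t)


def ph_survival(t, linear_predictor):
    """Survivor curve under proportional hazards: S(t | x) = S0(t) ** exp(x'beta)."""
    return baseline_survival(t) ** math.exp(linear_predictor)


times = [0, 5, 10, 20, 40]
low_risk = [ph_survival(t, -0.5) for t in times]   # hypothetical x'beta = -0.5
high_risk = [ph_survival(t, 0.8) for t in times]   # hypothetical x'beta = 0.8

# On the log scale the two curves differ by one constant factor at every time,
# so they can never cross; a forest-based estimator has no such restriction.
log_ratios = [math.log(h) / math.log(l)
              for h, l in zip(high_risk[1:], low_risk[1:])]
```

Each entry of `log_ratios` equals exp(0.8 - (-0.5)) up to rounding, which is the constant hazard ratio between the two hypothetical subjects.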


Statistical Methods For Assessing Individual Oocyte Viability Through Gene Expression Profiles, Michael O. Bishop May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Utah State University, 2017. Major Professor: Dr. John R. Stevens. Department: Mathematics and Statistics.

Oocytes are the precursor cells to the female gamete, or egg. While reproduction may vary from species to species, within humans and most domesticated animals, the oocyte maturation process is fairly similar. As an oocyte matures, there are various processes that take place, all of which have an effect on the viability of the individual oocyte. Barring outside damage that may come to the oocyte, one of the primary reasons …


A Comparison Of Statistical Methods Relating Pairwise Distance To A Binary Subject-Level Covariate, Rachael Stone May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

A community ecologist provided a motivating data set involving a certain animal species with two behavior groups, along with a pairwise genetic distance matrix among individuals. Many community ecologists have analyzed similar data sets with a method known as the Hopkins method, testing for an association between the subject-level covariate (behavior group) and the pairwise distance. This community ecologist wanted to know whether results obtained with the Hopkins method would be meaningful. Their question inspired this thesis work, where a different data set was used for confidentiality reasons. Multiple methods (Hopkins method, ADONIS, ANOSIM, and Distance Regression) were used …
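
To make the testing problem concrete, the sketch below runs a generic label-permutation test on a pairwise distance matrix with a two-group covariate. It is not the Hopkins method, ADONIS, ANOSIM, or Distance Regression as implemented in the thesis; the statistic (mean between-group distance minus mean within-group distance), the function names, and the toy data are assumptions chosen for illustration.

```python
import random


def between_minus_within(dist, groups):
    """Mean between-group distance minus mean within-group distance."""
    between, within = [], []
    n = len(groups)
    for i in range(n):
        for j in range(i + 1, n):
            (within if groups[i] == groups[j] else between).append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_p_value(dist, groups, n_perm, rng):
    """Shuffle group labels to approximate the null distribution of the statistic."""
    observed = between_minus_within(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if between_minus_within(dist, labels) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: eight individuals placed on a line, two clear clusters.
positions = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
dist = [[abs(a - b) for b in positions] for a in positions]
observed, p_value = permutation_p_value(dist, groups, 2000, random.Random(42))
```

Permuting labels rather than individual distances matters here because the entries of a distance matrix are not independent: each individual contributes to many pairs.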


A Tournament Approach To Price Discovery In The US Cattle Market, Jeffrey Wright May 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Cattle price discovery is the process of determining the market price through the interactions of cattle buyers (packers) and sellers (ranchers). Locating the price discovery center or market, and estimating price interactions among the regional fed cattle markets and among feeder cattle markets, can help define a relevant fed cattle procurement market. This research finds that U.S. cattle prices are discovered in the futures markets, specifically feeder cattle futures and fed cattle futures.


Combinatorial Games On Graphs, Trevor K. Williams May 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Combinatorial Games are intriguing and have a tendency to engross students and lead them into a serious study of mathematics. The engaging nature of games is the basis for this thesis. Two combinatorial games and some educational tools are presented which were developed by the author in the pursuit of the solution of these games.


Telephone Polls And PPS Sampling: A Potential Boon To The Polling Industry, Jade Mckay Burt May 2017

Undergraduate Honors Capstone Projects

In the wake of the 2016 election, the polling industry has no shortage of critics. While these are difficult times for the industry as a whole, exciting innovations are underway that will benefit and revitalize the industry for years. One of these innovations is Probability Proportional to Size (PPS) sampling. I will elaborate on what PPS sampling is and provide a mathematical foundation for its use in polling. I also discuss some of the myriad issues plaguing the polling industry and then show how PPS sampling can be used to remedy many of …
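
As a rough illustration of the idea behind PPS sampling, the sketch below draws units with replacement with probability proportional to a size measure and then applies the Hansen-Hurwitz estimator of a population total. It is a minimal sketch, not the design used in the capstone project; the district sizes, the outcome values, and the function names are hypothetical.

```python
import random


def pps_sample(sizes, n, rng):
    """Draw n unit indices with replacement; each draw picks unit i
    with probability sizes[i] / sum(sizes)."""
    return rng.choices(range(len(sizes)), weights=sizes, k=n)


def hansen_hurwitz_total(sample, sizes, values):
    """Estimate the population total of `values` from a with-replacement
    PPS sample: the average of values[i] / p_i over the draws."""
    total_size = sum(sizes)
    return sum(values[i] / (sizes[i] / total_size) for i in sample) / len(sample)


rng = random.Random(1)
sizes = [1200, 300, 4500, 800, 2200]   # hypothetical: households per district
values = [0.5 * s for s in sizes]      # hypothetical outcome, exactly proportional to size
sample = pps_sample(sizes, 3, rng)
estimate = hansen_hurwitz_total(sample, sizes, values)
```

When the outcome is exactly proportional to the size measure, as in this toy setup, every draw yields the same ratio and the estimator recovers the true total regardless of which units are drawn. Real outcomes are only roughly proportional to size, and the estimator's variance shrinks the closer that relationship is.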


Regime Switching In Cointegrated Time Series, Bradley David Zynda II Apr 2017

Undergraduate Honors Capstone Projects

Volatile commodities and markets can often be difficult to model and forecast given significant breaks in trends through time. To account for such breaks, regime switching methods allow models to accommodate abrupt changes in the behavior of the data. However, difficulty often arises at the outset, in choosing a model and its associated parameters with which to represent the data and the objects of interest. To improve model selection for these volatile markets, this research examines time series with regime switching components and argues that a synthesis of vector error correction models with regime switching models will ameliorate financial …


Mass Action In Two-Sex Population Models: Encounters, Mating Encounters And The Associated Numerical Correction, Katherine Snyder, Brynja R. Kohler, Luis F. Gordillo Mar 2017

Mathematics and Statistics Faculty Publications

Ideal gas models are a paradigm used in biology for the phenomenological modelling of encounters between individuals of different types. These models have been used to approximate encounter rates given densities, velocities and the distance within which an encounter certainly occurs. When using mass action in two-sex populations, however, it is necessary to recognize the difference between encounters and mating encounters. While the former refers in general to the (possibly simultaneous) collisions between particles, the latter represents pair formation that will produce offspring. The classical formulation of the law of mass action does not account for this difference. In this short paper, …


Case Study For Guided Project In Stochastic Hydrology, Meghna Babbar-Sebens Jan 2017

All ECSTATIC Materials

Attached are two guided project activities for hydrology and climate data of Eagle Creek Watershed, Indiana, USA. The zip files have flow and precipitation datasets at daily, monthly, and annual time scales.