Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 10 of 10

Full-Text Articles in Physical Sciences and Mathematics

Seasonal Resource Selection And Habitat Treatment Use By A Fringe Population Of Greater Sage-Grouse, Rhett Boswell Dec 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Movement and habitat selection by Greater Sage-grouse (Centrocercus urophasianus) are of great interest to wildlife managers tasked with applying conservation measures for this iconic western species. Current technology has produced small, lightweight GPS (Global Positioning System) transmitters that can be attached to sage-grouse. Using GIS software and statistical programs such as Program R, land managers can analyze GPS location data to assess how sage-grouse interact spatially with their habitats. Within the Panguitch Sage-Grouse Management Area (SGMA), thousands of acres of land have been restored or manipulated to enhance sage-grouse habitat; this usually involves removal of pinyon pine …


Exact Approaches For Bias Detection And Avoidance With Small, Sparse, Or Correlated Categorical Data, Sarah E. Schwartz Dec 2017

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Every day, traditional statistical methodology is used worldwide to study a variety of topics and provide insight into countless subjects. Each technique is based on a distinct set of assumptions to ensure valid results. Additionally, many statistical approaches rely on large-sample behavior and may collapse or degenerate in the presence of small, sparse, or correlated data. This dissertation details several advancements to detect these conditions, avoid their consequences, and analyze data in a different way to yield trustworthy results.

One of the most commonly used modeling techniques for outcomes with only two possible categorical values (e.g., live/die, pass/fail, …


Supervised Classification Using Finite Mixture Copula, Sumen Sen, Norou Diawara Aug 2017

Mathematics & Statistics Faculty Publications

The use of copulas for statistical classification is recent and gaining popularity. For example, statistical classification using copulas has been proposed for automatic character recognition, medical diagnostics, and, most recently, data mining. Classical discrimination rules assume normality, but in the current data age this assumption is often questionable. In fact, the features of the data could be a mixture of discrete and continuous random variables. In this paper, mixture copula densities are used to model class-conditional distributions. Such densities are useful when the marginal densities of the vector of features are not normally distributed and are of a mixed …


Application Of Support Vector Machine Modeling And Graph Theory Metrics For Disease Classification, Jessica M. Rudd Jul 2017

Published and Grey Literature from PhD Candidates

Disease classification is a crucial element of biomedical research. Recent studies have demonstrated that machine learning techniques, such as Support Vector Machine (SVM) modeling, produce similar or improved predictive capabilities in comparison to the traditional method of Logistic Regression. In addition, it has been found that social network metrics can provide useful predictive information for disease modeling. In this study, we combine simulated social network metrics with SVM to predict diabetes in a sample of data from the Behavioral Risk Factor Surveillance System. In this dataset, Logistic Regression outperformed SVM, with ROC indices of 81.8 and 81.7 for models with …
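For readers unfamiliar with the metric, the sketch below shows one common way to compute the area under the ROC curve. It assumes that the "ROC index" in the abstract refers to that area (reported on a 0-100 scale); the abstract does not say so explicitly. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    The result is the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = has the condition, 0 = does not.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

The pairwise form is quadratic in the sample size, so it suits small illustrations; a rank-based computation is the usual choice for large survey data.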


The Association Of Calcium Intake And Other Risk Factors With Cardiovascular Disease Among Obese Adults In USA, Yang Chen, Sheryl Strasser, Katie Callahan, David Blackley, Yan Cao, Liang Wang, Shimin Zheng May 2017

Shimin Zheng

In this study, we used a cross-sectional study design to examine the relationship between calcium intake and risk factors for CVD among obese adults, using continuous waves of National Health and Nutrition Examination Survey (NHANES) data from 1999-2010. The association between calcium intake and risk factors for CVD (hypertension, total cholesterol, HDL, glycohemoglobin, CRP, albuminuria) is assessed among obese adults in the USA. The incidence of Cardiovascular Disease (CVD) is high among obese people. The potential effects of inadequate calcium intake on CVD are receiving increased epidemiologic attention. Understanding the association between risk factors for CVD and calcium intake among …


Prevalence Of And Risk Factors For Adolescent Obesity In Tennessee Using The 2010 Youth Risk Behavior Survey (YRBS) Data: An Analysis Using Weighted Hierarchical Logistic Regression, Shimin Zheng, Nicole Holt, Jodi L. Southerland, Yan Cao, Trevor Taylor, Deborah L. Slawson, Mark Bloodworth May 2017

Shimin Zheng

Background: The rate of adolescent overweight and obesity has more than quadrupled over the past few decades and has become a major public health problem [1]. In 2011, 55% of 12- to 19-year-olds in the United States (U.S.) were overweight or obese [2]. Adolescence is a pivotal time in which many health risk behaviors such as tobacco, alcohol, and drug use are initiated. Such health risk behaviors have been significantly associated with overweight and obesity among adolescents. Objective: The purpose of this study is to evaluate the relationship between obesity and the health risk behaviors most commonly associated with premature …


Binary Classification On Past Due Of Service Accounts Using Logistic Regression And Decision Tree, Yan Wang, Jennifer L. Priestley Jan 2017

Published and Grey Literature from PhD Candidates

This paper aims to predict past-due status in businesses’ service accounts and to determine the variables that affect the likelihood of repayment. Two binary classification approaches, logistic regression and the decision tree, were applied and compared. Both approaches perform very well with respect to accuracy. However, the decision tree uses only 10 predictors and reaches an accuracy of 96.69% on the validation set, while logistic regression includes 14 predictors and reaches an accuracy of 94.58%. Because false negatives are a major concern in the financial industry, the decision tree technique is a better option than logistic regression …
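The abstract weighs accuracy against false negatives. As a minimal sketch of why the two can diverge, the code below computes both from a confusion matrix. All function names and the toy predictions are hypothetical; they are not the paper's data or models, and which class the paper treats as "positive" is assumed here to be the past-due account.

```python
from typing import Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def false_negative_rate(tp: int, fn: int) -> float:
    """Share of actual positives that the model missed."""
    return fn / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: 1 = past due (assumed positive class), 0 = current.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # misses two past-due accounts
model_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # flags two current accounts

for name, predicted in (("A", model_a), ("B", model_b)):
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    print(name, accuracy(tp, fp, tn, fn), false_negative_rate(tp, fn))
```

In this toy case both models have the same accuracy, but only model A misses past-due accounts, which is the kind of difference a false-negative criterion is meant to surface.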


A Comparison Of Decision Tree With Logistic Regression Model For Prediction Of Worst Non-Financial Payment Status In Commercial Credit, Jessica M. Rudd MPH, GStat, Jennifer L. Priestley Jan 2017

Published and Grey Literature from PhD Candidates

Credit risk prediction is an important problem in the financial services domain. While machine learning techniques such as Support Vector Machines and Neural Networks have been used for improved predictive modeling, the outcomes of such models are not readily explainable and are therefore difficult to apply within financial regulations. In contrast, Decision Trees are easy to explain and provide an easy-to-interpret visualization of model decisions. The aim of this paper is to predict worst non-financial payment status among businesses and to evaluate decision tree model performance against a traditional Logistic Regression model for this task. The dataset for analysis is provided …


Logistic Ensemble Models, Bob Vanderheyden, Jennifer L. Priestley Jan 2017

Published and Grey Literature from PhD Candidates

Predictive models that are developed in a regulated industry or a regulated application, such as the determination of creditworthiness, must be interpretable and “rational” (e.g., improvements in basic credit behavior must result in improved creditworthiness scores). Machine Learning technologies provide very good performance with minimal analyst intervention, so they are well suited to a high-volume analytic environment, but the majority are “black box” tools that provide very limited insight into key drivers of model performance or predicted model output values. This paper presents a methodology that blends one of the most popular predictive statistical modeling methods with …


Inference Using Bhattacharyya Distance To Model Interaction Effects When The Number Of Predictors Far Exceeds The Sample Size, Sarah A. Janse Jan 2017

Theses and Dissertations--Statistics

In recent years, statistical analyses, algorithms, and modeling of big data have been constrained by computational complexity. Further, the added complexity of relationships among response and explanatory variables, such as higher-order interaction effects, makes identifying predictors using standard statistical techniques difficult. These difficulties are only exacerbated by the small sample sizes in some studies. Recent analyses have targeted the identification of interaction effects in big data, but the development of methods to identify higher-order interaction effects has been limited by computational concerns. One recently studied method is the Feasible Solutions Algorithm (FSA), a fast, flexible method that …