Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 11 of 11
Full-Text Articles in Physical Sciences and Mathematics
Rfviz: An Interactive Visualization Package For Random Forests In R, Christopher Beckett
All Graduate Plan B and other Reports, Spring 1920 to Spring 2023
Random forests are very popular tools for predictive analysis and data science. They work for both classification (where there is a categorical response variable) and regression (where the response is continuous). Random forests provide proximities, and both local and global measures of variable importance. However, these quantities require special tools to be effectively used to interpret the forest. Rfviz is a sophisticated interactive visualization package and toolkit in R, specially designed for interpreting the results of a random forest in a user-friendly way. Rfviz uses a recently developed R package (loon) from the Comprehensive R Archive Network (CRAN) to create …
Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John
SMU Data Science Review
In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all subsets regression was used to determine the best predictors of financial burden. Age, …
Deep Machine Learning For Mechanical Performance And Failure Prediction, Elijah Reber, Nickolas D. Winovich, Guang Lin
The Summer Undergraduate Research Fellowship (SURF) Symposium
Deep learning has provided opportunities for advancement in many fields. One such opportunity is being able to accurately predict real-world events. Ensuring proper motor function and being able to predict energy output are valuable assets for owners of wind turbines. In this paper, we look at how effective a deep neural network is at predicting the failure or energy output of a wind turbine. A data set was obtained that contained sensor data from 17 wind turbines over 13 months, measuring numerous variables, such as spindle speed and blade position and whether or not the wind turbine experienced …
Association Tests For Genetic Effect And Its Interaction With Environmental Factors, Zhengyang Zhou
Statistical Science Theses and Dissertations
My research is in the area of statistical genetics, and it contains three projects: (1) Differentiating the Cochran-Armitage (CA) trend test and Pearson’s chi-square test: location and dispersion; (2) Decomposing Pearson’s chi-square test: a linear regression and its departure from linearity; (3) Testing nonlinear gene-environment (GxE) interaction through varying coefficient and linear mixed models.
(1) In genetic case-control association studies, a standard practice is to perform the CA trend test with 1 degree-of-freedom (df) under the assumption of an additive model. However, when the true genetic model is recessive or near recessive, it is outperformed by Pearson’s chi-square test with …
Analysis Of 2016-17 Major League Soccer Season Data Using Poisson Regression With R, Ian D. Campbell
Undergraduate Theses and Capstone Projects
To the outside observer, soccer is chaotic with no given pattern or scheme to follow, a random conglomeration of passes and shots that go on for 90 minutes. Yet what if there were a pattern to the chaos, or a way to describe the events that occur in the game quantifiably? Sports statistics is a critical part of baseball and a variety of today’s other sports, but we see very little statistics and data analysis done on soccer. The research that does exist has looked into the effect of possession time on the outcome of a game, the difference …
Developing Methods Of Processing And Analyzing Genetic Data To Examine Tiger Salamander Population Structure, Dennis Dongmin Kim
Undergraduate Research Symposium 2018
Professor Heather Waye and her colleagues conducted a pilot study in 2014 to measure genetic diversity and dispersal pattern in a population of tiger salamanders in west-central Minnesota. The ultimate goal of this research was to analyze the genetic differences between tiger salamander larvae captured in breeding ponds within Pepperton Waterfowl Production Area to understand the population structure and movement patterns. They expected that ponds closer to each other would have more similar genetic information, and that genetic differences between ponds would increase with geographic distance. However, the initial analysis using standard techniques failed to uncover useful patterns in the …
Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia
Statistical Science Theses and Dissertations
This research contains two topics: (1) PBNPA: a permutation-based non-parametric analysis of CRISPR screen data; (2) RCRnorm: an integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data from FFPE samples.
Clustered regularly-interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single specific algorithm has gained popularity. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms. We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level …
Sabermetrics - Statistical Modeling Of Run Creation And Prevention In Baseball, Parker Chernoff
FIU Electronic Theses and Dissertations
The focus of this thesis was to investigate which baseball metrics are most conducive to run creation and prevention. Stepwise regression and Liu estimation were used to formulate two models for the dependent variables and also used for cross validation. Finally, the predicted values were fed into the Pythagorean Expectation formula to predict a team’s most important goal: winning.
Each model fit strongly and collinearity amongst offensive predictors was considered using variance inflation factors. Hits, walks, and home runs allowed, infield putouts, errors, defense-independent earned run average ratio, defensive efficiency ratio, saves, runners left on base, shutouts, and walks per …
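The Pythagorean Expectation named in this abstract is commonly written as runs scored squared divided by the sum of runs scored squared and runs allowed squared. The sketch below is a minimal illustration under that assumption; the exponent of 2 and the run totals in the usage note are hypothetical, since the excerpt gives neither the thesis's exponent nor its data.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Estimate a team's winning fraction from its run totals.

    Assumes the classic form RS^x / (RS^x + RA^x) with x = 2 by default;
    the exponent used in the thesis is not stated in this excerpt.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    scored_term = runs_scored ** exponent
    allowed_term = runs_allowed ** exponent
    return scored_term / (scored_term + allowed_term)
```

For example, calling `pythagorean_expectation(800, 700)` with these hypothetical totals gives a winning fraction a little above one half, and equal run totals always give exactly 0.5.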
On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar
FIU Electronic Theses and Dissertations
Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo …
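The two error measures named in this abstract have standard textbook definitions, sketched below in plain Python. This is only a generic illustration: the function names are hypothetical, and the excerpt does not say whether the thesis applies the criteria to fitted responses or to coefficient estimates.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between paired values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed in percent.

    Undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

With the hypothetical inputs `actual = [2.0, 4.0]` and `predicted = [1.0, 5.0]`, the MSE is 1.0 and the MAPE is 37.5 percent.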
Modeling Multimodal Failure Effects Of Complex Systems Using Polyweibull Distribution, Daniel A. Timme
Theses and Dissertations
The Department of Defense (DoD) enlists multiple complex systems across each of its departments. Between the aging systems going through an overhaul and emerging new systems, quality assurance to complete the mission and secure the nation's objectives is an absolute necessity. Remotely Piloted Aircraft (RPA) and the Space Warfighting domain, both of increasing interest to the U.S. Air Force, are current examples of complex systems that must maintain high reliability and sustainability in order to complete missions moving forward. DoD systems continue to grow in complexity with an increasing number of components and parts in more complex arrangements. Bathtub-shaped hazard functions arise …
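One way to see how a bathtub-shaped hazard can arise: a poly-Weibull lifetime is the minimum of independent Weibull lifetimes, so its hazard is the sum of the component Weibull hazards. The sketch below assumes the usual shape and scale parameterization; the two components in the usage note are hypothetical and are not taken from the thesis.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape, and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def poly_weibull_hazard(t, components):
    """Hazard of the minimum of independent Weibull lifetimes.

    `components` is a list of (shape, scale) pairs; the hazards add
    because the failure modes are assumed to compete independently.
    """
    return sum(weibull_hazard(t, shape, scale) for shape, scale in components)
```

With hypothetical components `[(0.5, 1.0), (3.0, 1.0)]`, the first term (shape below 1) dominates early and decreases, while the second (shape above 1) dominates late and increases, so the summed hazard is high near zero, dips in between, and rises again.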
Essentials Of Structural Equation Modeling, Mustafa Emre Civelek
Zea E-Books Collection
Structural Equation Modeling is a statistical method increasingly used in scientific studies in the social sciences. It is currently a preferred analysis method, especially in doctoral dissertations and academic research. However, since many universities do not include this method in the curriculum of undergraduate and graduate courses, students and scholars try to solve the problems they encounter by using various books and internet resources.
This book aims to guide the researcher who wants to use this method in a way that is free of mathematical expressions. It teaches the steps of a research program using structural equation modeling …