Open Access. Powered by Scholars. Published by Universities.®
- Institution
-
- Southern Methodist University (7)
- California Polytechnic State University, San Luis Obispo (5)
- Florida International University (2)
- Misericordia University (2)
- The University of Akron (2)
-
- University of Kentucky (2)
- Chapman University (1)
- Claremont Colleges (1)
- East Tennessee State University (1)
- Georgia Southern University (1)
- Munster Technological University (1)
- Murray State University (1)
- University of Lynchburg (1)
- University of Nebraska - Lincoln (1)
- University of New Hampshire (1)
- University of Southern Maine (1)
- University of Washington Tacoma (1)
- Virginia Commonwealth University (1)
- Washington University in St. Louis (1)
- Western Kentucky University (1)
- Publication Year
- Publication
-
- Statistical Science Theses and Dissertations (6)
- Statistics (4)
- Electronic Theses and Dissertations (2)
- FIU Electronic Theses and Dissertations (2)
- Student Research Poster Presentations 2020 (2)
-
- Williams Honors College, Honors Research Projects (2)
- Access*: Interdisciplinary Journal of Student Research and Scholarship (1)
- CMC Senior Theses (1)
- Department of Mathematics Publications (1)
- Doctor of Business Administration Dissertations (1)
- Honors Theses and Capstones (1)
- Maine Collection (1)
- Master's Theses (1)
- Masters Theses & Specialist Projects (1)
- Murray State Theses and Dissertations (1)
- SMU Data Science Review (1)
- Student Scholar Symposium Abstracts and Posters (1)
- Theses and Dissertations (1)
- Theses and Dissertations--Epidemiology and Biostatistics (1)
- Theses and Dissertations--Statistics (1)
- Undergraduate Theses and Capstone Projects (1)
- Zea E-Books Collection (1)
- Publication Type
Articles 1 - 30 of 34
Full-Text Articles in Statistical Models
Ensemble Classification: An Analysis Of The Random Forest Model, Jarod Korn
Ensemble Classification: An Analysis Of The Random Forest Model, Jarod Korn
Williams Honors College, Honors Research Projects
The random forest model proposed by Dr. Leo Breiman in 2001 is an ensemble machine learning method for classification prediction and regression. In the following paper, we will conduct an analysis on the random forest model with a focus on how the model works, how it is applied in software, and how it performs on a set of data. To fully understand the model, we will introduce the concept of decision trees, give a summary of the CART model, explain in detail how the random forest model operates, discuss how the model is implemented in software, demonstrate the model by …
Differentiation Of Human, Dog, And Cat Hair Fibers Using Dart Tofms And Machine Learning, Laura Ahumada, Erin R. Mcclure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre
Differentiation Of Human, Dog, And Cat Hair Fibers Using Dart Tofms And Machine Learning, Laura Ahumada, Erin R. Mcclure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre
SMU Data Science Review
Hair is found in over 90% of crime scenes and has long been analyzed as trace evidence. However, recent reviews of traditional hair fiber analysis techniques, primarily morphological examination, have cast doubt on its reliability. To address these concerns, this study employed machine learning algorithms, specifically Linear Discriminant Analysis (LDA) and Random Forest, on Direct Analysis in Real Time time-of-flight mass spectra collected from human, cat, and dog hair samples. The objective was to develop a chemistry- and statistics-based classification method for unbiased taxonomic identification of hair. The results of the study showed that LDA and Random Forest were highly …
Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang
Bayesian Statistical Modeling Of Spatially Resolved Transcriptomics Data, Xi Jiang
Statistical Science Theses and Dissertations
Spatially resolved transcriptomics (SRT) quantifies expression levels at different spatial locations, providing a new and powerful tool to investigate novel biological insights. As experimental technologies enhance both in capacity and efficiency, there arises a growing demand for the development of analytical methodologies.
One question in SRT data analysis is to identify genes whose expressions exhibit spatially correlated patterns, called spatially variable (SV) genes. Most current methods to identify SV genes are built upon the geostatistical model with Gaussian process, which could limit the models' ability to identify complex spatial patterns. In order to overcome this challenge and capture more types …
A Comparison Of Confidence Intervals In State Space Models, Jinyu Du
A Comparison Of Confidence Intervals In State Space Models, Jinyu Du
Statistical Science Theses and Dissertations
This thesis develops general procedures for constructing confidence intervals (CIs) of the error disturbance parameters (standard deviations) and transformations of the error disturbance parameters in time-invariant state space models (ssm). With only a set of observations, estimating individual error disturbance parameters accurately in the presence of other unknown parameters in ssm is a very challenging problem. We attempted to construct four different types of confidence intervals, Wald, likelihood ratio, score, and higher-order asymptotic intervals for both the simple local level model and the general time-invariant state space models (ssm). We show that for a simple local level model, both the …
Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile
Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile
Statistical Science Theses and Dissertations
Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …
Dynamic Prediction For Alternating Recurrent Events Using A Semiparametric Joint Frailty Model, Jaehyeon Yun
Dynamic Prediction For Alternating Recurrent Events Using A Semiparametric Joint Frailty Model, Jaehyeon Yun
Statistical Science Theses and Dissertations
Alternating recurrent events data arise commonly in health research; examples include hospital admissions and discharges of diabetes patients; exacerbations and remissions of chronic bronchitis; and quitting and restarting smoking. Recent work has involved formulating and estimating joint models for the recurrent event times considering non-negligible event durations. However, prediction models for transition between recurrent events are lacking. We consider the development and evaluation of methods for predicting future events within these models. Specifically, we propose a tool for dynamically predicting transition between alternating recurrent events in real time. Under a flexible joint frailty model, we derive the predictive probability of …
A Monte Carlo Analysis Of Seven Dichotomous Variable Confidence Interval Equations, Morgan Juanita Dubose
A Monte Carlo Analysis Of Seven Dichotomous Variable Confidence Interval Equations, Morgan Juanita Dubose
Masters Theses & Specialist Projects
Department of Psychological Sciences Western Kentucky University There are two options to estimate a range of likely values for the population mean of a continuous variable: one for when the population standard deviation is known and another for when the population standard deviation is unknown. There are seven proposed equations to calculate the confidence interval for the population mean of a dichotomous variable: normal approximation interval, Wilson interval, Jeffreys interval, Clopper-Pearson, Agresti-Coull, arcsine transformation, and logit transformation. In this study, I compared the percent effectiveness of each equation using a Monte Carlo analysis and the interval range over a range …
Statistical Theory For Specialized Linear Regression Adjustment Methods Compared To Multiple Linear Regression In The Presence And Absence Of Interaction Effects, Leon Su
Theses and Dissertations--Statistics
When building models to investigate outcomes and variables of interest, researchers often want to adjust for other variables. There is a variety of ways that these adjustments are performed. In this work, we will consider four approaches to adjustment utilized by researchers in various fields. We will compare the efficacy of these methods to what we call the ”true model method”, fitting a multiple linear regression model in which adjustment variables are model covariates. Our goal is to show that these adjustment methods have inferior performance to the true model method by comparing model parameter estimates, power, type I error, …
Finding The Best Predictors For Foot Traffic In Us Seafood Restaurants, Isabel Paige Beaulieu
Finding The Best Predictors For Foot Traffic In Us Seafood Restaurants, Isabel Paige Beaulieu
Honors Theses and Capstones
COVID-19 caused state and nation-wide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly as there was an increase in illegal fishing, it is made up of perishable goods, it is seasonal in some places, and imports and exports were slowed. Foot traffic data is useful for business owners to have to know how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …
Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman
Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman
Access*: Interdisciplinary Journal of Student Research and Scholarship
The history of wagering predictions and their impact on wide reaching disciplines such as statistics and economics dates to at least the 1700’s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …
Power Analysis On A Pilot Study Of The Caloric Intake Of Children Helping Prepare Meals Versus Children Not, Danielle Clifford
Power Analysis On A Pilot Study Of The Caloric Intake Of Children Helping Prepare Meals Versus Children Not, Danielle Clifford
Student Research Poster Presentations 2020
The purpose of this analysis is to determine the sample size needed for a study that will be used to discover if there is a difference in the caloric intake of children who help with meal preparation and children who do not help with meal preparation.
Predicting Diabetes Diagnoses, Sarah Netchert
Predicting Diabetes Diagnoses, Sarah Netchert
Student Research Poster Presentations 2020
This study explored the traits and health state of African Americans in central Virginia in order to determine what traits put people at a higher probability of being diagnosed with diabetes. We also want to know which traits will generate the highest probability a person will be diagnosed with diabetes. Traits that were included and used in this study were cholesterol, stabilized glucose, high density lipoprotein levels, age(years), gender, height(inches), weight(pounds), systolic blood pressure, diastolic blood pressure, waist size(inches), and hip size(inches). There were 403 individuals included in study since they were only ones screened for diabetes out of 1,046 …
Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane
Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane
Statistical Science Theses and Dissertations
If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?
We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …
Bayesian Hierarchical Meta-Analysis Of Asymptomatic Ebola Seroprevalence, Peter Brody-Moore
Bayesian Hierarchical Meta-Analysis Of Asymptomatic Ebola Seroprevalence, Peter Brody-Moore
CMC Senior Theses
The continued study of asymptomatic Ebolavirus infection is necessary to develop a more complete understanding of Ebola transmission dynamics. This paper conducts a meta-analysis of eight studies that measure seroprevalence (the number of subjects that test positive for anti-Ebolavirus antibodies in their blood) in subjects with household exposure or known case-contact with Ebola, but that have shown no symptoms. In our two random effects Bayesian hierarchical models, we find estimated seroprevalences of 8.76% and 9.72%, significantly higher than the 3.3% found by a previous meta-analysis of these eight studies. We also produce a variation of this meta-analysis where we exclude …
Analysis Of 2016-17 Major League Soccer Season Data Using Poisson Regression With R, Ian D. Campbell
Analysis Of 2016-17 Major League Soccer Season Data Using Poisson Regression With R, Ian D. Campbell
Undergraduate Theses and Capstone Projects
To the outside observer, soccer is chaotic with no given pattern or scheme to follow, a random conglomeration of passes and shots that go on for 90 minutes. Yet, what if there was a pattern to the chaos, or a way to describe the events that occur in the game quantifiably. Sports statistics is a critical part of baseball and a variety of other of today’s sports, but we see very little statistics and data analysis done on soccer. Of this research, there has been looks into the effect of possession time on the outcome of a game, the difference …
Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia
Developing Statistical Methods For Data From Platforms Measuring Gene Expression, Gaoxiang Jia
Statistical Science Theses and Dissertations
This research contains two topics: (1) PBNPA: a permutation-based non-parametric analysis of CRISPR screen data; (2) RCRnorm: an integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data from FFPE samples.
Clustered regularly-interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single spe- cific algorithm has gained popularity. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms. We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level …
Sabermetrics - Statistical Modeling Of Run Creation And Prevention In Baseball, Parker Chernoff
Sabermetrics - Statistical Modeling Of Run Creation And Prevention In Baseball, Parker Chernoff
FIU Electronic Theses and Dissertations
The focus of this thesis was to investigate which baseball metrics are most conducive to run creation and prevention. Stepwise regression and Liu estimation were used to formulate two models for the dependent variables and also used for cross validation. Finally, the predicted values were fed into the Pythagorean Expectation formula to predict a team’s most important goal: winning.
Each model fit strongly and collinearity amongst offensive predictors was considered using variance inflation factors. Hits, walks, and home runs allowed, infield putouts, errors, defense-independent earned run average ratio, defensive efficiency ratio, saves, runners left on base, shutouts, and walks per …
On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar
On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar
FIU Electronic Theses and Dissertations
Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo …
Essentials Of Structural Equation Modeling, Mustafa Emre Civelek
Essentials Of Structural Equation Modeling, Mustafa Emre Civelek
Zea E-Books Collection
Structural Equation Modeling is a statistical method increasingly used in scientific studies in the fields of Social Sciences. It is currently a preferred analysis method, especially in doctoral dissertations and academic researches. However, since many universities do not include this method in the curriculum of undergraduate and graduate courses, students and scholars try to solve the problems they encounter by using various books and internet resources.
This book aims to guide the researcher who wants to use this method in a way that is free from math expressions. It teaches the steps of a research program using structured equality modeling …
Gilmore Girls And Instagram: A Statistical Look At The Popularity Of The Television Show Through The Lens Of An Instagram Page, Brittany Simmons
Gilmore Girls And Instagram: A Statistical Look At The Popularity Of The Television Show Through The Lens Of An Instagram Page, Brittany Simmons
Student Scholar Symposium Abstracts and Posters
After going on the Warner Brothers Tour in December of 2015, I created a Gilmore Girls Instagram account. This account, which started off as a way for me to create edits of the show and post my photos from the tour turned into something bigger than I ever could have imagined. In just over a year I have over 55,000 followers. I post content including revival news, merchandise, and edits of the show that have been featured in Entertainment Weekly, Bustle, E! News, People Magazine, Yahoo News, & GilmoreNews.
I created a dataset of qualitative and quantitative outcomes from my …
Neural Network Predictions Of A Simulation-Based Statistical And Graph Theoretic Study Of The Board Game Risk, Jacob Munson
Neural Network Predictions Of A Simulation-Based Statistical And Graph Theoretic Study Of The Board Game Risk, Jacob Munson
Murray State Theses and Dissertations
We translate the RISK board into a graph which undergoes updates as the game advances. The dissection of the game into a network model in discrete time is a novel approach to examining RISK. A review of the existing statistical findings of skirmishes in RISK is provided. The graphical changes are accompanied by an examination of the statistical properties of RISK. The game is modeled as a discrete time dynamic network graph, with the various features of the game modeled as properties of the network at a given time. As the network is computationally intensive to implement, results are produced …
An Exploratory Statistical Method For Finding Interactions In A Large Dataset With An Application Toward Periodontal Diseases, Joshua Lambert
An Exploratory Statistical Method For Finding Interactions In A Large Dataset With An Application Toward Periodontal Diseases, Joshua Lambert
Theses and Dissertations--Epidemiology and Biostatistics
It is estimated that Periodontal Diseases effects up to 90% of the adult population. Given the complexity of the host environment, many factors contribute to expression of the disease. Age, Gender, Socioeconomic Status, Smoking Status, and Race/Ethnicity are all known risk factors, as well as a handful of known comorbidities. Certain vitamins and minerals have been shown to be protective for the disease, while some toxins and chemicals have been associated with an increased prevalence. The role of toxins, chemicals, vitamins, and minerals in relation to disease is believed to be complex and potentially modified by known risk factors. A …
Quasi-Random Action Selection In Markov Decision Processes, Samuel D. Walker
Quasi-Random Action Selection In Markov Decision Processes, Samuel D. Walker
Electronic Theses and Dissertations
In Markov decision processes an operator exploits known data regarding the environment it inhabits. The information exploited is learned from random exploration of the state-action space. This paper proposes to optimize exploration through the implementation of quasi-random sequences in both discrete and continuous state-action spaces. For the discrete case a permutation is applied to the indices of the action space to avoid repetitive behavior. In the continuous case sequences of low discrepancy, such as Halton sequences, are utilized to disperse the actions more uniformly.
A Traders Guide To The Predictive Universe- A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz
A Traders Guide To The Predictive Universe- A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz
Doctor of Business Administration Dissertations
At heart every trader loves volatility; this is where return on investment comes from, this is what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key, however if you could also predict future prices with reliability the world would be your oyster. To this end, I have achieved three goals with this dissertation, to develop a model to predict future short term prices (direction and magnitude), to effectively test this by generating consistent profits utilizing a trading model developed for this purpose, and to write a paper that anyone with …
A Multi-Indexed Logistic Model For Time Series, Xiang Liu
A Multi-Indexed Logistic Model For Time Series, Xiang Liu
Electronic Theses and Dissertations
In this thesis, we explore a multi-indexed logistic regression (MILR) model, with particular emphasis given to its application to time series. MILR includes simple logistic regression (SLR) as a special case, and the hope is that it will in some instances also produce significantly better results. To motivate the development of MILR, we consider its application to the analysis of both simulated sine wave data and stock data. We looked at well-studied SLR and its application in the analysis of time series data. Using a more sophisticated representation of sequential data, we then detail the implementation of MILR. We compare …
Design & Analysis Of A Computer Experiment For An Aerospace Conformance Simulation Study, Ryan W. Gryder
Design & Analysis Of A Computer Experiment For An Aerospace Conformance Simulation Study, Ryan W. Gryder
Theses and Dissertations
Within NASA's Air Traffic Management Technology Demonstration # 1 (ATD-1), Interval Management (IM) is a flight deck tool that enables pilots to achieve or maintain a precise in-trail spacing behind a target aircraft. Previous research has shown that violations of aircraft spacing requirements can occur between an IM aircraft and its surrounding non-IM aircraft when it is following a target on a separate route. This research focused on the experimental design and analysis of a deterministic computer simulation which models our airspace configuration of interest. Using an original space-filling design and Gaussian process modeling, we found that aircraft delay assignments …
Black Cloud Randomization Test, Nicholas S. Vanni
Black Cloud Randomization Test, Nicholas S. Vanni
Williams Honors College, Honors Research Projects
The Black Cloud Randomization Test looks at a nontraditional question and attempts to answer the question using unique statistics. The purpose of this paper is to apply what has been learned throughout the years and apply this knowledge to a final project. Data for this project follows an emergency room’s on call schedule, as well as the number of traumas that came in during each day shift. The project builds on what has been already learned and helps to open a different way of working with statistics. The project was coded in the R software. With different restrictions, there are …
Preparedness Of Hospitals In The Republic Of Ireland For An Influenza Pandemic, An Infection Control Perspective, Mary Reidy, Fiona Ryan, Dervla Hogan, Seán Lacey, Claire Buckley
Preparedness Of Hospitals In The Republic Of Ireland For An Influenza Pandemic, An Infection Control Perspective, Mary Reidy, Fiona Ryan, Dervla Hogan, Seán Lacey, Claire Buckley
Department of Mathematics Publications
When an influenza pandemic occurs most of the population is susceptible and attack rates can range as high as 40–50 %. The most important failure in pandemic planning is the lack of standards or guidelines regarding what it means to be ‘prepared’. The aim of this study was to assess the preparedness of acute hospitals in the Republic of Ireland for an influenza pandemic from an infection control perspective.
Statistical Consulting - Senior Project, Cary Hernandez
Statistical Consulting - Senior Project, Cary Hernandez
Statistics
No abstract provided.
An Analysis Of Risk Reduction Choices In Dcis Breast Cancer Patients, Lauren Soltesz
An Analysis Of Risk Reduction Choices In Dcis Breast Cancer Patients, Lauren Soltesz
Statistics
The main focus of this paper was to evaluate possible demographic and clinical characteristics associated with a woman’s choice of breast conserving surgery (BCS), unilateral mastectomy (ULM), or bilateral risk reduction mastectomy (BRRM). The cohort consisted of patients presenting to the City of Hope National Medical Center with ductal carcinoma in situ breast cancer who elected to have cancer directed surgery (N=305). Analyses to examine associations of patient characteristics with type of surgery were conducted using a multinomial logistic regression. Results showed that older women were more likely to choose breast conserving surgery over bilateral risk reduction mastectomy than younger …