Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 14 of 14

Full-Text Articles in Statistics and Probability

Differentiation Of Human, Dog, And Cat Hair Fibers Using DART TOFMS And Machine Learning, Laura Ahumada, Erin R. McClure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre Dec 2023

SMU Data Science Review

Hair is found in over 90% of crime scenes and has long been analyzed as trace evidence. However, recent reviews of traditional hair fiber analysis techniques, primarily morphological examination, have cast doubt on its reliability. To address these concerns, this study employed machine learning algorithms, specifically Linear Discriminant Analysis (LDA) and Random Forest, on Direct Analysis in Real Time time-of-flight mass spectra collected from human, cat, and dog hair samples. The objective was to develop a chemistry- and statistics-based classification method for unbiased taxonomic identification of hair. The results of the study showed that LDA and Random Forest were highly …


Traditional Vs Machine Learning Approaches: A Comparison Of Time Series Modeling Methods, Miguel E. Bonilla Jr., Jason McDonald, Tamas Toth, Bivin Sadler Aug 2023

SMU Data Science Review

In recent years, various new Machine Learning and Deep Learning algorithms have been introduced, claiming to offer better performance than traditional statistical approaches when forecasting time series. Studies seeking evidence to support the usage of ML/DL over statistical approaches have been limited to comparing the forecasting performance of univariate, linear time series data. This research compares the performance of traditional statistical-based and ML/DL methods for forecasting multivariate and nonlinear time series.


Evaluation Of The Effect Of The Clinical-Decision-Support Systems On Diabetes Management: A Multivariate Meta-Analysis Comparison With Univariate Meta-Analysis, Abdelfattah Elbarsha Jan 2021

Electronic Theses and Dissertations

The advantage of meta-analysis lies in its ability to provide a quantitative summary of the findings from multiple studies. The first aim of this dissertation was to conduct a simulation study to understand which factors (sample size, between-study correlation, and percent of missing data) significantly affect meta-analysis estimates and whether univariate and multivariate meta-analysis produce different estimates.

The second goal of this study was to evaluate the effect of clinical decision support systems (CDSS) on diabetes care management by conducting three separate univariate meta-analyses and one multivariate meta-analysis. CDSS are health information …


Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander McShane Jan 2019

Statistical Science Theses and Dissertations

If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?

We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …
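The intransitive relationship described above can be illustrated with a minimal sketch. It assumes a complete set of paired comparisons with no ties, stored in a hypothetical `beats` matrix, and counts circular (rock-scissors-paper) triads in two ways: by enumerating every triple, and by the standard identity that each transitive triad contains exactly one item beating the other two. The function names and the example results are illustrative assumptions, not taken from the thesis.

```python
from itertools import combinations
from math import comb


def count_circular_triads(beats):
    """Count circular triads by checking every triple of items.

    beats[i][j] is True when item i is preferred to (beats) item j.
    Assumes every pair has been compared exactly once, with no ties.
    """
    n = len(beats)
    count = 0
    for i, j, k in combinations(range(n), 3):
        # A triple is circular if i > j > k > i or i > k > j > i.
        forward = beats[i][j] and beats[j][k] and beats[k][i]
        backward = beats[i][k] and beats[k][j] and beats[j][i]
        if forward or backward:
            count += 1
    return count


def circular_triads_from_scores(beats):
    """Count circular triads from win totals alone.

    Every transitive triad has exactly one item that beats the other
    two, so the number of transitive triads is sum(C(a_i, 2)) over the
    win totals a_i; the remaining triads are circular.
    """
    n = len(beats)
    scores = [sum(1 for j in range(n) if j != i and beats[i][j])
              for i in range(n)]
    return comb(n, 3) - sum(comb(a, 2) for a in scores)


# Hypothetical results: Warriors beat Rockets, Rockets beat Spurs,
# and (for illustration only) Spurs beat Warriors.
teams = ["Warriors", "Rockets", "Spurs"]
cyclic = [
    [False, True, False],
    [False, False, True],
    [True, False, False],
]
# The fully transitive alternative: Warriors also beat Spurs.
transitive = [
    [False, True, True],
    [False, False, True],
    [False, False, False],
]

if __name__ == "__main__":
    print(count_circular_triads(cyclic), count_circular_triads(transitive))
```

A count of zero means the observed comparisons are consistent with a single ranking; any positive count signals at least one rock-scissors-paper cycle.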


A 3D Characteristics Database Of Land Engraved Areas With Known Subclass, Entni Lin Jun 2018

Student Theses

Subclass characteristics on bullets may mislead firearm examiners when they rely on traditional 2D images. In order to provide indelible examples for training and help avoid identification errors, 3D topography surface maps and statistical methods of pattern recognition are applied to toolmarks on bullets containing known subclass characteristics. This research was conducted by collecting 3D topography surface map data from land engraved areas of bullets fired through known barrels. This data was processed and used to train the statistical algorithms to predict their origin. The results from the algorithm are compared with the “right answers” (i.e. correct IDs) of the …


Analysis Of 2016-17 Major League Soccer Season Data Using Poisson Regression With R, Ian D. Campbell May 2018

Undergraduate Theses and Capstone Projects

To the outside observer, soccer is chaotic, with no given pattern or scheme to follow: a random conglomeration of passes and shots that goes on for 90 minutes. Yet what if there were a pattern to the chaos, or a way to describe the events that occur in the game quantitatively? Sports statistics is a critical part of baseball and a variety of today’s other sports, but we see very little statistics and data analysis done on soccer. Of this research, there have been investigations into the effect of possession time on the outcome of a game, the difference …


On The Performance Of Some Poisson Ridge Regression Estimators, Cynthia Zaldivar Mar 2018

FIU Electronic Theses and Dissertations

Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy becomes lower when two or more explanatory variables in the model are highly correlated. One solution is to use ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated variables. As performance criteria, we use mean square error (MSE), mean absolute percentage error (MAPE), and percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo …


Essentials Of Structural Equation Modeling, Mustafa Emre Civelek Mar 2018

Zea E-Books Collection

Structural Equation Modeling is a statistical method increasingly used in scientific studies in the social sciences. It is currently a preferred analysis method, especially in doctoral dissertations and academic research. However, since many universities do not include this method in the curriculum of undergraduate and graduate courses, students and scholars try to solve the problems they encounter by using various books and internet resources.

This book aims to guide researchers who want to use this method without relying on mathematical expressions. It teaches the steps of a research program using structural equation modeling …


A Traders Guide To The Predictive Universe- A Model For Predicting Oil Price Targets And Trading On Them, Jimmie Harold Lenz Dec 2016

Doctor of Business Administration Dissertations

At heart, every trader loves volatility; this is where return on investment comes from, and it is what drives the proverbial “positive alpha.” As a trader, understanding the probabilities related to the volatility of prices is key; however, if you could also predict future prices reliably, the world would be your oyster. To this end, I have achieved three goals with this dissertation: to develop a model to predict future short-term prices (direction and magnitude), to effectively test this by generating consistent profits utilizing a trading model developed for this purpose, and to write a paper that anyone with …


Geographic Disparities Associated With Stroke And Myocardial Infarction In East Tennessee, Ashley Pedigo Golden Dec 2011

Doctoral Dissertations

Stroke and myocardial infarction (MI) are serious conditions whose burdens vary by socio-demographic and geographic factors. Although several studies have investigated and identified disparities in burdens of these conditions at the county and state levels, little is known regarding their geographic epidemiology at the neighborhood level. Both conditions require emergency treatments and therefore timely geographic accessibility to appropriate care is critical. Investigation of disparities in geographic accessibility to stroke and MI care and the role of Emergency Medical Services (EMS) in reducing treatment delays are vital in improving health outcomes. Therefore, the objectives of this work were to: (i) classify …


Energy Functional For Nuclear Masses, Michael Giovanni Bertolli Dec 2011

Doctoral Dissertations

An energy functional is formulated for mass calculations of nuclei across the nuclear chart with major-shell occupations as the relevant degrees of freedom. The functional is based on Hohenberg-Kohn theory. Motivation for its form comes from both phenomenology and relevant microscopic systems, such as the three-level Lipkin Model. A global fit of the 17-parameter functional to nuclear masses yields a root-mean-square deviation of χ = 1.31 MeV, on the order of other mass models. The construction of the energy functional includes the development of a systematic method for selecting and testing possible functional terms. Nuclear radii are computed within …


Statistical Analysis Of Fatalities Due To Vehicle Accidents In Las Vegas, NV, Annabelle Marie Mathis Aug 2011

UNLV Theses, Dissertations, Professional Papers, and Capstones

The goal of this thesis is to investigate factors that affect the odds of having a fatality in a vehicle collision. We will be looking at characteristics of the driver that caused the accident (age, gender, behavior, actions, influences, and seat belt worn), the characteristics of the vehicle the driver drove (type of vehicle, and air bag deployment), the characteristics of the environment in which the accident occurred (weather, road condition, lighting, time of day, the day of the week, and month of the year), the characteristics of the crash (direction of accident and how many vehicles were involved), and …


MANOVA: Type I Error Rate Analysis, Kyle Wesley Gasperik Jun 2010

Statistics

Multivariate analysis of variance (MANOVA) is most commonly used in the field of biostatistics. Throughout this paper I conduct numerous simulations that help analyze how robust the MANOVA procedure is to violations of its assumptions. I use Type I error rate as my measure of error and the R software to graph my results. The main assumption examined is the equal covariance matrix assumption, for which we introduce correlation between variables to see how well the MANOVA procedure performs. Overall, 70 simulations were run, and 10 functions were created to perform all of the analysis.


Statistical Analysis Of Texas Holdem Poker, Daniel Bragonier Jun 2010

Statistics

Gathered lifetime online poker data for Mike Linn. Attempted to analyze the data to obtain information to maximize profit. Techniques included univariate analysis, regression analysis, ANOVA, logistic regression, and outlier analysis. After the analysis, nothing of supreme importance or substance was found. Encountered issues with too much statistical power. Results led to plenty of statistical significance, but little practical significance. Results showed that the data did not provide all the answers that were being sought, but there was some value in examining the data in a strict statistical manner.