Open Access. Powered by Scholars. Published by Universities.®

Statistical Methodology Commons

Data Science

Articles 1 - 11 of 11

Full-Text Articles in Statistical Methodology

Examining The Relationship Between Stomiiform Fish Morphology And Their Ecological Traits, Mikayla L. Twiss Dec 2022

All HCAS Student Capstones, Theses, and Dissertations

Trait-based ecology characterizes individuals’ functional attributes to better understand and predict their interactions with other species and their environments. Utilizing morphological traits to describe functional groups has helped group species with similar ecological niches that are not necessarily taxonomically related. Within the deep-pelagic fishes, the Order Stomiiformes exhibits high morphological and species diversity, and many species undertake diel vertical migration (DVM). While the morphology and behavior of stomiiform fishes have been extensively studied and described through taxonomic assessments, the connection between their form and function regarding their DVM types, morphotypes, and daytime depth distributions is not well known. Here, three …


Statistical Roles Of The G-Expectation Framework In Model Uncertainty: The Semi-G-Structure As A Stepping Stone, Yifan Li Oct 2022

Electronic Thesis and Dissertation Repository

The G-expectation framework is a generalization of the classical probability system based on the sublinear expectation to deal with phenomena that cannot be described by a single probabilistic model. These phenomena are closely related to the long-existing concern about model uncertainty in statistics. However, the distributions and independence in the G-framework are quite different from the classical setup. These distinctions bring difficulty when applying the idea of this framework to general statistical practice. Therefore, a fundamental and unavoidable problem is how to better understand G-version concepts from a statistical perspective.

To explore this problem, this thesis establishes a new substructure …


Statistical Extensions Of Multi-Task Learning With Semiparametric Methods And Task Diagnostics, Nikolay Miller Jun 2022

Mathematics & Statistics ETDs

In this dissertation, I propose new approaches to multi-task learning, inspired by statistical model diagnostics and semiparametric and additive modeling. The newly designed additive multi-task model framework allows for flexible estimation of multi-task parametric and nonparametric effects by using an extension of the backfitting algorithm. Further, I propose new methods for statistical task diagnostics, which allow for the identification and remedy of outlier tasks, based on task-specific performance metrics and their empirical distributions. I perform a deep examination of the well-established multi-task kernel method and achieve theoretical and experimental contributions. Lastly, I propose a two-step modeling approach to multi-task modeling, …


A Bayesian Programming Approach To Car-Following Model Calibration And Validation Using Limited Data, Franklin Abodo Jun 2022

FIU Electronic Theses and Dissertations

Traffic simulation software is used by transportation researchers and engineers to design and evaluate changes to roadway networks. Underlying these simulators are mathematical models of microscopic driver behavior from which macroscopic measures of flow and congestion can be recovered. Many models are intended to apply to only a subset of possible traffic scenarios and roadway configurations, while others do not have any explicit constraint on their applicability. Work zones on highways are one scenario for which no model invented to date has been shown to accurately reproduce realistic driving behavior. This makes it difficult to optimize for safety and other …


Adjusting Community Survey Data Benchmarks For External Factors, Allen Miller, Nicole M. Norelli, Robert Slater, Mingyang N. Yu Jun 2022

SMU Data Science Review

Using U.S. resident survey data from the National Community Survey in combination with public data from the U.S. Census and additional sources, a Voting Regressor Model was developed to establish fair benchmark values for city performance. These benchmarks were adjusted for characteristics the city cannot easily influence that contribute to confidence in local government, such as population size, demographics, and income. This adjustment allows for a more meaningful comparison and interpretation of survey results among individual cities. Methods explored for the benchmark adjustment included cluster analysis, anomaly detection, and a variety of regression techniques, including random forest, ridge, decision …


Attempting To Predict The Unpredictable: March Madness, Coleton Kanzmeier May 2022

Theses/Capstones/Creative Projects

Each year, millions upon millions of individuals fill out at least one, if not hundreds, of March Madness brackets. People test their luck every year, whether for fun, with friends or family, or even to win some money. Some people rely on their basketball knowledge, whereas others know it is called March Madness for a reason and take a shot in the dark. Others have even tried using statistics to give them an edge. I intend to follow a similar approach, using statistics to my advantage. The end goal is to predict the 2022 March Madness bracket. To achieve …


Physical Investigation Of Downburst Winds And Applicability To Full Scale Events, Federico Canepa Feb 2022

Electronic Thesis and Dissertation Repository

Thunderstorm winds, i.e., downbursts, are cold descending currents originating from cumulonimbus clouds which, upon impingement on the ground, spread radially with high intensities. The downdraft phase of the storm and the subsequent radial outflow can cause major issues for aviation and immense damage to ground-mounted structures. Thunderstorm winds present characteristics completely different from those of the stationary Gaussian synoptic winds, which largely affect the mid-latitude areas of the globe in the form of extra-tropical cyclones. Downbursts are very localized winds in both space and time. It follows that their statistical investigation, by means of classical full-scale anemometric …


A Predictive Model To Predict Cyberattack Using Self-Normalizing Neural Networks, Oluwapelumi Eniodunmo Jan 2022

Theses, Dissertations and Capstones

Cyberattacks are a never-ending war that has greatly threatened secured information systems. The development of automated and intelligent systems gives hackers more computing power to steal information and destroy data or system resources, and has raised global security issues. Statistical and data mining tools have received continuous research and improvements. These tools have been adopted to create sophisticated intrusion detection systems that help information systems mitigate and defend against cyberattacks. However, advances in technology and the accessibility of information expose more identifiable elements that can be used to gain unauthorized access to systems and resources. Data mining and classification tools …


Exploring Cyberterrorism, Topic Models And Social Networks Of Jihadists Dark Web Forums: A Computational Social Science Approach, Vivian Fiona Guetler Jan 2022

Graduate Theses, Dissertations, and Problem Reports

This three-article dissertation focuses on cyber-related topics concerning terrorist groups, specifically Jihadists’ use of technology, the application of natural language processing, and social networks in analyzing text data derived from terrorists’ Dark Web forums. The first article explores cybercrime and cyberterrorism. As technology progresses, it facilitates new forms of behavior, including tech-related crimes known as cybercrime and cyberterrorism. In this article, I provide an analysis of the problems of cybercrime and cyberterrorism within the field of criminology by reviewing existing literature focusing on (a) the issues in defining terrorism, cybercrime, and cyberterrorism, (b) ways that cybercriminals commit a crime in …


Analysis Of Minor League Rule Changes Effect On Stolen Bases, Zachary Houghtaling Jan 2022

Williams Honors College, Honors Research Projects

This study uses various statistical analyses to evaluate the justification of rule changes for Major League Baseball that were implemented within the Minor Leagues during the 2021 minor league season. The primary focus of the study is predicting how some of these Minor League rule changes could affect the stolen base success rate and the number of attempts per game within the Major Leagues. A survey was conducted to evaluate how fans feel about stolen bases within the current game and if rules should be altered to increase the number of stolen bases that occur. Additionally, recorded Major and Minor …


Statistical Theory For Specialized Linear Regression Adjustment Methods Compared To Multiple Linear Regression In The Presence And Absence Of Interaction Effects, Leon Su Jan 2022

Theses and Dissertations--Statistics

When building models to investigate outcomes and variables of interest, researchers often want to adjust for other variables. These adjustments are performed in a variety of ways. In this work, we will consider four approaches to adjustment utilized by researchers in various fields. We will compare the efficacy of these methods to what we call the “true model method”: fitting a multiple linear regression model in which adjustment variables are model covariates. Our goal is to show that these adjustment methods have inferior performance to the true model method by comparing model parameter estimates, power, type I error, …