Open Access. Powered by Scholars. Published by Universities.®

Statistical Methodology Commons

1,080 Full-Text Articles 1,470 Authors 549,762 Downloads 117 Institutions

All Articles in Statistical Methodology

1,080 full-text articles. Page 1 of 36.

Statistical Methods To Generate Artificial Slot Floor Data For The Advancement Of Casino Related Research, Courtney Bonner, Anastasia (Stasi) D. Baran, Jason D. Fiege, Saman Muthukumarana 2023 nQube Data Science Inc.

International Conference on Gambling & Risk Taking

Abstract:

A common difficulty when researching gambling topics is the limited availability of high-quality data sets for development and testing. Due to the high level of secrecy within the gambling industry, data obtained for research purposes are often prohibitively obfuscated, incomplete, or aggregated. Although such data have allowed academic work to advance, they leave both researchers and readers wondering what would be possible if more detailed data sets were available. To mitigate the paucity of data available to researchers, we present a Markov chain-based statistical process for producing artificial event data for a simulated …


Constrained Optimization Based Adversarial Example Generation For Transfer Attacks In Network Intrusion Detection Systems, Marc Chale, Bruce Cox, Jeffery Weir, Nathaniel D. Bastian 2023 Army Cyber Institute, U.S. Military Academy

ACI Journal Articles

Deep learning has enabled network intrusion detection rates as high as 99.9% for malicious network packets without requiring feature engineering. Adversarial machine learning methods have been used to evade classifiers in the computer vision domain; however, existing methods do not translate well into the constrained cyber domain as they tend to produce non-functional network packets. This research views the payload of network packets as code with many functional units. A meta-heuristic based generative model is developed to maximize classification loss of packet payloads with respect to a surrogate model by repeatedly substituting units of code with functionally equivalent counterparts. The …


Analytical Approach For Monitoring The Behavior Of Patients With Pancreatic Adenocarcinoma At Different Stages As A Function Of Time, Aditya Chakaborty Dr, Chris P. Tsokos Dr 2023 Eastern Virginia Medical School

Biology and Medicine Through Mathematics Conference

No abstract provided.


Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile 2023 Southern Methodist University

Statistical Science Theses and Dissertations

Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …


Movie Recommender System Using Matrix Factorization, Roland Fiagbe 2023 University of Central Florida

Data Science and Data Mining

Recommendation systems are a popular and useful tool that can help people make informed decisions automatically. They assist users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. Our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …


Brief Review: Low Frequency Event Charts (G-Charts) In Healthcare, James Espinosa, David Ho, Alan Lucerna, Henry Schuitema 2023 Rowan University

Stratford Campus Research Day

The ability to determine if a change in a system is actually an improvement—or worsening in function—is one of the essential desiderata of quality improvement efforts. There are many ways to look at the issue. A special problem occurs when the event being studied is low frequency by nature. By way of example, patient falls in a given hospital or division of a hospital may occur in a way that is low frequency—yet each event is important. Process engineering has developed an approach to low frequency events. Part of this approach may involve specialized charts that look at the “time-between-events”—as …


A Monte Carlo Analysis Of Nonprobability Sampling & Post Hoc Corrections, Julia Hong 2023 Western Kentucky University

Masters Theses & Specialist Projects

Nonprobability samples are often used in place of probability samples because the former are less trouble and less expensive. Unfortunately, it is difficult to determine how well a sample represents population parameters when using nonprobability samples. Researchers attempt to mitigate the disadvantages of nonprobability sampling by performing post hoc corrections, but this adjustment may not successfully undo the effects of nonprobability sampling. To examine these effects, a Monte Carlo simulation was conducted to create a pseudo-population from which samples were drawn. Forty-one conditions were replicated 10,000 times each, with each sample consisting of 100 observations. A post-stratification adjustment was made …


Distance Correlation Based Feature Selection In Random Forest, Jose Munoz-Lopez 2023 California State University - San Bernardino

Electronic Theses, Projects, and Dissertations

The Pearson correlation coefficient is a commonly used measure of correlation, but it has limitations as it only measures the linear relationship between two numerical variables. In 2007, Szekely et al. introduced the distance correlation, which measures all types of dependencies between random vectors X and Y in arbitrary dimensions, not just the linear ones. In this thesis, we propose a filter method that utilizes distance correlation as a criterion for feature selection in Random Forest regression. We conduct extensive simulation studies to evaluate its performance compared to existing methods under various data settings, in terms of the prediction mean …
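The contrast between Pearson correlation and distance correlation described in this abstract can be illustrated with a small sketch. This is not the thesis's code: it assumes NumPy and uses the standard double-centered distance matrix form of the sample distance correlation. The function name `distance_correlation` is a hypothetical helper chosen for illustration.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between x and y (hypothetical helper).

    x and y hold n paired observations; each may be 1-D or an (n, d) array.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(len(x), -1)
    y = y.reshape(len(y), -1)

    # Pairwise Euclidean distance matrices within x and within y.
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)

    # Double centering: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2 = (A * B).mean()    # squared sample distance covariance
    dvar_x = (A * A).mean()   # squared sample distance variance of x
    dvar_y = (B * B).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


if __name__ == "__main__":
    # A purely nonlinear dependence: y is a deterministic function of x,
    # yet the symmetric design makes the linear (Pearson) correlation vanish.
    x = np.linspace(-1.0, 1.0, 201)
    y = x ** 2
    print("Pearson:", np.corrcoef(x, y)[0, 1])
    print("Distance correlation:", distance_correlation(x, y))
```

In a filter-style feature selection of the kind the abstract mentions, a statistic like this would be computed between each candidate feature and the response. How the thesis ranks or thresholds the features is not stated in the excerpt above.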


Employee Attrition: Analyzing Factors Influencing Job Satisfaction Of IBM Data Scientists, Graham Nash 2023 Kennesaw State University

Symposium of Student Scholars

Employee attrition is a relevant issue that every business employer must consider when gauging the effectiveness of their employees. Whether or not an employee chooses to leave their job can depend on a multitude of factors. As a result, employers need to develop methods by which they can measure attrition from several characteristics of their employees. Factors such as age, years with the company, department, level of education, job role, and even marital status are all considered by employers to assist in predicting employee attrition. This project will be analyzing a …


Prevalence Of SARS-CoV-2 Antibodies In Liberty University Student Population, Emily Bonus 2023 Liberty University

Senior Honors Theses

In 2020, the virus SARS-CoV-2 gained attention as it spread around the world. Its antibodies are poorly understood, and little research focuses on those with few COVID-19 complications yet large numbers of close contacts: university students. This longitudinal study recorded SARS-CoV-2 antibody presence in 107 undergraduate Liberty University students twice during early 2021. After extensive data cleaning and the application of various statistical tests and ANOVAs, the data seem to show that in the case of COVID-19 infections, SARS-CoV-2 IgM antibodies are immediately produced, and then IgG antibodies follow later. However, the COVID-19 vaccine causes the production of both IgM …


Influence Diagnostics For Generalized Estimating Equations Applied To Correlated Categorical Data, Louis Vazquez 2023 Southern Methodist University

Statistical Science Theses and Dissertations

Influence diagnostics in regression analysis allow analysts to identify observations that have a strong influence on model fitted probabilities and parameter estimates. The most common influence diagnostics, such as Cook’s Distance for linear regression, are based on a deletion approach where the results of a model with and without observations of interest are compared. Here, deletion-based influence diagnostics are proposed for generalized estimating equations (GEE) for correlated, or clustered, nominal multinomial responses. The proposed influence diagnostics focus on GEEs with the baseline-category logit link function and a local odds ratio parameterization of the association structure. Formulas for both observation- and …
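The deletion approach mentioned in this abstract can be sketched for the ordinary linear regression case only. The following is an illustration of Cook's Distance computed by literally refitting the model without each observation. It is not the GEE diagnostics proposed in the dissertation. It assumes NumPy, and the function name `cooks_distance_by_deletion` is hypothetical.

```python
import numpy as np


def cooks_distance_by_deletion(X, y):
    """Cook's Distance for each row of a linear model, via explicit refits.

    X is an (n, p) design matrix (include a column of ones for an intercept);
    y is the length-n response. Hypothetical helper for illustration.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Full-data least-squares fit and residual variance estimate.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)

    distances = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Refit with observation i deleted.
        beta_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        # Compare fitted values from the two fits over all n rows.
        shift = X @ (beta - beta_i)
        distances[i] = shift @ shift / (p * s2)
    return distances


if __name__ == "__main__":
    # Toy data: a nearly exact line with one shifted point at the end.
    x = np.arange(10, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    y = 2.0 * x + 1.0 + 0.1 * (-1.0) ** np.arange(10)
    y[9] += 20.0
    print(cooks_distance_by_deletion(X, y))
```

Refitting n times is deliberately naive. For linear regression the same quantity has a closed form in terms of residuals and leverages, which is what one would normally use in practice.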


That’s My Deity: An Examination Of Online Lokean Cultures Through Log-Linear Modeling, Mary Bernstein 2023 University of South Carolina - Columbia

Senior Theses

A rise in online religious communities and the growth of so-called ‘Old World’ religions are reflected in the internet’s subcultures of Neopaganism, a growing religious movement that has been documented in America since the 1960s. The religions under this umbrella movement vary drastically and include belief systems such as Wicca, Druidry, and deity worship. Belief systems under this movement lack the traditional hierarchy found in structured religion and lack a singular sacred text. As such, believers usually find and support one another not through a physical sacred place of meeting, but through an online community that acts as sacred space. …


Unlocking Potential: The School-To-Prison Pipeline For Students With Disabilities, Navena F. Chaitoo 2023 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

This research uses quasi-experimental, matched sampling to examine the school-to-prison pipeline for students with disabilities using data from the National Longitudinal Study of Adolescent to Adult Health. This study presents novel insights into an at-risk group that has faced disproportionate rates of school discipline and incarceration. The study finds school suspension to be associated with future involvement in the criminal legal system and lower educational attainment. Disability was not found to mediate the relationship between suspension and future involvement in the criminal legal system or the relationship between suspension and academic outcomes. However, disability was found to be a statistically …


Forecasting Remission Time Of A Treatment Method For Leukemia As An Application To Statistical Inference Approach, Mahmoud Mansour, Rashad El-Sagheer, Ahmed Galal Attia, Beha S. El-Desouky Prof. 2023 Al-Azhar University - Egypt

Basic Science Engineering

In this paper, the Weibull-Linear Exponential distribution (WLED) is investigated to determine whether it fits a set of real clinical data well. These data represent the duration of remission achieved by a certain drug used in the treatment of leukemia for a group of patients. A statistical inference approach is used to estimate the parameters of the WLED from the fitted data. The estimated parameters are used to evaluate the survival and hazard functions and hence to assess the treatment method by forecasting the remission times of patients. A two-sample prediction approach has been applied to …


Modeling And Fitting Two-Way Tables Containing Outliers, David L. Farnsworth 2023 Rochester Institute of Technology

Articles

A model is proposed for two-way tables of measurement data containing outliers. The two independent variables are categorical and error free. Neither missing values nor replication are present. The model consists of the sum of a customary additive part that can be fit using least squares and a part that is composed of outliers. Recommendations are made for methods for identifying cells containing outliers and for fitting the model. A graph of the observations is used to determine the outliers’ locations. For all cells containing an outlier, replacement values are determined simultaneously using a classical missing-data tool. The result is …


Biasing Estimator To Mitigate Multicollinearity In Linear Regression Model, Abdulrasheed Bello Badawaire, Issam Dawoud, Adewale Folaranmi Lukman, Victoria Laoye, Arowolo Olatunji 2023 Department of Mathematics and Statistics, Federal University Wukari, Wukari, Nigeria

Al-Bahir Journal for Engineering and Pure Sciences

A new two-parameter estimator was developed to combat the threat of multicollinearity for the linear regression model. Some necessary and sufficient conditions for the dominance of the proposed estimator over ordinary least squares (OLS) estimator, ridge regression estimator, Liu estimator, KL estimator, and some two-parameter estimators are obtained in the matrix mean square error sense. Theory and simulation results show that, under some conditions, the proposed two-parameter estimator consistently dominates other estimators considered in this study. The real-life application result follows suit.


Informative Hypothesis For Group Means Comparison, Dr. Teck Kiang Tan 2023 National University of Singapore

Practical Assessment, Research, and Evaluation

Researchers often have hypotheses concerning the state of affairs in the population from which they sampled their data to compare group means. The classical frequentist approach provides one way of carrying out hypothesis testing: using ANOVA to test the null hypothesis that there is no difference in the means, and proceeding with multiple comparisons if the null hypothesis is rejected. As this approach cannot incorporate order, inequality, and direction into hypothesis testing, nor specify multiple hypotheses, this paper introduces the informative hypothesis, which allows more flexibility in stating hypotheses and is …


Joint Probability Analysis Of Extreme Precipitation And Water Level For Chicago, Illinois, Anna Li Holey 2023 Michigan Technological University

Dissertations, Master's Theses and Master's Reports

A compound flooding event occurs when two or more extreme factors happen simultaneously or in quick succession and lead to flooding. In the Great Lakes region, it is common for a compound flooding event to occur with a high lake water level and heavy rainfall. With the potential for rising water levels and increased precipitation under climate change, the Great Lakes coastal regions could be at risk for more frequent and severe flooding. The City of Chicago, which is located on Lake Michigan, has a high population and dense infrastructure and …


A Bootstrap Test For Informative Intra-Cluster Group Sizes In Clustered Data, Hasika K. Wickrama Senevirathne, Sandipan Dutta 2023 Old Dominion University

College of Sciences Posters

Clustered data are frequently observed in various domains of scientific and social studies. In typical clustered data, units within a cluster are correlated while units in different clusters are independent. An example of such clustered data can be found in dental studies, where individuals are treated as clusters and the teeth in an individual are the units within a cluster. In analyzing such clustered data, it has been observed that the number of units present in a cluster can be informative, in the sense of being associated with the outcome from that cluster. Specifically, when the aim is to compare …


Classification Of Adult Income Using Decision Tree, Roland Fiagbe 2023 University of Central Florida

Data Science and Data Mining

A decision tree is a commonly used data mining method for performing classification tasks. It is a tree-based supervised machine learning algorithm that classifies or makes predictions by following a path determined by how successive questions are answered. Generally, the decision tree algorithm categorizes data into branch-like segments that develop into a tree that contains a root, nodes, and leaves. This project seeks to explore the decision tree methodology and apply it to the Adult Income dataset from the UCI Machine Learning Repository, to determine whether a person makes over 50K per year and determine the necessary factors that improve …

