Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 30

Full-Text Articles in Physical Sciences and Mathematics

Code Syntax Understanding In Large Language Models, Cole Granger May 2024

Undergraduate Honors Theses

In recent years, tasks for automated software engineering have been achieved using Large Language Models trained on source code, such as Seq2Seq, LSTM, GPT, T5, BART and BERT. The inherent textual nature of source code allows it to be represented as a sequence of sub-words (or tokens), drawing parallels to prior work in NLP. Although these models have shown promising results according to established metrics (e.g., BLEU, CODEBLEU), there remains a deeper question about the extent of syntax knowledge they truly grasp when trained and fine-tuned for specific tasks.

To address this question, this thesis introduces a taxonomy of syntax …


Security And Interpretability In Large Language Models, Lydia Danas May 2024

Undergraduate Honors Theses

Large Language Models (LLMs) have the capability to model long-term dependencies in sequences of tokens, and are consequently often utilized to generate text through language modeling. These capabilities are increasingly being used for code generation tasks; however, LLM-powered code generation tools such as GitHub's Copilot have been generating insecure code and thus pose a cybersecurity risk. To generate secure code we must first understand why LLMs are generating insecure code. This non-trivial task can be realized through interpretability methods, which investigate the hidden state of a neural network to explain model outputs. A new interpretability method is rationales, which obtains …


Roads And Corresponding Travel Time To Markets: Assessing Climate Vulnerability In Nepal, Kaitlyn Crowley May 2024

Undergraduate Honors Theses

Roads exist as a physical and theoretical connection between people and places around the globe. In addition to providing a route from one point to another, roads are also an indicator of access to markets and of poverty. However, current road datasets, particularly the Global Roads Open Access Data Set, are out of date or incomplete, necessitating new sources of data for analyses involving road networks. This study explores the relationship between climate change and access to markets in Nepal. We seek to identify isolated communities that are likely to experience detrimental outcomes associated with environmental threats, such as increasing …


Transcriptional Dynamics During Rhodococcus Erythropolis Infection With Phage WC1, Dana Willner, Sudip Paudel, Andrew D. Halleran, Grace E. Solini, Veronica Gray, Margaret Saha Apr 2024

Arts & Sciences Articles

Background

Belonging to the Actinobacteria phylum, members of the Rhodococcus genus thrive in soil, water, and even intracellularly. While most species are non-pathogenic, several cause respiratory disease in animals and, more rarely, in humans. Over 100 phages that infect Rhodococcus species have been isolated, but despite their importance for Rhodococcus ecology and biotechnology applications, little is known regarding the molecular genetic interactions between phage and host during infection. To address this need, we report RNA-Seq analysis of a novel Rhodococcus erythropolis phage, WC1, analyzing both the phage and host transcriptome at various stages throughout the infection process.

Results

By five …


Artificial Intelligence For The Electron Ion Collider (AI4EIC), C. Allaire, ..., Cristiano Fanelli, James Giroux, Joey Niestroy, Justin R. Stevens, Patrick Stone, L. Suarez, K. Suresh, Eric Walter, Et Al. Feb 2024

Arts & Sciences Articles

The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. This workshop is not only beneficial for the EIC, but also provides valuable insights for the newly established ePIC collaboration at EIC. …


ELUQuant: Event-Level Uncertainty Quantification In Deep Inelastic Scattering, Cristiano Fanelli, James Giroux Jan 2024

Arts & Sciences Articles

We introduce a physics-informed Bayesian neural network with flow-approximated posteriors using multiplicative normalizing flows for detailed uncertainty quantification (UQ) at the physics event-level. Our method is capable of identifying both heteroskedastic aleatoric and epistemic uncertainties, providing granular physical insights. Applied to deep inelastic scattering (DIS) events, our model effectively extracts the kinematic variables $x$, $Q^2$, and $y$, matching the performance of recent deep learning regression techniques but with the critical enhancement of event-level UQ. This detailed description of the underlying uncertainty proves invaluable for decision-making, especially in tasks like event filtering. It also allows for the reduction of true inaccuracies …


Parameter Estimation For Patient Enrollment In Clinical Trials, Junyan Liu Dec 2023

Undergraduate Honors Theses

In this paper, we study the Poisson-gamma model for recruitment time in clinical trials. We proved several properties of this model that match our intuitions from a reliability perspective, did simulations on this model, and used different optimization methods to estimate the parameters. Although the behaviors of the optimization methods were unfavorable and unstable, we identified certain conditions and provided potential explanations for this phenomenon and further insights into the Poisson-gamma model.
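As a rough illustration of the Poisson-gamma setup named in this abstract, the sketch below simulates enrollment counts and recovers the parameters with a simple method-of-moments estimator. It is not the thesis's procedure: the abstract says optimization methods were used, and the moment estimator here is a swapped-in, simpler technique. The function names, the shape/rate parameterization (`alpha`, `beta`), and the common observation window `t` are all assumptions made for the example.

```python
import random
from statistics import mean, pvariance


def simulate_enrollment(n_centers, alpha, beta, t, rng):
    """Simulate per-center enrollment counts over a window of length t.

    Assumed model (illustrative): each center has its own recruitment rate
    lam ~ Gamma(shape=alpha, rate=beta), and patients arrive at that center
    as a Poisson process with rate lam.
    """
    counts = []
    for _ in range(n_centers):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        lam = rng.gammavariate(alpha, 1.0 / beta)
        # Poisson process: exponential gaps between successive arrivals.
        elapsed = rng.expovariate(lam)
        n = 0
        while elapsed <= t:
            n += 1
            elapsed += rng.expovariate(lam)
        counts.append(n)
    return counts


def moment_estimates(counts, t):
    """Method-of-moments estimates of (alpha, beta).

    Under the assumed model, E[N] = alpha * t / beta and
    Var[N] = E[N] + alpha * t**2 / beta**2, so
    alpha = E[N]**2 / (Var[N] - E[N]) and beta = alpha * t / E[N].
    """
    m = mean(counts)
    excess = pvariance(counts) - m
    if excess <= 0:
        raise ValueError("counts show no overdispersion; estimator undefined")
    alpha_hat = m * m / excess
    beta_hat = alpha_hat * t / m
    return alpha_hat, beta_hat


if __name__ == "__main__":
    rng = random.Random(0)
    counts = simulate_enrollment(10_000, alpha=2.0, beta=0.5, t=10.0, rng=rng)
    print(moment_estimates(counts, t=10.0))
```

The estimator depends on the counts being overdispersed relative to a plain Poisson model, so it fails outright when the sample variance does not exceed the sample mean. That is one simple way an estimation procedure for this model can be fragile on small samples.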


A Language Framework For Modeling Social Media Account Behavior, Alexander C. Nwala, Alessandro Flammini, Filippo Menczer Aug 2023

Arts & Sciences Articles

Malicious actors exploit social media to inflate stock prices, sway elections, spread misinformation, and sow discord. To these ends, they employ tactics that include the use of inauthentic accounts and campaigns. Methods to detect these abuses currently rely on features specifically designed to target suspicious behaviors. However, the effectiveness of these methods decays as malicious behaviors evolve. To address this challenge, we propose a language framework for modeling social media account behaviors. Words in this framework, called BLOC, consist of symbols drawn from distinct alphabets representing user actions and content. Languages from the framework are highly flexible and can be …


Seeing What We Can't: Evaluating Implicit Biases In Deep Learning Satellite Imagery Models Trained For Poverty Prediction, Joseph O'Brien May 2023

Undergraduate Honors Theses

Previous studies have sought to use Convolutional Neural Networks for regional estimation of poverty levels. However, there is limited research into possible implicit biases in deep neural networks in the context of satellite imagery. In this work, we develop a deep learning model to predict the tertile of per-capita asset consumption, trained on satellite imagery and World Bank Living Standards Measurements Study data. Using satellite imagery collected via survey location data as inputs, we use transfer learning to train a VGG-16 Convolutional Neural Network to classify images based on per-capita consumption. The model achieves an $R^2$ of .74, using thousands …


A Satellite Imagery Approach To Estimating Migratory Flows In Guatemala Using Convolutional Neural Networks, Sarah Larimer May 2023

Undergraduate Honors Theses

Being able to predict migratory flows is important in ensuring political, social, and economic stability. In the wake of violence, unrest, natural disasters, and social pressures, millions of migrants have fled Central America in search of a better life. However, due to the infrequent nature and high cost of census data, there is a need for more remote and up-to-date approaches. Convolutional Neural Networks offer a computer-vision-based approach that is cheaper and has significantly less lag. In this study, we seek to evaluate the effectiveness of different convolutional neural networks in predicting …


Identifying Social Media Users That Are Susceptible To Phishing Attacks, Zoe Metzger May 2023

Undergraduate Honors Theses

Phishing scams are a billion-dollar problem. According to Threatpost, in 2020, business email compromise phishing attacks cost the US economy $1.8 billion. Social media phishing scams are also on the rise, with 74% of companies experiencing social media attacks in 2021 according to Proofpoint. Educating users about phishing scams is an effective strategy for reducing phishing attacks. Despite efforts to combat phishing, the number of attacks continues to rise, likely indicative of a reticence of users to change online behaviors. Existing research into predicting which social media users are susceptible to phishing mostly focuses on content analysis of …


Considering The Accuracy Of Fiat Boundaries: Ontology And Quantification, Lydia Troup May 2023

Undergraduate Honors Theses

Administrative boundaries - i.e., states, counties, or districts - are fiat boundaries; they exist purely as defined by human interpretation. Because of this, and despite their critical importance to government functions, the accuracy of data products claiming to represent such boundaries is difficult to measure. Here, I explore this topic using three boundary data sets: the open source geoBoundaries data set, the humanitarian UN OCHA’s Common Operational Datasets (COD), and Esri’s commercial administrative divisions 0 and 1 data sets in the Living Atlas. The accuracy of each was quantified as the percent overlap between each data set and an authoritative …


Predicting Micronutrient Deficiency With Publicly Available Satellite Data, Elizabeth Bondi-Kelly, Haipeng Chen, Christopher D. Golden, Nikhil Behari, Milind Tambe Mar 2023

Arts & Sciences Articles

Micronutrient deficiency (MND), which is a form of malnutrition that can have serious health consequences, is difficult to diagnose in early stages without blood draws, which are expensive and time-consuming to collect and process. It is even more difficult at a public health scale seeking to identify regions at higher risk of MND. To provide data more widely and frequently, we propose an accurate, scalable, low-cost, and interpretable regional-level MND prediction system. Specifically, our work is the first to use satellite data, such as forest cover, weather, and presence of water, to predict deficiency of micronutrients such as iron, Vitamin …


'Flux+Mutability': A Conditional Generative Approach To One-Class Classification And Anomaly Detection, Cristiano Fanelli, James Giroux, Z. Papandreou Nov 2022

Arts & Sciences Articles

Anomaly Detection is becoming increasingly popular within the experimental physics community. At experiments such as the Large Hadron Collider, anomaly detection is growing in interest for finding new physics beyond the Standard Model. This paper details the implementation of a novel Machine Learning architecture, called Flux+Mutability, which combines cutting-edge conditional generative models with clustering algorithms. In the 'flux' stage we learn the distribution of a reference class. The 'mutability' stage at inference addresses if data significantly deviates from the reference class. We demonstrate the validity of our approach and its connection to multiple problems spanning from one-class classification to anomaly …


Deep Learning Fusion Of Satellite And Social Information To Estimate Human Migratory Flows, Daniel Runfola, Heather Baier, Laura Mills, Maeve Naughton-Rockwell, Anthony Stefanidis Sep 2022

Arts & Sciences Articles

Human migratory decisions are driven by a wide range of factors, including economic and environmental conditions, conflict, and evolving social dynamics. These factors are reflected in disparate data sources, including household surveys, satellite imagery, and even news and social media. Here, we present a deep learning-based data fusion technique integrating satellite and census data to estimate migratory flows from Mexico to the United States. We leverage a three-stage approach, in which we (1) construct a matrix-based representation of socioeconomic information for each municipality in Mexico, (2) implement a convolutional neural network with both satellite imagery and the constructed …


Population Genetics Of Transposable Element Load: A Mechanistic Account Of Observed Overdispersion, Gregory Conradi Smith, Ron D. Smith, Joshua Puzey Jul 2022

Arts & Sciences Articles

In an empirical analysis of transposable element (TE) abundance within natural populations of Mimulus guttatus and Drosophila melanogaster, we found a surprisingly high variance of TE count (e.g., variance-to-mean ratio on the order of 10 to 300). To obtain insight regarding the evolutionary genetic mechanisms that underlie the overdispersed population distributions of TE abundance, we developed a mathematical model of TE population genetics that includes the dynamics of element proliferation and purifying selection on TE load. The modeling approach begins with a master equation for a birth-death process and extends the predictions of the classical theory of TE dynamics in …


Using Deep Learning With Satellite Imagery To Estimate Deforestation Rates, Maeve Naughton-Rockwell May 2022

Undergraduate Honors Theses

Previous studies have used Convolutional Neural Networks for regional detection of deforestation breaks. However, there is limited research into the capability of deep neural networks to identify sudden shifts in global forest cover from satellite imagery. Additionally, many deforestation detection models are trained on region-specific data and need manual input thresholds. In this work, we develop a deep learning model to predict the percent of deforestation in a region between two points in time, trained on globally sourced data. Using the before and after satellite images of a deforestation event as inputs, we implemented a two-input Convolutional Neural …


Using A Machine Learning Model To Predict Plant Inflorescences Based Upon Its Soil Microbiome, Luke Denoncourt May 2022

Undergraduate Honors Theses

The UN estimates that the global population could reach 9.7 billion by 2050 (United Nations). As a result, the amount of food required to feed humanity is thought to double by 2050 (Ray et al., 2012). Humanity must find a way to increase crop production without increasing fertilizer usage and eutrophication, which can be done using the soil microbiome. Using potted plants with soils inoculated with Pseudomonas alcaligenes, Pseudomonas denitrificans, Bacillus polymyxa, and Mycobacterium phlei, both the shoot and root growth of pea and cotton plants was significantly increased (Egamberdieva & Höflich, 2004). In this study, utilizing a random forest …


The Pandemic From Above: Estimating COVID-19 Cases Using Deep Learning And Satellite Imagery, John Hennin Apr 2022

Undergraduate Honors Theses

Monitoring the spread of an outbreak of disease (such as COVID-19) is an important component of any coordinated pandemic response. Across the globe, our ability to conduct such monitoring - especially at early stages of the COVID-19 pandemic - was highly limited due to a lack of public reporting mechanisms. Today, the process of case data collection remains expensive and, in some regions, is subject to political considerations. Researchers have turned to some techniques leveraging Google Trends and Twitter data to overcome limitations in public data sources. Here, we provide another approach which leverages satellite information to provide estimates …


Machine Learning In Healthcare: Improving The Diagnosis Of Pulmonary Embolism In COVID-19 Patients, Soheb Osmani Apr 2022

Undergraduate Honors Theses

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has created new challenges for clinicians diagnosing pulmonary embolism (PE). Clinicians currently rely on D-Dimer levels in conjunction with clinical prediction scores to rule out and diagnose PE. However, patients with COVID-19 (the disease caused by SARS-CoV-2) often present with elevated D-Dimer levels. D-Dimer levels in COVID-19 patients have been found to be positively correlated with the severity of disease. Symptoms of COVID-19 also often align with symptoms of PE. Therefore, it becomes more difficult for clinicians to identify which COVID-19 positive patients should undergo further testing for PE. This study evaluates …


Postpandemic Outlook For Organized Criminal Activities: Agility Across The Physical, Social, And Cyber Spaces, Jim Jones, Anthony Stefanidis Jan 2022

Arts & Sciences Book Chapters

The global COVID-19 pandemic and response affected every aspect of our society, including the activities of criminal organizations. In this chapter, we discuss several examples of criminal organization agility during the pandemic, drawn from the physical, social, and cyber domains. We assess that criminal organizations are emerging from the pandemic stronger than before, the pandemic presents a unique opportunity to study criminal organization agility, and criminal organizations are more exposed after their pandemic-driven adjustments. We also assess that this adjusted criminal activity and other factors, including risky operations that expose discoverable data, create investigative opportunities that will enable a deeper …


Toponym-Assisted Map Georeferencing: Evaluating The Use Of Toponyms For The Digitization Of Map Collections, Karim Bahgat, Daniel Runfola Nov 2021

Arts & Sciences Articles

A great deal of information is contained within archival maps—ranging from historic political boundaries, to mineral resources, to the locations of cultural landmarks. There are many ongoing efforts to preserve and digitize historic maps so that the information contained within them can be stored and analyzed efficiently. A major barrier to such map digitizing efforts is that the geographic location of each map is typically unknown and must be determined through an often slow and manual process known as georeferencing. To mitigate the time costs associated with the georeferencing process, this paper introduces a fully automated method based on map …


Predicting Road Quality Using High Resolution Satellite Imagery: A Transfer Learning Approach, Ethan Brewer, Jason Lin, Peter Kemper, John Hennin, Daniel Runfola Jul 2021

Arts & Sciences Articles

Recognizing the importance of road infrastructure to promote human health and economic development, actors around the globe are regularly investing in both new roads and road improvements. However, in many contexts there is a sparsity—or complete lack—of accurate information regarding existing road infrastructure, challenging the effective identification of where investments should be made. Previous literature has focused on overcoming this gap through the use of satellite imagery to detect and map roads. In this piece, we extend this literature by leveraging satellite imagery to estimate road quality and concomitant information about travel speed. We adopt a transfer learning approach in …


Molecular Cluster Fragment Machine Learning Training Techniques To Predict Energetics Of Brown Carbon Aerosol Clusters, Emily E. Chappie May 2021

Undergraduate Honors Theses

Density functional theory (DFT) has become a popular method for computational work involving larger molecular systems as it provides accuracy that rivals ab initio methods while lowering computational cost. Nevertheless, computational cost is still high for systems greater than ten atoms in size, preventing their application in modeling realistic atmospheric systems at the molecular level. Machine learning techniques, however, show promise as cost-effective tools in predicting chemical properties when properly trained. In the interest of furthering chemical machine learning in the field of atmospheric science, I have developed a training method for predicting cluster energetics of newly characterized nitrogen-based brown …


SCOPE: Building And Testing An Integrated Manual-Automated Event Extraction Tool For Online Text-Based Media Sources, Matthew Crittenden May 2021

Undergraduate Honors Theses

Building on insights from two years of manually extracting events information from online news media, an interactive information extraction environment (IIEE) was developed. SCOPE, the Scientific Collection of Open-source Policy Evidence, is a Python Django-based tool divided across specialized modules for extracting structured events data from unstructured text. These modules are grouped into a flexible framework which enables the user to tailor the tool to meet their needs. Following principles of user-oriented learning for information extraction (IE), SCOPE offers an alternative approach to developing AI-assisted IE systems. In this piece, we detail the ongoing development of the SCOPE tool, present …


A Convolutional Neural Network Approach To Predict Non-Permissive Environments From Moderate-Resolution Imagery, Seth Goodman, Ariel Benyishay, Daniel Runfola Jul 2020

Arts & Sciences Articles

Convolutional neural networks (CNNs) trained with satellite imagery have been successfully used to generate measures of development indicators, such as poverty, in developing nations. This article explores a CNN-based approach leveraging Landsat 8 imagery to predict locations of conflict-related deaths. Using Nigeria as a case study, we use the Armed Conflict Location & Event Data (ACLED) dataset to identify locations of conflict events that did or did not result in a death. Imagery for each location is used as an input to train a CNN to distinguish fatal from non-fatal events. Using 2014 imagery, we are able to predict the …


Crowdsourcing Street View Imagery: A Comparison Of Mapillary And OpenStreetCam, Ron Mahabir, Ross Schuchard, Andrew Crooks, Arie Croitoru, Anthony Stefanidis May 2020

Arts & Sciences Articles

Over the last decade, Volunteered Geographic Information (VGI) has emerged as a viable source of information on cities. During this time, the nature of VGI has been evolving, with new types and sources of data continually being added. In light of this trend, this paper explores one such type of VGI data: Volunteered Street View Imagery (VSVI). Two VSVI sources, Mapillary and OpenStreetCam, were extracted and analyzed to study road coverage and contribution patterns for four US metropolitan areas. Results show that coverage patterns vary across sites, with most contributions occurring along local roads and in populated areas. We also …


geoBoundaries: A Global Database Of Political Administrative Boundaries, Daniel Runfola, Austin Anderson, Heather Baier, Matt Crittenden, Elizabeth Dowker, Seth Goodman, Et Al. Apr 2020

Arts & Sciences Articles

We present the geoBoundaries Global Administrative Database (geoBoundaries): an online, open license resource of the geographic boundaries of political administrative divisions (i.e., state, county). In contrast to other resources, geoBoundaries (1) provides detailed information on the legal open license for every boundary in the repository, and (2) focuses on provisioning highly precise boundary data to support accurate, replicable scientific inquiry. Further, all data is released in a structured form, allowing for the integration of geoBoundaries with large-scale computational workflows. Our database has records for every country around the world, with up to 5 levels of administrative hierarchy. The database is accessible …


GIS & Middle Earth, Robert A. Rose Jan 2020

Arts & Sciences Open Educational Resources

Did Frodo take the best path to destroy the One Ring?

With the right spatial data layers and utilizing the power of a geographic information system, a least cost path analysis could reveal whether there was a better route that Frodo could have taken from the Shire to Mount Doom to destroy the ring of power.

After several years of development, the Center for Geospatial Analysis at William & Mary has developed an extensive list of GIS layers of Middle Earth, including a 50 meter elevation model, roads, rivers, realms and many others. These data formed the basis for an …


Generating A Close-To-Reality Synthetic Population Of Ghana, Tyler Frazier, Andreas Alfons Jan 2012

Arts & Sciences Articles

The purpose of this research is to generate a close-to-reality synthetic human population for use in a geosimulation of urban dynamics. Two commonly accepted approaches to generating synthetic human populations are Iterative Proportional Fitting (IPF) and Resampling with Replacement. While these methods are effective at reproducing one instance of the probability model describing the survey, it is an instance with extremely small variability amongst subgroups and is very unlikely to be the real population. IPF and Resampling with Replacement also rely on pure replication of units from the underlying sample which can increase unrealistic model behavior. In this work we …