Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

12,628 Full-Text Articles 19,901 Authors 6,908,637 Downloads 286 Institutions

All Articles in Statistics and Probability

Action Plan: Gym Cleanliness At The Jaeger Center, Blair A. O'Connor 2024 Gettysburg College

CAFE Symposium 2024

I have created an action plan to assess current patrons' satisfaction with the cleaning materials provided at the Gettysburg College Jaeger Center, and increase the amount or variety if the need is there. Due to a combination of behaviors and bacteria in the Jaeger Center, gym users are at risk of contracting infections. The objective of this plan is for gym users to feel more empowered and safe in their environment. While there may be individuals who feel like increased disinfecting efforts and supplies are not necessary at the Jaeger Center, what may not be a concern for one person …


Predicting Crop Yield Using Remote Sensing Data, Mary Row, Jung-Han Kimn, Hossein Moradi 2024 Saint Mary's University of Minnesota

SDSU Data Science Symposium

Accurate crop yield predictions can help farmers adjust their farming practices to optimize their harvest. Remote sensing is an inexpensive approach to collecting massive amounts of data that could be utilized for predicting crop yield. This study employed linear regression and spatial linear models to predict soybean yield with data from Landsat 8 OLI. Each model was built using only spectral bands of the satellite, only vegetation indices, and both spectral bands and vegetation indices. All analysis was based on data collected from two fields in South Dakota from the 2019 and 2021 …


Principal Component Analysis With Application To Credit Card Data, Eleanor Cain, Semhar Michael, Gary Hatfield 2024 South Dakota State University

SDSU Data Science Symposium

Principal Component Analysis (PCA) is a type of dimension reduction technique used in data analysis to process the data before making a model. In general, dimension reduction allows analysts to make conclusions about large data sets by reducing the number of variables while retaining as much information as possible. Using the numerical variables from a data set, PCA aims to compute a smaller set of uncorrelated variables, called principal components, that account for a majority of the variability from the data. The purpose of this poster is to understand PCA as well as perform PCA on a large sample credit …
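
The idea described in this abstract, replacing many numerical variables with a few uncorrelated principal components that capture most of the variability, can be sketched in a few lines. This is only an illustrative sketch, not the poster's analysis: the data are synthetic, and the function name `pca` and all variable names are assumptions introduced here. It assumes NumPy is available and computes the components from the singular value decomposition of the centered data.

```python
import numpy as np

def pca(X, n_components):
    """Illustrative PCA via SVD (hypothetical helper, not from the poster)."""
    # Center each column so the components describe variation around the mean.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Variance carried by each direction, as a share of the total variance.
    explained_variance = S ** 2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    components = Vt[:n_components]
    # Project the centered data onto the leading directions.
    scores = Xc @ components.T
    return scores, components, ratio[:n_components]

# Synthetic data (an assumption): 5 observed variables driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

scores, components, ratio = pca(X, n_components=2)
print("share of variance per component:", ratio)
```

The columns of `scores` are uncorrelated by construction, which is the property the abstract refers to; how many components to keep is a judgment call that this sketch does not address.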


Session 6: Model-Based Clustering Analysis On The Spatial-Temporal And Intensity Patterns Of Tornadoes, Yana Melnykov, Yingying Zhang, Rong Zheng 2024 University of Alabama - Tuscaloosa

SDSU Data Science Symposium

Tornadoes are among nature’s most violent windstorms and can occur all over the world except Antarctica. Previous scientific efforts have studied this natural hazard from facets such as genesis, dynamics, detection, forecasting, warning, measuring, and assessing. We instead want to model the tornado datasets using modern, sophisticated statistical and computational techniques. The goal of the paper is to develop novel finite mixture models and perform clustering analysis on the spatial-temporal and intensity patterns of the tornadoes. To analyze the tornado dataset, we first try a Gaussian distribution with the mean vector and variance-covariance matrix represented as …


Session 6: The Size-Biased Lognormal Mixture With The Entropy Regularized Algorithm, Tatjana Miljkovic, Taehan Bae 2024 Miami University - Oxford

SDSU Data Science Symposium

A size-biased left-truncated Lognormal (SB-ltLN) mixture is proposed as a robust alternative to the Erlang mixture for modeling left-truncated insurance losses with a heavy tail. The weak denseness property of the weighted Lognormal mixture is studied along with the tail behavior. Explicit analytical solutions are derived for moments and Tail Value at Risk based on the proposed model. An extension of the regularized expectation–maximization (REM) algorithm with Shannon's entropy weights (ewREM) is introduced for parameter estimation and variability assessment. The left-truncated internal fraud data set from the Operational Riskdata eXchange is used to illustrate applications of the proposed model. Finally, …


Making Sense Of Making Parole In New York, Alexandra McGlinchy 2024 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

For many individuals incarcerated in New York, the initial step toward freedom begins with an interview with the Board of Parole. This process, however, is frequently a complex and challenging one, characterized by repeated denials and extended incarcerations. The disparity in outcomes – where one individual may receive over 20 denials and another is granted parole on their first attempt – highlights the ambiguity and inconsistency in the parole decision-making process. This project aims to clarify the factors that influence parole decisions by concentrating on measurable variables. These include age, race, duration of sentence served, proportion of sentence served, type …


Modeling Of Covid-19 Clinical Outcomes In Mexico: An Analysis Of Demographic, Clinical, And Chronic Disease Factors, Livia Clarete 2024 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

This study explores COVID-19 clinical outcomes in Mexico, focusing on demographic, clinical, and chronic disease variables to develop predictive models. In the binary classification task, the Ada Boost Classifier distinguishes survivors from non-survivors, with age, sex, ethnicity, and chronic medical conditions influencing outcomes. In multiclass classification, the Gradient Boosting Classifier categorizes patients into outcome groups.

Demographic variables, especially age, are crucial for predicting COVID-19 outcomes for both the binary and multiclass classification tasks. Clinical information about previous conditions, including chronic diseases, also holds relevance, especially diabetes, immunocompromise, and cardiovascular diseases. These insights inform public health measures and healthcare strategies, emphasizing …


A Causal Inference Approach For Spike Train Interactions, Zach Saccomano 2024 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Since the 1960s, neuroscientists have worked on the problem of estimating synaptic properties, such as connectivity and strength, from simultaneously recorded spike trains. Recent years have seen renewed interest in the problem coinciding with rapid advances in experimental technologies, including an approximate exponential increase in the number of neurons that can be recorded in parallel and perturbation techniques such as optogenetics that can be used to calibrate and validate causal hypotheses about functional connectivity. This thesis presents a mathematical examination of synaptic inference from two perspectives: (1) using in vivo data and biophysical models, we ask in what cases the …


Model Selection Through Cross-Validation For Supervised Learning Tasks With Manifold Data, Derek Brown 2024 Purdue University Fort Wayne

The Journal of Purdue Undergraduate Research

No abstract provided.


Sensitivity Analysis Of Prior Distributions In Regression Model Estimation, Ayoade I Adewole, Oluwatoyin K. Bodunwa 2024 Department of Mathematics, Tai Solarin University of Education, Ijagun, Ogun State, Nigeria

Al-Bahir Journal for Engineering and Pure Sciences

Bayesian inference depends solely on the specification and accuracy of the likelihood and the prior distributions for the observed data. This research examined Bayesian estimation of regression models as a way to reduce some of the problems posed by conventional methods of estimating regression models, such as handling complex models, small sample sizes, and the inclusion of background information in the estimation procedure. Posterior distributions are based on the prior distributions and the accuracy of the data, a fundamental principle of Bayesian statistics for producing accurate final model estimates. Sensitivity analysis is an essential part of mathematical model validation in obtaining …


Statistical Consulting In Academia: A Review, Ke Xiao 2024 University of Windsor

Major Papers

This paper reviews the state of statistical consulting in academia by performing a literature review on this topic in chapters 1 and 2. Chapter 1 overviews general aspects of statistical consulting and types of centers that conduct such services in academia. In Chapter 2 we summarise the literature about the common logistics and processes for conducting statistical consulting in academia. In Chapters 3 and 4, we analyze data on statistical consulting centers for the largest 100 universities in the USA. We also review the literature on the future of statistical consulting in academia in the era of big data and …


Time Scale Theory On Stability Of Explicit And Implicit Discrete Epidemic Models: Applications To Swine Flu Outbreak, Gülşah Yeni, Elvan Akın, Naveen K. Vaidya 2024 Missouri University of Science and Technology

Mathematics and Statistics Faculty Research & Creative Works

Time scales theory has been in use since the 1980s with many applications. Only very recently has it been used to describe within-host and between-hosts dynamics of infectious diseases. In this study, we present explicit and implicit discrete epidemic models motivated by the time scales modeling approach. We use these models to formulate the basic reproduction number, which determines whether an outbreak occurs or the disease dies out. We discuss the stability of the disease-free and endemic equilibrium points using the linearization method and Lyapunov function. Furthermore, we apply our models to swine flu outbreak data to demonstrate that the …


Formulating An Efficient Statistical Test Using The Goodness Of Fit Approach With Applications To Real-Life Data, S. A. Qaid, S. E. Abo Youssef Prof., Mahmoud Mansour 2024 Department of Mathematics, Faculty of Education, Abyan University, Abyan, Yemen

Basic Science Engineering

Statistical tests are very important for researchers to make decisions. Non-parametric tests are of particular importance because they can be applied to a wide range of data sets without knowledge of the distribution of the data. Researchers are therefore racing to obtain efficient tests for making good decisions based on the results of these tests. In this study, NBU (2)L was used based on the goodness of fit approach to present an efficient statistical test. The efficiency of the proposed test was computed, and the results were compared to those of other tests. Critical …


On A Multivalued Prescribed Mean Curvature Problem And Inclusions Defined On Dual Spaces, Vy Khoi Le 2024 Missouri University of Science and Technology

Mathematics and Statistics Faculty Research & Creative Works

This article addresses two main objectives. First, it establishes a functional analytic framework and presents existence results for a quasilinear inclusion describing a prescribed mean curvature problem with homogeneous Dirichlet boundary conditions, involving a multivalued lower order term. The formulation of the problem is done in the space of functions with bounded variation. The second objective is to introduce a general existence theory for inclusions defined on nonreflexive Banach spaces, which is specifically applicable to the aforementioned prescribed mean curvature problem. This problem can be formulated as a multivalued variational inequality in the space of functions with bounded variation, which, …


Machine Learning Approaches For Cyberbullying Detection, Roland Fiagbe 2024 University of Central Florida

Data Science and Data Mining

Cyberbullying refers to the act of bullying using electronic means and the internet. In recent years, this act has been identified as a major problem among young people and even adults. It can negatively impact one’s emotions and lead to adverse outcomes like depression, anxiety, harassment, and suicide, among others. This has led to the need to employ machine learning techniques to automatically detect cyberbullying and prevent it on various social media platforms. In this study, we want to analyze the combination of some Natural Language Processing (NLP) algorithms (such as Bag-of-Words and TF-IDF) with some popular machine learning …


Predicting Superconducting Critical Temperature Using Regression Analysis, Roland Fiagbe 2024 University of Central Florida

Data Science and Data Mining

This project estimates a regression model to predict the superconducting critical temperature based on variables extracted from the superconductor’s chemical formula. The regression model, combined with stepwise variable selection, gives a reasonable predictive model with a lower prediction error (MSE). Variables extracted based on atomic radius, valence, atomic mass and thermal conductivity appeared to contribute most to the predictive model.
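
As a rough illustration of regression with stepwise variable selection judged by MSE, the sketch below runs a forward selection on synthetic data. It is not the project's model or data: the abstract does not state which stepwise criterion was used, so selecting by mean squared error on a held-out validation split is an assumption, as are the helper names `design`, `fit_ols`, `mse`, and `forward_stepwise`. NumPy is assumed to be available.

```python
import numpy as np

def design(X, cols):
    """Design matrix: an intercept column plus the chosen predictor columns."""
    return np.column_stack([np.ones(X.shape[0]), X[:, cols]])

def fit_ols(X, y, cols):
    """Ordinary least squares coefficients for the chosen columns."""
    beta, *_ = np.linalg.lstsq(design(X, cols), y, rcond=None)
    return beta

def mse(X, y, cols, beta):
    """Mean squared prediction error of a fitted model on (X, y)."""
    resid = y - design(X, cols) @ beta
    return float(np.mean(resid ** 2))

def forward_stepwise(X_tr, y_tr, X_va, y_va):
    """Add one predictor at a time while validation MSE keeps improving."""
    selected, remaining = [], list(range(X_tr.shape[1]))
    best = mse(X_va, y_va, selected, fit_ols(X_tr, y_tr, selected))
    while remaining:
        # Validation MSE obtained by adding each remaining candidate.
        trial = {
            j: mse(X_va, y_va, selected + [j], fit_ols(X_tr, y_tr, selected + [j]))
            for j in remaining
        }
        j_best = min(trial, key=trial.get)
        if trial[j_best] >= best:
            break  # no candidate lowers the validation error
        selected.append(j_best)
        remaining.remove(j_best)
        best = trial[j_best]
    return selected, best

# Synthetic data (an assumption): only columns 0 and 2 influence the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.normal(size=300)

selected, best_mse = forward_stepwise(X[:200], y[:200], X[200:], y[200:])
print("selected columns:", selected, "validation MSE:", best_mse)
```

Other stepwise criteria, such as AIC or p-value thresholds, would change which variables enter the model; the validation-MSE rule is used here only because it keeps the sketch short.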


A Bayesian Inversion For Emissions And Export Productivity Across The End-Cretaceous Boundary, Alexander A. Cox 2024 Dartmouth College

Dartmouth College Master’s Theses

The end-Cretaceous mass extinction was marked by both the Chicxulub impact and the ongoing emplacement of the Deccan Traps flood basalt province. Both of these events perturbed the environment by the emission of climate-active volatiles, primarily CO2 and SO2. To understand the mechanism of extinction, we must disentangle the timing, duration, and intensity of volcanic and meteoritic environmental forcings. In this thesis, we used a parallel Markov chain Monte Carlo approach to invert for the aforementioned volatile emissions, export productivity, and remineralization from 67 to 65 million years ago using the LOSCAR (Long-term Ocean-atmosphere-Sediment CArbon cycle Reservoir) model. The parallel …


Multiple Imputation For Robust Cluster Analysis To Address Missingness In Medical Data, Arnold Harder, Gayla R. Olbricht, Godwin Ekuma, Daniel B. Hier, Tayo Obafemi-Ajayi 2024 Missouri University of Science and Technology

Mathematics and Statistics Faculty Research & Creative Works

Cluster analysis has been applied to a wide range of problems as an exploratory tool to enhance knowledge discovery. Clustering aids disease subtyping, i.e. identifying homogeneous patient subgroups, in medical data. Missing data is a common problem in medical research and could bias clustering results if not properly handled. Yet, multiple imputation has been under-utilized to address missingness, when clustering medical data. Its limited integration in clustering of medical data, despite the known advantages and benefits of multiple imputation, could be attributed to many factors. This includes methodological complexity, difficulties in pooling results to obtain a consensus clustering, uncertainty regarding …


Open Diameter Maps On Suspensions, Hussam Abobaker, Włodzimierz J. Charatonik, Robert Paul Roe 2024 Missouri University of Science and Technology

Mathematics and Statistics Faculty Research & Creative Works

It is shown that if X is a metric continuum that admits an open diameter map, then the suspension of X admits an open diameter map. As a corollary, we have that all spheres admit open diameter maps.


Effects Of Voice Pitch On Social Perceptions Vary With Relational Mobility And Homicide Rate, Toe Aung, et al. 2024 Singapore Management University

Research Collection School of Social Sciences

Fundamental frequency (fo) is the most perceptually salient vocal acoustic parameter, yet little is known about how its perceptual influence varies across societies. We examined how fo affects key social perceptions and how socioecological variables modulate these effects in 2,647 adult listeners sampled from 44 locations across 22 nations. Low male fo increased men’s perceptions of formidability and prestige, especially in societies with higher homicide rates and greater relational mobility in which male intrasexual competition may be more intense and rapid identification of high-status competitors may be exigent. High female fo increased women’s perceptions of flirtatiousness where relational mobility was …

