Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

11,701 Full-Text Articles 18,348 Authors 6,525,218 Downloads 279 Institutions

All Articles in Statistics and Probability

Stressor: An R Package For Benchmarking Machine Learning Models, Samuel A. Haycock 2023 Utah State University

All Graduate Theses and Dissertations

Many discipline-specific researchers need a way to quickly compare the accuracy of their predictive models against alternatives. However, many of these researchers are not experienced with multiple programming languages. Python has recently been the leader in machine learning functionality, which includes the PyCaret library that allows users to develop high-performing machine learning models with only a few lines of code. The goal of the stressor package is to help users of the R programming language access the advantages of PyCaret without having to learn Python. This allows the user to leverage R’s powerful data analysis workflows, while simultaneously …


Statistical Graph Quality Analysis Of Utah State University Master Of Science Thesis Reports, Ragan Astle 2023 Utah State University

All Graduate Theses and Dissertations

Graphical software packages have become increasingly popular in our modern world, but there are concerns within the statistical visualization field about the default settings provided by these packages, which can make it challenging to create good quality graphs that align with standard graph principles. In this thesis, we investigate whether the quality of graphs from Utah State University (USU) Plan A Master of Science (MS) thesis reports from the years 1930 to 2019 was affected by the rise of graphical software packages. We collected all data stored on the USU Digital Commons website since November 2021 to determine the specific …


Using Natural Language Processing To Quantify The Efficacy Of Language Simplification As A Communication Strategy, Brian Nalley 2023 Utah State University

All Graduate Theses and Dissertations

People with communication disorders often experience difficulties being understood by unfamiliar listeners or in noisy environments. A common strategy for effectively communicating in these scenarios is to use simpler and more predictable language. Despite the prevalence of this strategy, there has been little to no research to date focused on the effectiveness of language simplification as a communication strategy. This study seeks to begin filling that gap by using natural language processing to determine whether speakers with early-stage Parkinson’s disease and age-matched neurotypical speakers are able to successfully simplify their language while still maintaining the original message.

Simplification was measured …


An Interval-Valued Random Forests, Paul Gaona Partida 2023 Utah State University

All Graduate Theses and Dissertations

There is a growing demand for the development of new statistical models and the refinement of established methods to accommodate different data structures. This need arises from the recognition that traditional statistics often assume the value of each observation to be precise, which may not hold true in many real-world scenarios. Factors such as the collection process and technological advancements can introduce imprecision and uncertainty into the data.

For example, consider data collected over a long period of time, where newer measurement tools may offer greater accuracy and provide more information than previous methods. In such cases, it becomes crucial …


Sentiment Analysis Before And During The Covid-19 Pandemic, Emily Musgrove 2023 Ursinus College

Mathematics Summer Fellows

This study examines the change in connotative language use before and during the Covid-19 pandemic. By analyzing news articles from several major US newspapers, we found that there is a statistically significant correlation between the sentiment of the text and the publication period. Specifically, we document a large, systematic, and statistically significant decline in the overall sentiment of articles published in major news outlets. While our results do not directly gauge the sentiment of the population, our findings have important implications regarding the social responsibility of journalists and media outlets especially in times of crisis.


A Multivariate Investigation Of The Motivational, Academic, And Well-Being Characteristics Of First-Generation And Continuing-Generation College Students, Christopher L. Thomas, Staci Zolkoski 2023 The University of Texas at Tyler

Journal of Research Initiatives

Prior research has noted differences in motivational, academic, and well-being factors between first-generation and continuing-generation students. However, past investigations have primarily overlooked the interactive influence of protective and risk factors when comparing the characteristics of first-generation and continuing-generation students. Thus, the current study adopted a multivariate approach to gain a more nuanced understanding of the influence of generational status on students' self-regulated learning capabilities, academic anxiety, sense of belonging, academic barriers, mental health concerns, and satisfaction with life. University students (N = 432, 67.46% Caucasian, 87.55% female, Age = 28.10 ± 9.46) completed the Cognitive Test Anxiety Scale-2nd …


A Comparison Of Confidence Intervals In State Space Models, Jinyu Du 2023 Southern Methodist University

Statistical Science Theses and Dissertations

This thesis develops general procedures for constructing confidence intervals (CIs) of the error disturbance parameters (standard deviations) and transformations of the error disturbance parameters in time-invariant state space models (ssm). With only a set of observations, estimating individual error disturbance parameters accurately in the presence of other unknown parameters in ssm is a very challenging problem. We attempted to construct four different types of confidence intervals, Wald, likelihood ratio, score, and higher-order asymptotic intervals for both the simple local level model and the general time-invariant state space models (ssm). We show that for a simple local level model, both the …


Optimal Experimental Planning Of Reliability Experiments Based On Coherent Systems, Yang Yu 2023 Southern Methodist University

Statistical Science Theses and Dissertations

In industrial engineering and manufacturing, assessing the reliability of a product or system is an important topic. Life-testing and reliability experiments are commonly used reliability assessment methods to gain sound knowledge about product or system lifetime distributions. Usually, a sample of items of interest is subjected to stresses and environmental conditions that characterize the normal operating conditions. During the life-test, successive times to failure are recorded and lifetime data are collected. Life-testing is useful in many industrial environments, including the automobile, materials, telecommunications, and electronics industries.

There are different kinds of life-testing experiments that can be applied for different purposes. …


On Image Response Regression With High-Dimensional Data, Noah Fuerth 2023 University of Windsor

Major Papers

A recent issue in statistical analysis is modelling data when the effect variable changes at different locations. This can be difficult to accomplish when the dimension of the covariates is very high, and when the domain of the varying coefficient functions of predictors is not necessarily regular. This research paper will investigate a method to overcome these challenges by approximating the varying coefficient functions using bivariate splines. We do this by splitting the domain of the varying coefficient functions into a number of triangles and building the bivariate spline functions based on this triangulation. This major paper will outline detailed …


On Maximum Likelihood Estimators For A Jump-Type Affine Diffusion Two-Factor Model, Jiaming Yin Mr. 2023 University of Windsor

Major Papers

We consider a jump-type two-factor affine diffusion model driven by a subordinator in the context of continuous time observations. We study the asymptotic properties of the maximum likelihood estimator (MLE) for the drift parameters. In particular, we prove the strong consistency and the asymptotic normality of MLE in the subcritical case. We also present some numerical illustrations to confirm the theoretical results. The main difficulty of this major paper consists in proving the ergodicity of the model in the subcritical case and deriving the limiting behavior of the process.


Addressing The Impact Of Time-Dependent Social Groupings On Animal Survival And Recapture Rates In Mark-Recapture Studies, Alexandru M. Draghici 2023 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Mark-recapture (MR) models typically assume that individuals under study have independent survival and recapture outcomes. One such model of interest is known as the Cormack-Jolly-Seber (CJS) model. In this dissertation, we conduct three major research projects focused on studying the impact of violating the independence assumption in MR models along with presenting extensions which relax the independence assumption. In the first project, we conduct a simulation study to address the impact of failing to account for pair-bonded animals having correlated recapture and survival fates on the CJS model. We examined the impact of correlation on the likelihood ratio test (LRT), …


A Characterization Of Complex-Valued Random Variables With Rotationally-Invariant Moments, Michael L. Maiello 2023 Texas A&M, College Station

Rose-Hulman Undergraduate Mathematics Journal

A complex-valued random variable Z is rotationally invariant if the moments of Z are the same as the moments of W=e^{i*theta}Z. In the first part of the article, we characterize such random variables, in terms of "vanishing unbalanced moments," moment and cumulant generating functions, and polar decomposition. In the second part, we consider random variables whose moments are not necessarily finite, but which have a density. In this setting, we prove two characterizations that are equivalent to rotational invariance, one involving polar decomposition, and the other involving entropy. If a random variable has both a density and moments which determine …


Creating Regression Model For Non-Markov Transition Probability Using Pseudo-Observations, Michael Gray 2023 Portland State University

Dissertations and Theses

A multi-state model is a graphical tool widely used to illustrate a transitional relationship between states in many applications. We will study the transition probabilities of an illness-death model, which is an example of a multi-state model. We will investigate transition probabilities using a counting process approach. The Aalen-Johansen estimator is the gold standard for estimating a transition probability. However, it may be biased, and therefore unreliable, when the Markov assumption is violated. Several papers have published non-parametric estimators that accommodate non-Markov models using a counting process approach.

Furthermore, there …


Identifying Advantages To Teaching Linear Regression In A Modeling And Simulation Introductory Statistics Curriculum, Kit Harris Clement 2023 Portland State University

Dissertations and Theses

Statistical association is a key facet of statistical literacy: claims based on relationships between variables or ideas rooted in data are found everywhere in media and discourse. A key development in introductory statistics curricula is the use of simulation-based inference, which has shown positive outcomes for students, especially with regard to statistical literacy and conceptual understanding. In this dissertation project, I investigate students from the Change Agents for the Teaching and Learning of STatistics (CATALST) curriculum in activities I designed for learning statistical association and linear regression. First, I analyzed the informal line fitting strategies of CATALST students. Findings suggest …


Modeling And A Domain Decomposition Method With Finite Element Discretization For Coupled Dual-Porosity Flow And Navier–Stokes Flow, Jiangyong Hou, Dan Hu, Xuejian Li, Xiaoming He 2023 Missouri University of Science and Technology

Mathematics and Statistics Faculty Research & Creative Works

In this paper, we first propose and analyze a steady state dual-porosity-Navier–Stokes model, which describes both dual-porosity flow and free flow (governed by Navier–Stokes equation) coupled through four interface conditions, including the Beavers–Joseph interface condition. Then we propose a domain decomposition method for efficiently solving such a large complex system. Robin boundary conditions are used to decouple the dual-porosity equations from the Navier–Stokes equations in the coupled system. Based on the two decoupled sub-problems, a parallel Robin-Robin domain decomposition method is constructed and then discretized by finite elements. We analyze the convergence of the domain decomposition method with the finite …


Statistical And Biological Analyses Of Acoustic Signals In Estrildid Finches, Moises Rivera 2023 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

Acoustic communication is a process that involves auditory perception and signal processing. Discrimination and recognition further require cognitive processes and supporting mechanisms in order to successfully identify and appropriately respond to signal senders. Although acoustic communication is common across birds, classical research has largely disregarded the perceptual abilities of perinatal altricial taxa. Chapter 1 reviews the literature of perinatal acoustic stimulation in birds, highlighting the disproportionate focus on precocial birds (e.g., chickens, ducks, quails). The long-held belief that altricial birds were incapable of acoustic perception in ovo was only recently overturned, as researchers began to find behavioral and physiological evidence …


(R2051) Analysis Of Map/Ph1, Ph2/2 Queueing Model With Working Breakdown, Repairs, Optional Service, And Balking, G. Ayyappan, G. Archana 2023 Puducherry Technological University

Applications and Applied Mathematics: An International Journal (AAM)

In this paper, a classical queueing system with two types of heterogeneous servers is considered. The Markovian Arrival Process (MAP) is used for customer arrivals, while the phase-type distribution (PH) is used for the service offered to customers as well as for the repair time of servers. Optional service is provided by the servers to unsatisfied customers. Server-2 may break down during the busy period of any type of service. Even when broken down, server-2 is able to provide service at a slower rate to the current customer who is receiving service …


(R2053) Analysis Of Map/Ph/1 Queueing Model Subject To Two-Stage Vacation Policy With Imperfect Service, Setup Time, Breakdown, Delay Time, Phase Type Repair And Reneging Customer, N. Arulmozhi 2023 Puducherry Technological University

Applications and Applied Mathematics: An International Journal (AAM)

In this paper, we study a continuous-time single server queueing system with infinite system capacity, a two-stage vacation policy with imperfect service, setup, breakdown, delay time, phase-type repair and customer reneging. The Markovian Arrival Process is used for customer arrivals and the phase-type distribution is used for service. The vacation policy has two stages: a single working vacation and multiple vacations. The Matrix-Analytic Method is used to obtain the invariant probability vector for this model. The busy period, the waiting time distribution and a cost analysis are additional results. The …


(R2025) Improving The Lda Linear Discriminant Analysis Method By Eliminating Redundant Variables For The Diagnosis Of Covid-19 Patients, Kianoush Fathi Vajargah, Hamid Mottaghi Golshan, Fazel Badakhshan Farahabadi 2023 Islamic Azad University

Applications and Applied Mathematics: An International Journal (AAM)

Nowadays, with the increase in data production speed, data analysis faces many problems because big data is often accompanied by plug-in data and redundant data. Therefore, the use of dimension reduction methods before the data analysis stage is necessary. In data mining, dimension reduction is one of the most important steps in data pre-processing. Principal component analysis (PCA) and linear discriminant analysis (LDA) are often used to reduce dimensions in data mining. LDA is a supervised method, whereas PCA is unsupervised. When the number of samples in classes is …


Population Modeling With Machine Learning Can Enhance Measures Of Mental Health - Open-Data Replication, Ty Easley, Ruiqi Chen, Kayla Hannon, Rosie Dutt, Janine Bijsterbosch 2023 Washington University School of Medicine in St. Louis

Statistical and Data Sciences: Faculty Publications

Efforts to predict trait phenotypes based on functional MRI data from large cohorts have been hampered by low prediction accuracy and/or small effect sizes. Although these findings are highly replicable, the small effect sizes are somewhat surprising given the presumed brain basis of phenotypic traits such as neuroticism and fluid intelligence. We aim to replicate previous work and additionally test multiple data manipulations that may improve prediction accuracy by addressing data pollution challenges. Specifically, we added additional fMRI features, averaged the target phenotype across multiple measurements to obtain more accurate estimates of the underlying trait, balanced the target phenotype's distribution …

