"A Comparison Of Variable Selection Methods Using Bootstrap Samples From Environmental Metal Mixture Data", 2020 University Of New Mexico

#### "A Comparison Of Variable Selection Methods Using Bootstrap Samples From Environmental Metal Mixture Data", Paul-Yvann Djamen

*Mathematics & Statistics ETDs*

In this thesis, I studied a newly developed variable selection method, SODA, and three customarily used variable selection methods (LASSO, elastic net, and random forest) for environmental mixture data. The motivating datasets have neuro-developmental status as responses and metal measurements and demographic variables as covariates. The challenges for variable selection include (1) many measured metal concentrations are highly correlated, (2) there are many possible ways of modeling interactions among the metals, (3) the relationships between the outcomes and explanatory variables are possibly nonlinear, and (4) the signal-to-noise ratio in the real data may be low. To compare these methods ...

Comparison Of Scale Identification Methods In Mixture Irt Models, 2020 University of Alabama

#### Comparison Of Scale Identification Methods In Mixture Irt Models, Youn-Jeng Choi, Allan S. Cohen

*Journal of Modern Applied Statistical Methods*

The effects of three scale identification constraints in mixture IRT models were studied. A simulation study found no constraint effect on the mixture Rasch and mixture 2PL models, but the item anchoring constraint was the only one that worked well for selecting the correct model with the mixture 3PL model.

On Variable Selections In High-Dimensional Incomplete Data, 2020 Department of Mathematics and Statistics

Keywords: high-dimensional data; missing value; variable selection; missForest; self-training selection; random lasso; stability selection; Meta-analysis

#### On Variable Selections In High-Dimensional Incomplete Data, Tao Sun

*Major Papers*

Modern statistics has entered the era of Big Data, wherein data sets are too large, high-dimensional, incomplete, and complex for most classical statistical methods. This analysis of Big Data first focuses on missing data. We compare different multiple imputation methods. Combining the characteristics of medical high-throughput experiments, we compare multivariate imputation by chained equations (MICE), missing forest (missForest), and self-training selection (STS) methods. A phenotypic data set of common lung disease was assessed. Moreover, in terms of improving the interpretability and predictability of the model, variable selection plays a pivotal role in the following analysis. Taking the Lasso-Poisson ...

A Note On Inferences About The Probability Of Success, 2020 University of Southern California

#### A Note On Inferences About The Probability Of Success, Rand Wilcox

*Journal of Modern Applied Statistical Methods*

There is an extensive literature dealing with inferences about the probability of success. A minor goal in this note is to point out when certain recommended methods can be unsatisfactory when the sample size is small. The main goal is to report results on the two-sample case. Extant results suggest using one of four methods. The results indicate that, when computing a 0.95 confidence interval, two of these methods can be more satisfactory when dealing with small sample sizes.

At The Interface Of Algebra And Statistics, 2020 The Graduate Center, City University of New York

#### At The Interface Of Algebra And Statistics, Tai-Danae Bradley

*All Dissertations, Theses, and Capstone Projects*

This thesis takes inspiration from quantum physics to investigate mathematical structure that lies at the interface of algebra and statistics. The starting point is a passage from classical probability theory to quantum probability theory. The quantum version of a probability distribution is a density operator, the quantum version of marginalizing is an operation called the partial trace, and the quantum version of a marginal probability distribution is a reduced density operator. Every joint probability distribution on a finite set can be modeled as a rank one density operator. By applying the partial trace, we obtain reduced density operators whose diagonals ...
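The construction summarized in this abstract can be sketched numerically. The example below is an illustration only and is not taken from the thesis: the 2 x 3 joint distribution and the names `joint`, `psi`, `rho`, and `reduced_x` are hypothetical, and the choice of square-root amplitudes is an assumption about how the rank-one operator is built.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) on a 2 x 3 finite set (entries sum to 1).
joint = np.array([[0.10, 0.20, 0.10],
                  [0.30, 0.05, 0.25]])

# Assumed construction: a unit vector with amplitudes sqrt(p(x, y)).
# The projector rho = |psi><psi| is then a rank-one density operator.
psi = np.sqrt(joint).reshape(-1)
rho = np.outer(psi, psi)

# Partial trace over the second factor: view rho as a tensor with
# indices (x, y, x', y') and sum over the entries with y == y'.
rho4 = rho.reshape(2, 3, 2, 3)
reduced_x = np.einsum("iaja->ij", rho4)

# Classical marginal of x, for comparison with the diagonal of reduced_x.
marginal_x = joint.sum(axis=1)

print(np.diag(reduced_x))
print(marginal_x)
```

In this sketch the diagonal of `reduced_x` agrees with the classical marginal of `x`, while the off-diagonal entries carry additional information that plain marginalization discards.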

Research In Short Term Actuarial Modeling, 2020 California State University, San Bernardino

#### Research In Short Term Actuarial Modeling, Elijah Howells

*Electronic Theses, Projects, and Dissertations*

This paper covers mathematical methods used to conduct actuarial analysis in the short term, such as policy deductible analysis, maximum covered loss analysis, and mixtures of distributions. Assessment of a loss variable's distribution under the effect of a policy deductible, as well as one with an implemented maximum covered loss, and under both a policy deductible and maximum covered loss will also be covered. The derivation, meaning, and use of cost per loss and cost per payment will be discussed, as will those of an aggregate sum distribution, stop loss policy, and maximum likelihood estimation. For each topic, special ...
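As a rough illustration of the cost-per-loss and cost-per-payment quantities mentioned in this abstract, the sketch below assumes an exponentially distributed loss with mean `theta`, an ordinary deductible `d`, and a maximum covered loss `u`. The distribution, the parameter values, and the function names are hypothetical choices made for the example and do not come from the paper.

```python
import math

def limited_expected_value_exp(theta, limit):
    """E[min(X, limit)] for an exponential loss X with mean theta."""
    return theta * (1.0 - math.exp(-limit / theta))

def cost_per_loss(theta, d, u):
    """Expected payment per loss: E[min(X, u)] - E[min(X, d)]."""
    return limited_expected_value_exp(theta, u) - limited_expected_value_exp(theta, d)

def cost_per_payment(theta, d, u):
    """Expected payment given that a payment is made, i.e. given X > d."""
    prob_payment = math.exp(-d / theta)  # P(X > d) for the exponential
    return cost_per_loss(theta, d, u) / prob_payment

# Hypothetical parameters: mean loss 1000, deductible 500, maximum covered loss 5000.
theta, d, u = 1000.0, 500.0, 5000.0
per_loss = cost_per_loss(theta, d, u)
per_payment = cost_per_payment(theta, d, u)
print(round(per_loss, 2), round(per_payment, 2))
```

The cost per payment exceeds the cost per loss because it conditions on the loss exceeding the deductible; losses below `d` contribute zero to the per-loss average but are excluded from the per-payment average.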

Comparing Means Under Heteroscedasticity And Nonnormality: Further Exploring Robust Means Modeling, 2020 York University, Toronto

#### Comparing Means Under Heteroscedasticity And Nonnormality: Further Exploring Robust Means Modeling, Alyssa Counsell, Robert Philip Chalmers, Robert A. Cribbie

*Journal of Modern Applied Statistical Methods*

Comparing the means of independent groups is a concern when the assumptions of normality and variance homogeneity are violated. Robust means modeling (RMM) was proposed as an alternative to ANOVA-type procedures when the assumptions of normality and variance homogeneity are violated. The purpose of this study is to compare the Type I error and power rates of RMM to the trimmed Welch procedure. A Monte Carlo study was used to investigate RMM and the trimmed Welch procedure under several conditions of nonnormality and variance heterogeneity. The results suggest that the trimmed Welch provides a better balance of Type I error ...

Inferences About The Probability Of Success, Given The Value Of A Covariate, Using A Nonparametric Smoother, 2020 University of Southern California

#### Inferences About The Probability Of Success, Given The Value Of A Covariate, Using A Nonparametric Smoother, Rand Wilcox

*Journal of Modern Applied Statistical Methods*

For a binary random variable *Y*, let *p*(*x*) = *P*(*Y* = 1 | *X* = *x*) for some covariate *X*. The goal of computing a confidence interval for *p*(*x*) is considered. In the logistic regression model, even a slight departure that is difficult to detect via a goodness-of-fit test can yield inaccurate results. The accuracy of a confidence interval can deteriorate as the sample size increases. The goal is to suggest an alternative approach based on a smoother, which provides a more flexible approximation of *p*(*x*).

Integrated Multiple Mediation Analysis: A Robustness–Specificity Trade-Off In Causal Structure, 2020 Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan.

#### Integrated Multiple Mediation Analysis: A Robustness–Specificity Trade-Off In Causal Structure, An-Shun Tai, Sheng-Hsuan Lin

*Harvard University Biostatistics Working Paper Series*

Recent methodological developments in causal mediation analysis have addressed several issues regarding multiple mediators. However, these developed methods differ in their definitions of causal parameters, assumptions for identification, and interpretations of causal effects, making it unclear which method ought to be selected when investigating a given causal effect. Thus, in this study, we construct an integrated framework, which unifies all existing methodologies, as a standard for mediation analysis with multiple mediators. To clarify the relationship between existing methods, we propose four strategies for effect decomposition: two-way, partially forward, partially backward, and complete decompositions. This study reveals how the direct and ...

Waiting-Time Paradox In 1922, 2020 University at Buffalo

#### Waiting-Time Paradox In 1922, Naoki Masuda, Takayuki Hiraoka

*Northeast Journal of Complex Systems (NEJCS)*

We present an English translation and discussion of an essay that a Japanese physicist, Torahiko Terada, wrote in 1922. In the essay, he described the waiting-time paradox, also called the bus paradox, which is a known mathematical phenomenon in queuing theory, stochastic processes, and modern temporal network analysis. He also observed and analyzed data on Tokyo City trams to verify the relevance of the waiting-time paradox to busy passengers in Tokyo at the time. This essay seems to be one of the earliest documentations of the waiting-time paradox in a sufficiently scientific manner.
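The paradox described in this abstract can be illustrated with a short simulation. This is a sketch under stated assumptions and not a reconstruction of Terada's tram data: the 10-minute mean gap, the exponential gap distribution, and the helper `mean_wait` are all hypothetical. It relies on the standard result that a passenger arriving at a uniformly random time waits on average E[T²] / (2 E[T]), where T is the gap between consecutive vehicles.

```python
import numpy as np

def mean_wait(gaps):
    """Average wait of a passenger who arrives at a uniformly random time."""
    gaps = np.asarray(gaps, dtype=float)
    # A gap of length g is hit with probability proportional to g,
    # and the average wait inside that gap is g / 2.
    return np.sum(gaps ** 2) / (2.0 * np.sum(gaps))

rng = np.random.default_rng(0)

# Both schedules have a mean gap of 10 minutes between vehicles.
regular = np.full(100_000, 10.0)                       # perfectly even spacing
irregular = rng.exponential(scale=10.0, size=100_000)  # highly irregular spacing

wait_regular = mean_wait(regular)
wait_irregular = mean_wait(irregular)
print(wait_regular, wait_irregular)
```

With even spacing the average wait is half the gap, but with irregular spacing it is longer, because a passenger is more likely to arrive during a long gap than a short one.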

Metabolomic Profiling Of Nicotiana Spp. Nectars Indicate That Pollinator Feeding Preference Is A Stronger Determinant Than Plant Phylogenetics In Shaping Nectar Diversity, 2020 Iowa State University

#### Metabolomic Profiling Of Nicotiana Spp. Nectars Indicate That Pollinator Feeding Preference Is A Stronger Determinant Than Plant Phylogenetics In Shaping Nectar Diversity, Fredy A. Silva, Elizabeth C. Chatt, Siti-Nabilla Mahalim, Adel Guirgis, Xingche Guo, Dan S. Nettleton, Basil J. Nikolau, Robert W. Thornburg

*Statistics Publications*

Floral nectar is a rich secretion produced by the nectary gland and is offered as reward to attract pollinators leading to improved seed set. Nectars are composed of a complex mixture of sugars, amino acids, proteins, vitamins, lipids, organic and inorganic acids. This composition is influenced by several factors, including floral morphology, mechanism of nectar secretion, time of flowering, and visitation by pollinators. The objective of this study was to determine the contributions of flowering time, plant phylogeny, and pollinator selection on nectar composition in Nicotiana. The main classes of nectar metabolites (sugars and amino acids) were quantified using gas ...

On Statistical Significance Of Discriminant Function Coefficients, 2020 University of Calgary

#### On Statistical Significance Of Discriminant Function Coefficients, Tolulope T. Sajobi, Gordon H. Fick, Lisa M. Lix

*Journal of Modern Applied Statistical Methods*

Discriminant function coefficients are useful for describing group differences and identifying variables that distinguish between groups. Test procedures based on asymptotic approximations, empirical distributions, and exact distributions were compared for testing hypotheses about discriminant function coefficients. These tests are useful for assessing variable importance in multivariate group designs.

Sensitivity Analysis For Incomplete Data And Causal Inference, 2020 Southern Methodist University

#### Sensitivity Analysis For Incomplete Data And Causal Inference, Heng Chen

*Statistical Science Theses and Dissertations*

In this dissertation, we explore sensitivity analyses under three different types of incomplete data problems: missing outcomes, missing outcomes and missing predictors, and potential outcomes in the *Rubin causal model (RCM)*. The first sensitivity analysis is conducted for the *missing completely at random (MCAR)* assumption in frequentist inference; the second is conducted for the *missing at random (MAR)* assumption in likelihood inference; the third is conducted for a novel assumption, the "sixth assumption", proposed for the robustness of the instrumental variable estimand in causal inference.

Statistical Models And Analysis Of Univariate And Multivariate Degradation Data, 2020 Southern Methodist University

#### Statistical Models And Analysis Of Univariate And Multivariate Degradation Data, Lochana Palayangoda

*Statistical Science Theses and Dissertations*

For degradation data in reliability analysis, estimation of the first-passage time (FPT) distribution to a threshold provides valuable information on reliability characteristics. Recently, Balakrishnan and Qin (2019; Applied Stochastic Models in Business and Industry, 35:571-590) studied a nonparametric method to approximate the FPT distribution of such degradation processes if the underlying process type is unknown. In this thesis, we propose improved techniques based on saddlepoint approximation, which improve upon their suggested methods. Numerical examples and Monte Carlo simulation studies are used to illustrate the advantages of the proposed techniques. Limitations of the improved techniques are discussed and some possible ...

Evaluation Of The Utility Of Informative Priors In Bayesian Structural Equation Modeling With Small Samples, 2020 Southern Methodist University

#### Evaluation Of The Utility Of Informative Priors In Bayesian Structural Equation Modeling With Small Samples, Hao Ma

*Department of Education Policy and Leadership Theses and Dissertations*

The estimation of parameters in structural equation modeling (SEM) has been primarily based on the maximum likelihood estimator (MLE) and relies on large sample asymptotic theory. Consequently, the results of the SEM analyses with small samples may not be as satisfactory as expected. In contrast, informative priors typically do not require a large sample, and they may be helpful for improving the quality of estimates in the SEM models with small samples. However, the role of informative priors in the Bayesian SEM has not been thoroughly studied to date. Given the limited body of evidence, specifying effective informative priors remains ...

Statistical Inference Of Adaptation At Multiple Genomic Scales Using Supervised Classification And A Hidden Markov Model, 2020 Duquesne University

#### Statistical Inference Of Adaptation At Multiple Genomic Scales Using Supervised Classification And A Hidden Markov Model, Lauren A. Sugden

*Biology and Medicine Through Mathematics Conference*

No abstract provided.

An Improved Two Independent-Samples Randomization Test For Single-Case Ab-Type Intervention Designs: A 20-Year Journey, 2020 University of Arizona

#### An Improved Two Independent-Samples Randomization Test For Single-Case Ab-Type Intervention Designs: A 20-Year Journey, Joel R. Levin, John M. Ferron, Boris S. Gafurov

*Journal of Modern Applied Statistical Methods*

Detailed is a 20-year arduous journey to develop a statistically viable two-phase (AB) single-case two independent-samples randomization test procedure. The test is designed to compare the effectiveness of two different interventions that are randomly assigned to cases. In contrast to the unsatisfactory simulation results produced by an earlier proposed randomization test, the present test consistently exhibited acceptable Type I error control under various design and effect-type configurations, while at the same time possessing adequate power to detect moderately sized intervention-difference effects. Selected issues, applications, and a multiple-baseline extension of the two-sample test are discussed.

Support Vector Machine-Based Modified Sp Statistic For Subset Selection With Non-Normal Error Terms, 2020 Department of Statistics, Gopal Krishna Gokhale College, Kolhapur (MS), India.

#### Support Vector Machine-Based Modified Sp Statistic For Subset Selection With Non-Normal Error Terms, Shivaji Shripati Desai, D N. Kashid

*Journal of Modern Applied Statistical Methods*

Support vector machine (SVM) is used for estimation of regression parameters to modify the sum of cross products (Sp). It works well for some nonnormal error distributions. The performance of existing robust methods and the modified Sp is evaluated through simulated and real data. The results show the performance of the modified Sp is good.

Recurrence Relations For Marginal And Joint Moment Generating Functions Of Topp-Leone Generated Exponential Distribution Based On Record Values And Its Characterization, 2020 Aligarh Muslim University

#### Recurrence Relations For Marginal And Joint Moment Generating Functions Of Topp-Leone Generated Exponential Distribution Based On Record Values And Its Characterization, Zaki Anwar, Neetu Gupta, Mohd Akram Raza Khan, Qazi Azhad Jamal

*Journal of Modern Applied Statistical Methods*

The exact expressions and some recurrence relations are derived for marginal and joint moment generating functions of *k*th lower record values from the Topp-Leone Generated (TLG) Exponential distribution. This distribution is characterized by using the recurrence relation of the marginal moment generating function of *k*th lower record values.

A New Exponential Approach For Reducing The Mean Squared Errors Of The Estimators Of Population Mean Using Conventional And Non-Conventional Location Parameters, 2020 Vikram University, Ujjain, India

#### A New Exponential Approach For Reducing The Mean Squared Errors Of The Estimators Of Population Mean Using Conventional And Non-Conventional Location Parameters, Housila P. Singh, Anita Yadav

*Journal of Modern Applied Statistical Methods*

Classes of ratio-type estimators *t* (say) and ratio-type exponential estimators *t*_{e} (say) of the population mean are proposed, and their biases and mean squared errors under large sample approximation are presented. It is shown that the class of ratio-type exponential estimators *t*_{e} provides estimators more efficient than the ratio-type estimators.