Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistics and Probability

Articles 1 - 30 of 10762

Full-Text Articles in Physical Sciences and Mathematics

A Latent Spatial Piecewise Exponential Model For Interval-Censored Disease Surveillance Data With Time-Varying Covariates And Misclassification, Yaxuan Sun, Chong Wang, William Q. Meeker, Max Morris, Marisa L. Rotolo, Jeffery Zimmerman Jan 2019

Veterinary Diagnostic and Production Animal Medicine Publications

Understanding the dynamics of disease spread is critical to achieving effective animal disease surveillance. A major challenge in modeling disease spread is the fact that the true disease status cannot be known with certainty due to the imperfect diagnostic sensitivity and specificity of the tests used to generate the disease surveillance data. Other challenges in modeling such data include interval censoring, relating disease spread to distance between units, and incorporating time-varying covariates, which are the unobserved disease statuses. We propose a latent spatial piecewise exponential model (PEX) with misclassification of events to address the challenges in modeling such disease surveillance ...


Application Of Bradford’s Law Of Scattering On Research Publication In Astronomy & Astrophysics Of India, Satish Kumar, Senthilkumar R. Dec 2018

Library Philosophy and Practice (e-journal)

The present study examines the application of Bradford’s law of scattering to research articles published in the field of Astronomy & Astrophysics by Indian scientists during 1988-2017. The bibliographic data were retrieved from the Web of Science (WoS) bibliographic database for different periods of time. A total of 18,877 journal articles were published by Indian scientists in the field of Astronomy & Astrophysics during 1988-2017; these were retrieved and analyzed separately for each 10-year block as well as for the consolidated 30-year period. The core journal of the field was identified. The Bradford law ...


A Proficient Two-Stage Stratified Randomized Response Strategy, Tanveer A. Tarray, Housila P. Singh Dec 2018

Journal of Modern Applied Statistical Methods

A stratified randomized response model based on the improved two-stage randomized response strategy of R. Singh, Singh, Mangat, and Tracy (1995) is proposed. It has an optimal allocation and large gain in precision. Conditions are obtained under which the proposed model is more efficient than the R. Singh et al. (1995) and H. P. Singh and Tarray (2015) models. Numerical illustrations are also given in support of the present study.


Extended Method For Several Dichotomous Covariates To Estimate The Instantaneous Risk Function Of The Aalen Additive Model, Luciane Teixeira Passos Giarola, Mario Javier Ferrua Vivanco, Marcelo Angelo Cirillo, Fortunato Silva Menezes Dec 2018

Journal of Modern Applied Statistical Methods

The instantaneous risk function of Aalen’s model is estimated considering dichotomous covariates, using parametric accumulated risk functions to smooth cumulative risk of Aalen by grouping the individuals into sets named parcels. This methodology can be used for data with dichotomous covariates.


Simple Unbalanced Ranked Set Sampling For Mean Estimation Of Response Variable Of Developmental Programs, Girish Chandra, Dinesh S. Bhoj, Rajiv Pandey Dec 2018

Journal of Modern Applied Statistical Methods

An unbalanced ranked set sampling (RSS) procedure on the skewed survey variable is proposed to estimate the population mean of a response variable from the area of developmental programs which are generally implemented under different phases. It is based on the unbalanced RSS under linear impacts of the program and is compared with the estimators based on simple random sampling (SRS) and balanced RSS. It is shown that the relative precision of the proposed estimator is higher than those of the estimators based on SRS and balanced RSS for three chosen skewed distributions of survey variables.


Role Of Misclassification Estimates In Estimating Disease Prevalence And A Non-Linear Approach To Study Synchrony Using Heart Rate Variability In Chickens, Dola Pathak Dec 2018

Dissertations and Theses in Statistics

Infectious disease assays can be imperfect. When estimating disease prevalence, these imperfections are accounted for by incorporating assay sensitivity and specificity into point and variance estimates. Unfortunately, these accuracy measures are often treated as fixed constants, rather than acknowledging that they are estimates from an assay validation process. The purpose of this study is to show the detrimental effect of not taking into account this sampling variability when samples are obtained through group testing (aka, pooled testing). We show that confidence interval coverage can dramatically decline as the sample size increases for the main sample of interest. As a remedy ...


Comparing Performance Of Gene Set Test Methods Using Biologically Relevant Simulated Data, Richard M. Lambert Dec 2018

All Graduate Theses and Dissertations

Today we know that there are many genetically driven diseases and health conditions. These problems often manifest only when a set of genes is either active or inactive. Recent technology allows us to measure the activity level of genes in cells, which we call gene expression. It is of great interest to society to be able to statistically compare the gene expression of a large number of genes between two or more groups. For example, we may want to compare the gene expression of a group of cancer patients with a group of non-cancer patients to better understand the genetic ...


Sequential Inference For Hidden Markov Models, Michael Ellis Dec 2018

Theses and Dissertations

In many applications data are collected sequentially in time with very short time intervals between observations. If one is interested in using new observations as they arrive in time then non-sequential Bayesian inference methods, such as Markov Chain Monte Carlo (MCMC) sampling, can be too slow. Increasingly, state space models are being used to model nonlinear and non-Gaussian systems. The structure of state space models allows for sequential Bayesian inference so that an approximation to the posterior distribution of interest can be updated as new observations arrive. In special cases, the exact posterior distribution can be updated through conjugate Bayesian ...


Rfviz: An Interactive Visualization Package For Random Forests In R, Christopher Beckett Dec 2018

All Graduate Plan B and other Reports

Random forests are very popular tools for predictive analysis and data science. They work for both classification (where there is a categorical response variable) and regression (where the response is continuous). Random forests provide proximities, and both local and global measures of variable importance. However, these quantities require special tools to be effectively used to interpret the forest. Rfviz is a sophisticated interactive visualization package and toolkit in R, specially designed for interpreting the results of a random forest in a user-friendly way. Rfviz uses a recently developed R package (loon) from the Comprehensive R Archive Network (CRAN) to create ...


Budget-Constrained Regression Model Selection Using Mixed Integer Nonlinear Programming, Jingying Zhang Dec 2018

Theses and Dissertations

Regression analysis fits predictive models to data on a response variable and corresponding values for a set of explanatory variables. Often data on the explanatory variables come at a cost from commercial databases, so the available budget may limit which ones are used in the final model.

In this dissertation, two budget-constrained regression models are proposed for continuous and categorical variables respectively using Mixed Integer Nonlinear Programming (MINLP) to choose the explanatory variables to be included in solutions. First, we propose a budget-constrained linear regression model for continuous response variables. Properties such as solvability and global optimality of the proposed ...


Concentrations Of Criteria Pollutants In The Contiguous U.S., 1979 – 2015: Role Of Model Parsimony In Integrated Empirical Geographic Regression, Sun-Young Kim, Matthew Bechle, Steve Hankey, Elizabeth (Lianne) A. Sheppard, Adam A. Szpiro, Julian D. Marshall Nov 2018

UW Biostatistics Working Paper Series

BACKGROUND: National- or regional-scale prediction models that estimate individual-level air pollution concentrations commonly include hundreds of geographic variables. However, these many variables may not be necessary, and a parsimonious approach including a small number of variables may achieve sufficient prediction ability. This parsimonious approach can also be applied to most criteria pollutants. This approach will be powerful when generating publicly available datasets of model predictions that support research in environmental health and other fields. OBJECTIVES: We aim to (1) build annual-average integrated empirical geographic (IEG) regression models for the contiguous U.S. for six criteria pollutants, for all years with regulatory monitoring ...


Stochastic Lanczos Likelihood Estimation Of Genomic Variance Components, Richard Border Nov 2018

Applied Mathematics Graduate Theses & Dissertations

Genomic variance components analysis seeks to estimate the extent to which interindividual variation in a given trait can be attributed to genetic similarity. Likelihood estimation of such models involves computationally expensive operations on large, dense, and unstructured matrices of high rank. As a result, standard estimation procedures relying on direct matrix methods become prohibitively expensive as sample sizes increase. We propose a novel estimation procedure that uses the Lanczos process and stochastic Lanczos quadrature to approximate the likelihood for an initial choice of parameter values. Then, by identifying the variance components parameter space with a family of shifted linear systems ...


Seasonal Warranty Prediction Based On Recurrent Event Data, Qianqian Shan, Yili Hong, William Q. Meeker Jr. Nov 2018

Statistics Preprints

Warranty return data from repairable systems, such as vehicles, usually result in recurrent event data. The non-homogeneous Poisson process (NHPP) model is used widely to describe such data. Seasonality in the repair frequencies and other variabilities, however, complicate the modeling of recurrent event data. Not much work has been done to address the seasonality, and this paper provides a general approach for the application of NHPP models with dynamic covariates to predict seasonal warranty returns. A hierarchical clustering method is used to stratify the population into groups that are more homogeneous than the overall population. The stratification facilitates ...


A Data Set Of Bloodstain Patterns For Teaching And Research In Bloodstain Pattern Analysis: Gunshot Backspatters, Daniel Attinger, Yu Liu, Ricky Faflak, Yalin Rao, Bryce A. Struttman, Kris De Brabanter, Patrick M. Comiskey, Alex L. Yarin Nov 2018

Mechanical Engineering Publications

This is a data set of blood spatter patterns scanned at high resolution, generated in controlled experiments. The spatter patterns were generated with a rifle or a handgun, and different ammunitions. The resulting atomized blood droplets travelled opposite to the bullet direction, generating a gunshot backspatter on a poster board target sheet. Fresh blood with anticoagulants was used; its hematocrit and temperature were measured. Main parameters of the study were the bullet shape, size and speed, and the distance between the blood source and target sheet. Several other parameters were explored in a less systematic way. This new and original ...


Dynamics Of Paramagnetic And Ferromagnetic Ellipsoidal Particles In Shear Flow Under A Uniform Magnetic Field, Christopher A. Sobecki, Jie Zhang, Yanzhi Zhang, Cheng Wang Nov 2018

Yanzhi Zhang

We investigate the two-dimensional dynamic motion of magnetic particles of ellipsoidal shapes in shear flow under the influence of a uniform magnetic field. In the first part, we present a theoretical analysis of the rotational dynamics of the particles in simple shear flow. By considering paramagnetic and ferromagnetic particles, we study the effects of the direction and strength of the magnetic field on the particle rotation. The critical magnetic-field strength, at which particle rotation is impeded, is determined. In a weak-field regime (i.e., below the critical strength) where the particles execute complete rotations, the symmetry property of the rotational ...


Decoupled, Linear, And Energy Stable Finite Element Method For The Cahn-Hilliard-Navier-Stokes-Darcy Phase Field Model, Yali Gao, Xiaoming He, Liquan Mei, Xiaofeng Yang Nov 2018

Xiaoming He

In this paper, we consider the numerical approximation for a phase field model of the coupled two-phase free flow and two-phase porous media flow. This model consists of Cahn-Hilliard-Navier-Stokes equations in the free flow region and Cahn-Hilliard-Darcy equations in the porous media region that are coupled by seven interface conditions. The coupled system is decoupled based on the interface conditions and the solution values on the interface from the previous time step. A fully discretized scheme with finite elements for the spatial discretization is developed to solve the decoupled system. In order to deal with ...


The Impact Of Sample Size In Cross-Classified Multiple Membership Multilevel Models, Hyewon Chung, Jiseon Kim, Ryoungsun Park, Hyeonjeong Jean Nov 2018

Journal of Modern Applied Statistical Methods

A simulation study was conducted to examine parameter recovery in a cross-classified multiple membership multilevel model. No substantial relative bias was identified for the fixed effect or level-one variance component estimates. However, the level-two cross-classification multiple membership factor variance components were substantially biased with relatively fewer groups.


An Introduction To Psychological Statistics, Garett C. Foster, David Lane, David Scott, Mikki Hebl, Rudy Guerra, Dan Osherson, Heidi Zimmer Nov 2018

Open Educational Resources Collection

We are constantly bombarded by information, and finding a way to filter that information in an objective way is crucial to surviving this onslaught with your sanity intact. This is what statistics, and the logic we use in it, enables us to do. Through the lens of statistics, we learn to find the signal hidden in the noise when it is there and to know when an apparent trend or pattern is really just randomness. The study of statistics involves math and relies upon calculations of numbers. But it also relies heavily on how the numbers are chosen and how the ...


Analysis Of Ranked Gene Tree Probability Distributions Under The Coalescent Process For Detecting Anomaly Zones, Anastasiia Kim Nov 2018

Shared Knowledge Conference

In phylogenetic studies, gene trees are used to reconstruct species trees. Under the multispecies coalescent model, gene tree topologies may differ from that of the species tree. An incorrect gene tree topology (one that does not match the species tree) that is more probable than the correct one is termed an anomalous gene tree (AGT). Species trees that can generate such AGTs are said to be in the anomaly zone (AZ). In this region, the method of choosing the most common gene tree as the estimate of the species tree will be inconsistent and will converge to an incorrect species tree when ...


Genome-Wide Analysis Of Alternative RNA Splicing In Children With Acute Myeloid Leukemia (AML), Xichen Li Nov 2018

Shared Knowledge Conference

Pediatric acute myeloid leukemia (AML) is a high-risk and hard-to-treat childhood cancer that originates in the bone marrow from immature white blood cells. Recently, more and more evidence indicates that aberrant splicing of genes is a common characteristic of AML. Gene expression profiles have proved extremely useful for identifying genes that are associated with clinical characteristics and survival outcome of cancer patients. However, conventional gene expression profiles do not account for the differences observed in expressed isoforms when alternative RNA splicing is analyzed. Alternative RNA splicing can generate dozens of distinct transcripts from individual genes and the expressions of ...


Estimators Comparison Of Separable Covariance Structure With One Component As Compound Symmetry Matrix, Katarzyna Filipiak, Daniel Klein, Monika Mokrzycka Nov 2018

Electronic Journal of Linear Algebra

The maximum likelihood estimation (MLE) of separable covariance structure with one component as compound symmetry matrix has been widely studied in the literature. Nevertheless, the proposed estimates are not given in explicit form and can be determined only numerically. In this paper we give an alternative form of MLE and we show that this new algorithm is much quicker than the algorithms given in the literature. Another estimator of covariance structure can be found by minimizing the entropy loss function. In this paper we give three methods of finding the best approximation of separable covariance structure with one component as ...


41 - Data Exploration And Analysis For The Hemingway Measure Of Adult Connectedness, Gildardo Bautista-Maya, Ping Ye, Diane Cook Nov 2018

Georgia Undergraduate Research Conference (GURC)

We analyze the dataset collected from students participating in the Boy With A Ball (BWAB) program, a faith-based community outreach group, through the Hemingway Measure of Adult Connectedness©, a questionnaire measuring the social connectedness of adolescents. First, we approach the data in the conventional method provided by the Hemingway website. We then identify which questions are strong determiners in deciding whether a student has completed the BWAB program or not. With the goal of utilizing the logistic regression, we reduce the set of questions to those only identified as significant in other methods. These methods include linear regression, decision ...


Preface, Weixing Song Nov 2018

Conference on Applied Statistics in Agriculture

Preface


Statistical Investigation Of Road And Railway Hazardous Materials Transportation Safety, Amirfarrokh Iranitalab Nov 2018

Civil Engineering Theses, Dissertations, and Student Research

Transportation of hazardous materials (hazmat) in the United States (U.S.) constituted 22.8% of the total tonnage transported in 2012 with an estimated value of more than 2.3 billion dollars. As such, hazmat transportation is a significant economic activity in the U.S. However, hazmat transportation exposes people and the environment to the infrequent but potentially severe consequences of incidents resulting in hazmat release. Trucks and trains carried 63.7% of the hazmat in the U.S. in 2012 and are the major foci of this dissertation. The main research objectives were 1) identification and quantification of the effects ...


Probabilities Involving Standard Trirectangular Tetrahedral Dice Rolls, Rulon Olmstead, Doneliezer Baize Oct 2018

Rose-Hulman Undergraduate Mathematics Journal

The goal is to calculate probabilities involving rolls of irregularly shaped dice. We attempt to model the probabilities of rolling standard tri-rectangular tetrahedral dice on a hard surface, such as a table top. The vertices and edges of a tetrahedron were projected onto the surface of a sphere centered at the center of mass of the tetrahedron. By calculating the surface areas bounded by the resultant geodesics, baseline probabilities were achieved. Using a 3D printer, dice were constructed of uniform density and the results of rolling them were recorded. After calculating the corresponding confidence intervals, the ...


Analysis Of Covariance (ANCOVA) In Randomized Trials: More Precision, Less Conditional Bias, And Valid Confidence Intervals, Without Model Assumptions, Bingkai Wang, Elizabeth Ogburn, Michael Rosenblum Oct 2018

Johns Hopkins University, Dept. of Biostatistics Working Papers

“Covariate adjustment” in the randomized trial context refers to an estimator of the average treatment effect that adjusts for chance imbalances between study arms in baseline variables (called “covariates”). The baseline variables could include, e.g., age, sex, disease severity, and biomarkers. According to two surveys of clinical trial reports, there is confusion about the statistical properties of covariate adjustment. We focus on the ANCOVA estimator, which involves fitting a linear model for the outcome given the treatment arm and baseline variables, and trials with equal probability of assignment to treatment and control. We prove the following new (to the ...


Group-Lasso Estimation In High-Dimensional Factor Models With Structural Breaks, Yujie Song Oct 2018

Major Papers

In this major paper, we study the influence of structural breaks in the financial market model with high-dimensional data. We present a model which is capable of detecting changes in factor loadings, determining the number of factors and detecting the break date. We consider the case where the break date is both known and unknown and identify the type of instability. For the unknown break date case, we propose a group-LASSO estimator to determine the number of pre- and post-break factors, the break date and the existence of instability of factor loadings when the number of factors is constant. We ...


Estimation In High-Dimensional Factor Models With Structural Instabilities, Wen Gao Oct 2018

Major Papers

In this major paper, we use high-dimensional models to analyze macroeconomic data that are influenced by a break point. In particular, we consider detecting the break point and study the changes in the number of factors and the factor loadings under structural instability.

Concretely, we propose two factor models which explain the processes of the pre- and post-break periods. Then, we consider the break point as known or unknown. In both situations, we derive the shrinkage estimators by minimizing the penalized least squares function and calculate the estimators of the numbers of pre- and post-break factors ...


On Projection Of A Positive Definite Matrix On A Cone Of Nonnegative Definite Toeplitz Matrices, Katarzyna Filipiak, Augustyn Markiewicz, Adam Mieldzioc, Aneta Sawikowska Oct 2018

Electronic Journal of Linear Algebra

We consider approximation of a given positive definite matrix by nonnegative definite banded Toeplitz matrices. We show that the projection on the linear space of Toeplitz matrices does not always preserve nonnegative definiteness. Therefore we characterize a convex cone of nonnegative definite banded Toeplitz matrices which depends on the matrix dimensions, and we show that the condition of positive definiteness given by Parter [Numer. Math. 4, 293-295, 1962] characterizes the asymptotic cone. In this paper we give a methodology and a numerical algorithm for the projection based on the properties of a cone of nonnegative definite Toeplitz matrices. This problem can be ...


Using Cyclical Components To Improve The Forecasts Of The Stock Market And Macroeconomic Variables, Kenneth R. Szulczyk, Shibley Sadique Oct 2018

Journal of Modern Applied Statistical Methods

Economic variables such as stock market indices, interest rates, and national output measures contain cyclical components. Forecasting methods excluding these cyclical components yield inaccurate out-of-sample forecasts. Accordingly, a three-stage procedure is developed to estimate a vector autoregression (VAR) with cyclical components. A Monte Carlo simulation shows the procedure estimates the parameters accurately. Subsequently, a VAR with cyclical components improves the root-mean-square error of out-of-sample forecasts by 50% for a stock market model with macroeconomic variables.