Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

10186 Full-Text Articles 13749 Authors 2331137 Downloads 183 Institutions

All Articles in Statistics and Probability

Iterative Matrix Factorization Method For Social Media Data Location Prediction, Natchanon Suaysom 2018 Harvey Mudd College

HMC Senior Theses

The locations attached to tweets collected by a social media company vary in accuracy, and some are missing. We want to use the tweets with the highest accuracy to help fill in the data for tweets with incomplete information. To test our algorithm, we used sets of social media data from a city, separated into training sets, where we know all the information, and testing sets, where we intentionally pretend not to know the location. One prediction method that was used in (Dukler, Han and Wang, 2016) requires appending one-hot ...


Quantifying Certainty: The P-Value, Dominic Klyve 2017 Central Washington University

Statistics and Probability

No abstract provided.


Optimized Adaptive Enrichment Designs For Multi-Arm Trials: Learning Which Subpopulations Benefit From Different Treatments, Jon Arni Steingrimsson, Joshua Betz, Tianchen Qian, Michael Rosenblum 2017 Department of Biostatistics, Brown School of Public Health

Johns Hopkins University, Dept. of Biostatistics Working Papers

We propose a class of adaptive randomized trial designs for comparing two treatments to a common control in two disjoint subpopulations. The type of adaptation, called adaptive enrichment, involves a preplanned rule for modifying enrollment and arm assignment based on accruing data in an ongoing trial. The motivation for this adaptive feature is that interim data may indicate that a subpopulation, such as those with lower disease severity at baseline, are unlikely to benefit from a particular treatment, while uncertainty remains for the other treatment and/or subpopulation. We developed a new multiple testing procedure tailored to this design problem ...


Comparison Of Adaptive Randomized Trial Designs For Time-To-Event Outcomes That Expand Versus Restrict Enrollment Criteria, To Test Non-Inferiority, Josh Betz, Jon Arni Steingrimsson, Tianchen Qian, Michael Rosenblum 2017 Johns Hopkins Bloomberg School of Public Health, Department of Biostatistics

Johns Hopkins University, Dept. of Biostatistics Working Papers

Adaptive enrichment designs involve preplanned rules for modifying patient enrollment criteria based on data accrued in an ongoing trial. These designs may be useful when it is suspected that a subpopulation, e.g., defined by a biomarker or risk score measured at baseline, may benefit more from treatment than the complementary subpopulation. We compare two types of such designs, for the case of two subpopulations that partition the overall population. The first type starts by enrolling the subpopulation where it is suspected the new treatment is most likely to work, and then may expand inclusion criteria if there is early ...


Using Ranked Auxiliary Covariate As A More Efficient Sampling Design For Ancova Model: Analysis Of A Psychological Intervention To Buttress Resilience, Rajai Jabrah, Hani Samawi, Robert Vogel, Haresh Rochani, Daniel Linder 2017 Georgia Southern University

Hani M. Samawi

Drawing a sample can be costly or time consuming in some studies. However, it may be possible to rank the sampling units according to some baseline auxiliary covariates, which are easily obtainable, and/or cost efficient. Ranked set sampling (RSS) is a method to achieve this goal. In this paper, we propose a modified approach of the RSS method to allocate units into an experimental study that compares L groups. Computer simulation estimates the empirical nominal values and the empirical power values for the test procedure of comparing L different groups using modified RSS based on the regression approach in ...


Inference On Overlapping Coefficients In Two Exponential Populations, Mohammad F. Al-Saleh, Hani M. Samawi 2017 Yarmouk University

Hani M. Samawi

Three measures of overlap, namely Matusita’s measure ρ, Morisita’s measure λ and Weitzman’s measure Δ, are investigated in this article for two exponential populations with different means. It is well known that the estimators of those measures of overlap are biased. The bias of these estimators depends on the unknown overlap parameters. There are no closed-form, exact formulas for the variances of those estimators or for their exact sampling distributions. Monte Carlo evaluations are used to study the bias and precision of the proposed overlap measures. Bootstrap method and Taylor series approximation are used to construct confidence intervals for the overlap ...
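
For orientation, the three overlap measures named in this abstract are commonly defined as below for two densities. The symbols f_1, f_2, \mu_1 and \mu_2 are notation assumed here for illustration and are not taken from the article. The closed forms shown for the exponential case follow from direct integration; Weitzman's measure also has a closed form for exponentials, but it is not reproduced here.

```latex
% Assumed notation: f_1, f_2 are the two densities on the common support.
\rho   = \int \sqrt{f_1(x)\, f_2(x)}\, dx                          % Matusita
\lambda = \frac{2 \int f_1(x)\, f_2(x)\, dx}
               {\int f_1(x)^2\, dx + \int f_2(x)^2\, dx}           % Morisita
\Delta = \int \min\{ f_1(x),\, f_2(x) \}\, dx                      % Weitzman

% Exponential case, f_i(x) = \mu_i^{-1} e^{-x/\mu_i}, x > 0 (means \mu_1, \mu_2):
\rho = \frac{2\sqrt{\mu_1 \mu_2}}{\mu_1 + \mu_2},
\qquad
\lambda = \frac{4 \mu_1 \mu_2}{(\mu_1 + \mu_2)^2} = \rho^2 .
```

All three measures equal 1 when the two densities coincide, which in the exponential case means \mu_1 = \mu_2.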


Evaluating The Efficiency Of Treatment Comparison In Crossover Design By Allocating Subjects Based On Ranked Auxiliary Variable, Yisong Huang, Hani Samawi, Robert Vogel, Jingjing Yin, Worlanyo E. Gato, Daniel Linder 2017 Georgia Southern University

Hani M. Samawi

The validity of statistical inference depends on proper randomization methods. However, even with proper randomization, we can have imbalance with respect to important characteristics. In this paper, we introduce a method based on ranked auxiliary variables for treatment allocation in crossover designs using Latin squares models. We evaluate the improvement of the efficiency in treatment comparisons using the proposed method. Our simulation study reveals that our proposed method provides a more powerful test compared to simple randomization with the same sample size. The proposed method is illustrated by conducting an experiment to compare two different concentrations of titanium dioxide nanofiber ...


Estimation Of P(X > Y) When X And Y Are Dependent Random Variables Using Different Bivariate Sampling Schemes, Hani M. Samawi, Amal Helu, Haresh Rochani, Jingjing Yin, Daniel Linder 2017 Georgia Southern University

Hani M. Samawi

The stress-strength models have been intensively investigated in the literature with regard to estimating the reliability θ = P(X > Y) using parametric and nonparametric approaches under different sampling schemes when X and Y are independent random variables. In this paper, we consider the problem of estimating θ when (X, Y) are dependent random variables with a bivariate underlying distribution. The empirical and kernel estimates of θ = P(X > Y), based on bivariate ranked set sampling (BVRSS), are considered when (X, Y) are paired dependent continuous random variables. The estimators obtained are compared to their counterpart, bivariate simple random sampling (BVSRS ...
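
As a rough illustration of the empirical estimate mentioned in this abstract, the sketch below computes the proportion of pairs with X > Y from a paired simple random sample. It is a minimal example under stated assumptions, not the authors' BVRSS estimator: the function name `empirical_theta` and the sample values are hypothetical, and the ranked-set version would weight or select observations differently.

```python
from typing import Sequence


def empirical_theta(x: Sequence[float], y: Sequence[float]) -> float:
    """Empirical estimate of theta = P(X > Y) from paired observations.

    Assumes (x[i], y[i]) are pairs drawn by simple random sampling;
    the estimate is the fraction of pairs in which x[i] exceeds y[i].
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    # Count the pairs where the first component is strictly larger.
    wins = sum(1 for xi, yi in zip(x, y) if xi > yi)
    return wins / len(x)


# Hypothetical paired sample, for illustration only.
theta_hat = empirical_theta([2.1, 0.5, 3.3, 1.8], [1.0, 0.9, 2.7, 2.5])
print(theta_hat)
```

Because the pairs are kept together, the estimate remains valid when X and Y are dependent, which is the setting the abstract describes.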


Correction Of Verification Bias Using Log-Linear Models For A Single Binary-Scale Diagnostic Tests, Haresh Rochani, Hani M. Samawi, Robert L. Vogel, Jingjing Yin 2017 Georgia Southern University

Hani M. Samawi

In diagnostic medicine, the test that determines the true disease status without an error is referred to as the gold standard. Even when a gold standard exists, it is extremely difficult to verify each patient due to issues of cost-effectiveness and the invasive nature of the procedures. In practice, some of the patients with test results are not selected for verification of the disease status, which results in verification bias for diagnostic tests. The ability of the diagnostic test to correctly identify the patients with and without the disease can be evaluated by measures such as sensitivity, specificity and predictive ...


Prevalence And Trends In Transmitted And Acquired Antiretroviral Drug Resistance, Washington, DC, 1999-2014, Annette M Aldous, Amanda D Castel, David M Parenti 2017 George Washington University

Epidemiology and Biostatistics Faculty Publications

Background

Drug resistance limits options for antiretroviral therapy (ART) and results in poorer health outcomes among HIV-infected persons. We sought to characterize resistance patterns and to identify predictors of resistance in Washington, DC.

Methods

We analyzed resistance in the DC Cohort, a longitudinal study of HIV-infected persons in care in Washington, DC. We measured cumulative drug resistance (CDR) among participants with any genotype between 1999 and 2014 (n = 3411), transmitted drug resistance (TDR) in ART-naïve persons (n = 1503), and acquired drug resistance (ADR) in persons with genotypes before and after ART initiation (n = 309). Using logistic regression, we assessed associations ...


The University Of Iowa Biomass Energy Sustainability Index: A Decision-Making Tool For The University Of Iowa Biomass Partnership Project, Liz Christiansen, Ingrid Gronstal Anderson, Ferman Milster, Sara Maples, Aaron Strong, Adam Ward, Eric Tate, Tyler Priest, Emily A. Heaton, Lisa A. Schulte Moore, Richard B. Hall, John Tyndall, Maeraj Hafiz Sheikh, Daryl Smith 2017 University of Iowa

Daryl Smith

Work continued on a plan to increase the renewable, sustainable fuel sources available to power operations at the University of Iowa in Iowa City. A team of researchers from multiple institutions collaborated to create a tool that would allow the UI to evaluate its alternative energy options more effectively.


A Hierarchical Bayesian Approach To Distinguishing Serial And Parallel Processing, Joseph W. Houpt, Mario Fifić 2017 Wright State University - Main Campus

Joseph W. Houpt

Research in cognitive psychology often focuses on how people deal with multiple sources of information. One important aspect of this research is whether people use the information in parallel (at the same time) or in series (one at a time). Various approaches to distinguishing parallel and serial processing have been proposed, but many do not satisfactorily address the mimicking dilemma between serial and parallel classes of models. The mean interaction contrast (MIC) is one measure designed to improve discriminability of serial-parallel model properties. The MIC has been applied in limited settings because the measure required a large number of ...


Nonparametric Variable Importance Assessment Using Machine Learning Techniques, Brian D. Williamson, Peter B. Gilbert, Noah Simon, Marco Carone 2017 Department of Biostatistics, University of Washington

UW Biostatistics Working Paper Series

In a regression setting, it is often of interest to quantify the importance of various features in predicting the response. Commonly, the variable importance measure used is determined by the regression technique employed. For this reason, practitioners often only resort to one of a few regression techniques for which a variable importance measure is naturally defined. Unfortunately, these regression techniques are often sub-optimal for predicting response. Additionally, because the variable importance measures native to different regression techniques generally have a different interpretation, comparisons across techniques can be difficult. In this work, we study a novel variable importance measure that can ...


Arca Controls Metabolism, Chemotaxis, And Motility Contributing To The Pathogenicity Of Avian Pathogenic Escherichia Coli, Fengwei Jiang, Chunxia An, Yinli Bao, Xuefeng Zhao, Robert L. Jernigan, Andrew Lithio, Dan Nettleton, Ling Li, Eve S. Wurtele, Lisa K. Nolan, Chengping Lu, Ganwu Li 2017 Nanjing Agricultural University

Robert Jernigan

Avian pathogenic Escherichia coli (APEC) strains cause one of the three most significant infectious diseases in the poultry industry and are also potential food-borne pathogens threatening human health. In this study, we showed that ArcA (aerobic respiratory control), a global regulator important for E. coli's adaptation from anaerobic to aerobic conditions and control of that bacterium's enzymatic defenses against reactive oxygen species (ROS), is involved in the virulence of APEC. Deletion of arcA significantly attenuates the virulence of APEC in the duck model. Transcriptome sequencing (RNA-Seq) analyses comparing the APEC wild type and the arcA mutant indicate that ...


Time Series Copulas For Heteroskedastic Data, Ruben Loaiza-Maya, Michael S. Smith, Worapree Maneesoonthorn 2017 Melbourne Business School

Michael Stanley Smith

We propose parametric copulas that capture serial dependence in stationary heteroskedastic time series. We develop our copula for first order Markov series, and extend it to higher orders and multivariate series. We derive the copula of a volatility proxy, based on which we propose new measures of volatility dependence, including co-movement and spillover in multivariate series. In general, these depend upon the marginal distributions of the series. Using exchange rate returns, we show that the resulting copula models can capture their marginal distributions more accurately than univariate and multivariate GARCH models, and produce more accurate value at risk forecasts.


Index Number Of Iowa Farm Products Prices, Gertrude M. Cox 2017 Iowa State College

Bulletin (Iowa Agricultural Experiment Station)

The present Iowa farm price index has been in use since 1926. It is widely employed as a measure of the general level of Iowa farm prices and appears each month in the price barometer published in Agricultural Economic Facts. A few years ago Peck developed a farm lease, known as the sliding scale lease, in which the rental payments are based on and vary with the changes in the index number. More recently, contracts covering land sales have been devised in which the interest payments and in some cases also the principal payments are based on this ...


Annuity Product Valuation And Risk Measurement Under Correlated Financial And Longevity Risks, Soohong Park 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Longevity risk is a non-diversifiable risk and is regarded as a pressing socio-economic challenge of the century. Its accurate assessment and quantification are therefore critical to enable pension-fund companies to provide sustainable old-age security and maintain a resilient global insurance market. Fluctuations and a decreasing trend in mortality rates, which give rise to longevity risk, as well as the uncertainty in interest-rate dynamics constitute the two fundamental determinants in pricing and risk management of longevity-dependent products. We also note that historical data reveal some evidence of strong correlation between mortality and interest rates, which must be taken into account when modelling their ...


Examination And Comparison Of The Performance Of Common Non-Parametric And Robust Regression Models, Gregory F. Malek 2017 Stephen F Austin State University

Electronic Theses and Dissertations

This work investigated common alternatives to the least-squares regression method in the presence of non-normally distributed errors. An initial literature review identified a variety of alternative methods, including Theil Regression, Wilcoxon Regression, Iteratively Re-Weighted Least Squares, Bounded-Influence Regression, and Bootstrapping methods. These methods were evaluated using a simple simulated example data set, as well as various real data sets, including math proficiency data, Belgian telephone ...
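
To make one of the listed alternatives concrete, the sketch below shows the Theil (Theil-Sen) slope estimate: the median of the slopes over all pairs of points. It is an illustrative sketch only, not the thesis author's code; the function name `theil_sen`, the choice of intercept (median of the residuals y - slope * x, one common convention) and the data are assumptions made here.

```python
from itertools import combinations
from statistics import median
from typing import Sequence, Tuple


def theil_sen(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of a Theil-Sen line fit.

    The slope is the median of all pairwise slopes; pairs with equal
    x values are skipped because their slope is undefined.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # Assumed convention: intercept is the median of y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical data: points on y = 2x + 1, with the last one a gross outlier.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 3, 5, 7, 9, 100]
slope, intercept = theil_sen(xs, ys)
print(slope, intercept)
```

In this toy data set the single outlier affects only 5 of the 15 pairwise slopes, so the median slope is unchanged, whereas a least-squares fit would be pulled strongly toward the outlier.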


The Soybean Rhg1 Locus For Resistance To The Soybean Cyst Nematode Heterodera Glycines Regulates The Expression Of A Large Number Of Stress- And Defense-Related Genes In Degenerating Feeding Cells, Pramod Kaitheri Kandoth, Nagabhushana Ithal, Justin Recknor, Tom Maier, Dan Nettleton, Thomas J. Baum, Melissa G. Mitchum 2017 University of Missouri

Thomas Baum

To gain new insights into the mechanism of soybean (Glycine max) resistance to the soybean cyst nematode (Heterodera glycines), we compared gene expression profiles of developing syncytia in soybean near-isogenic lines differing at Rhg1 (for resistance to Heterodera glycines), a major quantitative trait locus for resistance, by coupling laser capture microdissection with microarray analysis. Gene expression profiling revealed that 1,447 genes were differentially expressed between the two lines. Of these, 241 (16.8%) were stress- and defense-related genes. Several stress-related genes were up-regulated in the resistant line, including those encoding homologs of enzymes that lead to increased levels of ...


Improving The Accuracy For The Long-Term Hydrologic Impact Assessment (L-Thia) Model, Anqi Zhang, Lawrence Theller, Bernard A. Engel 2017 Purdue University

The Summer Undergraduate Research Fellowship (SURF) Symposium

Urbanization increases runoff by changing land use types from less impervious to impervious covers. Improving the accuracy of a runoff assessment model, the Long-Term Hydrologic Impact Assessment (L-THIA) Model, can help us to better evaluate the potential uses of Low Impact Development (LID) practices aimed at reducing runoff, as well as to identify appropriate runoff and water quality mitigation methods. Several versions of the model have been built over time, and inconsistencies have been introduced between the models. To improve the accuracy and consistency of the model, the equations and parameters (primarily curve numbers in the case of this model ...

