Open Access. Powered by Scholars. Published by Universities.®

Statistical Methodology Commons

2017

Articles 31 - 50 of 50

Full-Text Articles in Statistical Methodology

Inference From Network Data In Hard-To-Reach Populations, Isabelle Beaudry Mar 2017

Doctoral Dissertations

The objective of this thesis is to develop methods to make inference about the prevalence of an outcome of interest in hard-to-reach populations. The proposed methods address issues specific to the survey strategies employed to access those populations. One of the common sampling methodologies used in this context is respondent-driven sampling (RDS). Under RDS, the network connecting members of the target population is used to uncover the hidden members. Specialized techniques are then used to make inference from the data collected in this fashion. Our first objective is to correct traditional RDS prevalence estimators and their associated uncertainty estimators for …
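For orientation, a minimal sketch of the inverse-degree-weighted (Volz–Heckathorn-style) prevalence estimator that traditional RDS analyses build on; the function name, toy data, and weighting choice are illustrative assumptions, not taken from the thesis.

# Sketch of a Volz-Heckathorn-style RDS prevalence estimator:
# each respondent is weighted by the inverse of their reported network degree.
import numpy as np

def rds_vh_prevalence(outcome, degree):
    """Inverse-degree-weighted prevalence estimate (illustrative only)."""
    outcome = np.asarray(outcome, dtype=float)   # 1 = has outcome, 0 = does not
    degree = np.asarray(degree, dtype=float)     # reported personal network size
    weights = 1.0 / degree
    return np.sum(weights * outcome) / np.sum(weights)

# Toy example: 6 respondents, binary outcome, reported degrees.
outcome = [1, 0, 1, 1, 0, 0]
degree = [10, 2, 5, 20, 4, 8]
print(round(rds_vh_prevalence(outcome, degree), 3))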


Session D-5: Informal Comparative Inference: What Is It?, Karen Togliatti Mar 2017

Professional Learning Day

Come and experience a hands-on task that has middle-school students grapple with informal inferential reasoning. Three key principles of informal inference – data as evidence, probabilistic language, and generalizing ‘beyond the data’ – will be discussed as students build and analyze distributions to answer the question, “Does hand dominance play a role in throwing accuracy?” Connections to the CCSSM statistics standards for middle school will be highlighted.


Evaluation Of Progress Towards The Unaids 90-90-90 Hiv Care Cascade: A Description Of Statistical Methods Used In An Interim Analysis Of The Intervention Communities In The Search Study, Laura Balzer, Joshua Schwab, Mark J. Van Der Laan, Maya L. Petersen Feb 2017

Laura B. Balzer

WHO guidelines call for universal antiretroviral treatment, and UNAIDS has set a global target to virally suppress most HIV-positive individuals. Accurate estimates of population-level coverage at each step of the HIV care cascade (testing, treatment, and viral suppression) are needed to assess the effectiveness of "test and treat" strategies implemented to achieve this goal. The data available to inform such estimates, however, are susceptible to informative missingness: the number of HIV-positive individuals in a population is unknown; individuals tested for HIV may not be representative of those whom a testing intervention fails to reach; and HIV-positive individuals with a viral …


Evaluation Of Progress Towards The Unaids 90-90-90 Hiv Care Cascade: A Description Of Statistical Methods Used In An Interim Analysis Of The Intervention Communities In The Search Study, Laura Balzer, Joshua Schwab, Mark J. Van Der Laan, Maya L. Petersen Feb 2017

U.C. Berkeley Division of Biostatistics Working Paper Series

WHO guidelines call for universal antiretroviral treatment, and UNAIDS has set a global target to virally suppress most HIV-positive individuals. Accurate estimates of population-level coverage at each step of the HIV care cascade (testing, treatment, and viral suppression) are needed to assess the effectiveness of "test and treat" strategies implemented to achieve this goal. The data available to inform such estimates, however, are susceptible to informative missingness: the number of HIV-positive individuals in a population is unknown; individuals tested for HIV may not be representative of those whom a testing intervention fails to reach; and HIV-positive individuals with a viral …


It's All About Balance: Propensity Score Matching In The Context Of Complex Survey Data, David Lenis, Trang Q. Nguyen, Nian Dong, Elizabeth A. Stuart Feb 2017

Johns Hopkins University, Dept. of Biostatistics Working Papers

Many research studies aim to draw causal inferences using data from large, nationally representative survey samples, and many of these studies use propensity score matching to make those causal inferences as rigorous as possible given the non-experimental nature of the data. However, very few applied studies carefully incorporate the survey design into the propensity score analysis, which may mean that the results do not support population-level inferences. This may be because few methodological studies examine how best to combine these methods. Furthermore, even fewer of the methodological studies incorporate different non-response mechanisms in their analysis. This study examines methods …
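As background for the methodological question, a generic sketch of 1-to-1 propensity score matching in which the survey weights of the matched units are carried into the outcome comparison; the simulated data, variable names, and this particular weighting choice are illustrative assumptions, not the authors' recommended procedure.

# Generic sketch: nearest-neighbor propensity score matching with the survey
# (design) weights retained for the outcome analysis. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))                          # covariates
treat = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # treatment depends on X
y = 2 * treat + X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)
svy_w = rng.uniform(0.5, 2.0, size=n)                # survey (design) weights

# 1. Estimate propensity scores.
ps = LogisticRegression().fit(X, treat).predict_proba(X)[:, 1]

# 2. Match each treated unit to the nearest control on the propensity score
#    (with replacement, for simplicity).
treated = np.where(treat == 1)[0]
controls = np.where(treat == 0)[0]
matched = np.array([controls[np.argmin(np.abs(ps[controls] - ps[i]))] for i in treated])

# 3. Survey-weighted difference in means among the matched units.
def wmean(v, w):
    return np.sum(w * v) / np.sum(w)

att = wmean(y[treated], svy_w[treated]) - wmean(y[matched], svy_w[matched])
print(round(att, 2))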


Further Advances For The Sequential Multiple Assignment Randomized Trial (Smart), Tianjiao Dai Feb 2017

Dissertations & Theses (Open Access)

ABSTRACT

FURTHER ADVANCES FOR THE SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZED TRIAL (SMART)

Tianjiao Dai, M.S.

Advisory Professor: Sanjay Shete, Ph.D.

Sequential multiple assignment randomized trial (SMART) designs have been developed in recent years for studying adaptive interventions. In my Ph.D. study, I mainly investigate how to further improve SMART designs and optimize the interventions for each individual in the trial. My dissertation focuses on two topics in SMART design.

1) Developing a novel SMART design that can reduce the cost and side effects associated with the interventions and proposing the corresponding analytic methods. I have developed a time-varying SMART design in …


On The Three Dimensional Interaction Between Flexible Fibers And Fluid Flow, Bogdan Nita, Ryan Allaire Jan 2017

Department of Mathematics Faculty Scholarship and Creative Works

In this paper we discuss the deformation of a flexible fiber clamped to a spherical body and immersed in a flow of fluid moving with a speed ranging between 0 and 50 cm/s, by means of three-dimensional numerical simulations developed in COMSOL. The effects of flow speed and the initial configuration angle of the fiber relative to the flow are analyzed. A rigorous analysis of the numerical procedure is performed and our code is benchmarked against well-established cases. The flow velocity and pressure are used to compute drag forces upon the fiber. Of particular interest is the behavior …
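For reference, the drag on an immersed body in a simulation of this kind is typically obtained by integrating the fluid stress over the body's surface; the standard expression below (for an incompressible Newtonian fluid) is included only for orientation and is not quoted from the paper:

\[
\mathbf{F} \;=\; \oint_{S} \Bigl[ -p\,\mathbf{I} + \mu\bigl( \nabla \mathbf{u} + (\nabla \mathbf{u})^{\mathsf{T}} \bigr) \Bigr] \cdot \mathbf{n}\, dS ,
\]

where \(p\) is the pressure, \(\mathbf{u}\) the velocity field, \(\mu\) the dynamic viscosity, and \(\mathbf{n}\) the outward unit normal on the fiber surface \(S\).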


What’s Brewing? A Statistics Education Discovery Project, Marla A. Sole, Sharon L. Weinberg Jan 2017

Publications and Research

We believe that students learn best, are actively engaged, and are genuinely interested when working on real-world problems. This can be done by giving students the opportunity to work collaboratively on projects that investigate authentic, familiar problems. This article shares one such project that was used in an introductory statistics course. We describe the steps taken to investigate why customers are charged more for iced coffee than hot coffee, which included collecting data and using descriptive and inferential statistical analysis. Interspersed throughout the article, we describe strategies that can help teachers implement the project and scaffold material to assist students …


Approximate Bayesian Computation In Forensic Science, Jessie H. Hendricks Jan 2017

The Journal of Undergraduate Research

Forensic evidence is often an important factor in criminal investigations. Analyzing evidence in an objective way involves the use of statistics. However, many evidence types (e.g., glass fragments, fingerprints, shoe impressions) are very complex. This makes the use of statistical methods, such as model selection in Bayesian inference, extremely difficult.

Approximate Bayesian Computation is an algorithmic method in Bayesian analysis that can be used for model selection. It is especially useful because it can be used to assign a Bayes Factor without the need to directly evaluate the exact likelihood function - a difficult task for complex data. Several criticisms …
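A minimal sketch of ABC model choice by rejection sampling, in which the Bayes factor is approximated by the ratio of acceptance counts when the two models have equal prior probability; the toy models, summary statistic, and tolerance are illustrative assumptions, not the forensic models considered in the paper.

# ABC rejection sampling for model choice (toy example).
# Model 1: data ~ Poisson(lambda), lambda ~ Exponential(1)
# Model 2: data ~ Geometric(p),    p ~ Uniform(0.01, 1)
# With equal prior model probabilities, the Bayes factor B12 is approximated
# by the ratio of acceptance counts.
import numpy as np

rng = np.random.default_rng(1)
observed = rng.poisson(3.0, size=50)         # stand-in for the observed data
s_obs = observed.mean()                      # summary statistic
eps = 0.2                                    # tolerance
accept = {1: 0, 2: 0}

for _ in range(100_000):
    m = rng.integers(1, 3)                   # pick a model uniformly (1 or 2)
    if m == 1:
        lam = rng.exponential(1.0)
        sim = rng.poisson(lam, size=50)
    else:
        p = rng.uniform(0.01, 1.0)
        sim = rng.geometric(p, size=50) - 1  # shift to support {0, 1, 2, ...}
    if abs(sim.mean() - s_obs) < eps:        # accept if summaries are close
        accept[m] += 1

print("approximate Bayes factor B12:", accept[1] / max(accept[2], 1))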


Variance Prior Specification For A Basket Trial Design Using Bayesian Hierarchical Modeling, Kristen Cunanan, Alexia Iasonos, Ronglai Shen, Mithat Gonen Jan 2017

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

Background: In the era of targeted therapies, clinical trials in oncology are rapidly evolving, wherein patients with different diseases are now enrolled and treated according to their genomic mutation(s). In such trials, known as basket trials, the different disease cohorts form the different baskets for inference. Several approaches have been proposed in the literature to efficiently use information from all baskets while simultaneously screening to find individual baskets where the drug works. Most proposed methods are developed in a Bayesian paradigm that requires specifying a prior distribution for a variance parameter, which controls the degree to which information is shared …
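To make the role of that variance parameter concrete, one common parameterization of a Bayesian hierarchical basket-trial model is sketched below; the specific priors studied in the paper may differ:

\[
r_k \sim \mathrm{Binomial}(n_k, p_k), \qquad
\operatorname{logit}(p_k) = \theta_k, \qquad
\theta_k \sim \mathrm{N}(\mu, \sigma^2), \quad k = 1, \dots, K,
\]

where \(\sigma^2\) governs borrowing of information across the \(K\) baskets: \(\sigma^2 \to 0\) pools all baskets toward a common response rate, while a large \(\sigma^2\) analyzes the baskets nearly independently.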


Optimized Variable Selection Via Repeated Data Splitting, Marinela Capanu, Colin B. Begg, Mithat Gonen Jan 2017

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

We introduce a new variable selection procedure that repeatedly splits the data into two sets, one for estimation and one for validation, to obtain an empirically optimized threshold which is then used to screen for variables to include in the final model. Simulation results show that the proposed variable selection technique enjoys superior performance compared to candidate methods, being amongst those with the lowest inclusion of noisy predictors while having the highest power to detect the correct model and being unaffected by correlations among the predictors. We illustrate the methods by applying them to a cohort of patients undergoing hepatectomy …
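A generic sketch of the repeated data-splitting idea, in which a screening threshold is tuned by validation error over many random splits and then applied to the full data; the univariate correlation screen, the linear working model, and the threshold grid are illustrative assumptions rather than the authors' exact procedure.

# Generic sketch: tune a screening threshold by repeated data splitting,
# then use it to select variables on the full data (illustrative only).
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 30
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.8 * X[:, 1] + rng.normal(size=n)   # only the first 2 predictors matter

def val_error(threshold, X_est, y_est, X_val, y_val):
    # Keep variables whose absolute correlation with y exceeds the threshold,
    # fit least squares on the estimation half, evaluate on the validation half.
    cors = np.abs([np.corrcoef(X_est[:, j], y_est)[0, 1] for j in range(X_est.shape[1])])
    keep = np.where(cors > threshold)[0]
    if keep.size == 0:
        return np.mean((y_val - y_est.mean()) ** 2)
    beta, *_ = np.linalg.lstsq(X_est[:, keep], y_est, rcond=None)
    return np.mean((y_val - X_val[:, keep] @ beta) ** 2)

thresholds = np.linspace(0.05, 0.5, 10)
errors = np.zeros_like(thresholds)
for _ in range(50):                                 # repeated random splits
    idx = rng.permutation(n)
    est, val = idx[: n // 2], idx[n // 2 :]
    errors += [val_error(t, X[est], y[est], X[val], y[val]) for t in thresholds]

best = thresholds[np.argmin(errors)]
cors_full = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
print("optimized threshold:", round(best, 2), "selected:", np.where(cors_full > best)[0])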


Bayesian Exponential Random Graph Modelling Of Interhospital Patient Referral Networks, Alberto Caimo, Francesca Pallotti, Alessandro Lomi Jan 2017

Articles

Using original data that we have collected on referral relations between 110 hospitals serving a large regional community, we show how recently derived Bayesian exponential random graph models may be adopted to illuminate core empirical issues in research on relational coordination among healthcare organisations. We show how a rigorous Bayesian computation approach supports a fully probabilistic analytical framework that alleviates well-known problems in the estimation of model parameters of exponential random graph models. We also show how the main structural features of interhospital patient referral networks that prior studies have described can be reproduced with accuracy by specifying the system …
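For readers unfamiliar with the model class, an exponential random graph model assigns each network y a probability of the form below; the Bayesian approach described here places a prior on θ and samples from its posterior rather than maximizing the likelihood directly.

\[
P_{\theta}(Y = y) \;=\; \frac{\exp\{\theta^{\mathsf{T}} s(y)\}}{c(\theta)},
\qquad
c(\theta) \;=\; \sum_{y' \in \mathcal{Y}} \exp\{\theta^{\mathsf{T}} s(y')\},
\]

where \(s(y)\) is a vector of network statistics (e.g., edge count, reciprocity, triangles) and the normalizing constant \(c(\theta)\) is intractable for all but very small networks, which is the estimation difficulty the Bayesian computation approach is designed to alleviate.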


Machine Learning On Statistical Manifold, Bo Zhang Jan 2017

HMC Senior Theses

This senior thesis project explores and generalizes some fundamental machine learning algorithms from the Euclidean space to the statistical manifold, an abstract space in which each point is a probability distribution. In this thesis, we adapt the optimal separating hyperplane, the k-means clustering method, and the hierarchical clustering method for classifying and clustering probability distributions. In these modifications, we use the statistical distances as a measure of the dissimilarity between objects. We describe a situation where the clustering of probability distributions is needed and useful. We present many interesting and promising empirical clustering results, which demonstrate the statistical-distance-based clustering algorithms …
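A small sketch of one such adaptation: k-means-style clustering of discrete probability distributions under the Hellinger distance, using the centroid update that minimizes within-cluster squared Hellinger distance; the toy binomial distributions and the choice of Hellinger distance (one of several statistical distances) are illustrative assumptions, not the thesis's implementation.

# Sketch: k-means-style clustering of discrete probability distributions
# under the Hellinger distance (illustrative toy example).
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(3)

def hellinger2(p, q):
    # Squared Hellinger distance between two discrete distributions.
    return 1.0 - np.sum(np.sqrt(p * q))

def centroid(dists):
    # Minimizer of the total squared Hellinger distance to the members:
    # proportional to the squared average of the square-root vectors.
    a = np.sqrt(dists).mean(axis=0) ** 2
    return a / a.sum()

def kmeans_hellinger(dists, k, n_iter=50):
    centers = dists[rng.choice(len(dists), size=k, replace=False)]
    for _ in range(n_iter):
        labels = np.array([np.argmin([hellinger2(d, c) for c in centers]) for d in dists])
        centers = np.array([centroid(dists[labels == j]) if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels

# Toy data: binomial pmfs with two well-separated success probabilities.
support = np.arange(21)
probs = np.concatenate([rng.uniform(0.2, 0.3, 10), rng.uniform(0.6, 0.7, 10)])
dists = np.array([binom.pmf(support, 20, p) for p in probs])
print(kmeans_hellinger(dists, k=2))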


Comparing The Structural Components Variance Estimator And U-Statistics Variance Estimator When Assessing The Difference Between Correlated Aucs With Finite Samples, Anna L. Bosse Jan 2017

Theses and Dissertations

Introduction: The structural components variance estimator proposed by DeLong et al. (1988) is a popular approach used when comparing two correlated AUCs. However, this variance estimator is biased and could be problematic with small sample sizes.

Methods: A U-statistics based variance estimator approach is presented and compared with the structural components variance estimator through a large-scale simulation study under different finite-sample size configurations.

Results: The U-statistics variance estimator was unbiased for the true variance of the difference between correlated AUCs regardless of the sample size and had lower RMSE than the structural components variance estimator, providing better type 1 error …
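For orientation, a compact sketch of the structural components (DeLong) variance estimator for a single AUC; extending it to the variance of the difference between two correlated AUCs, and contrasting it with the U-statistics variance estimator studied in the thesis, follows the same pattern but is not shown here.

# Sketch of the structural components (DeLong) variance estimator for one AUC.
import numpy as np

def auc_delong_variance(cases, controls):
    """AUC and its DeLong variance estimate for scores of cases vs. controls."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    m, n = len(cases), len(controls)
    # Pairwise kernel: 1 if case score > control score, 0.5 if tied, else 0.
    psi = (cases[:, None] > controls[None, :]) + 0.5 * (cases[:, None] == controls[None, :])
    auc = psi.mean()
    v10 = psi.mean(axis=1)          # structural components for the cases
    v01 = psi.mean(axis=0)          # structural components for the controls
    var = v10.var(ddof=1) / m + v01.var(ddof=1) / n
    return auc, var

rng = np.random.default_rng(4)
cases = rng.normal(1.0, 1.0, 40)     # cases tend to score higher
controls = rng.normal(0.0, 1.0, 60)
auc, var = auc_delong_variance(cases, controls)
print(round(auc, 3), round(np.sqrt(var), 3))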


On The Equivalence Between Bayesian And Frequentist Nonparametric Hypothesis Testing, Qiuchen Hai Jan 2017

Dissertations, Master's Theses and Master's Reports

Testing hypotheses about a population parameter is one of the most fundamental tasks in the empirical sciences and is often conducted using parametric tests (e.g., the t-test and F-test), which assume that the samples come from normally distributed populations. When the normality assumption is violated, nonparametric tests are employed as alternatives for making statistical inference. In recent years, the Bayesian versions of parametric tests have been well studied in the literature, whereas, in contrast, Bayesian versions of nonparametric tests are quite scant in the literature (with the exception of Yuan and Johnson (2008)), mainly …


Teaching Size And Power Properties Of Hypothesis Tests Through Simulations, Suleyman Taspinar, Osman Dogan Jan 2017

Publications and Research

In this study, we review the graphical methods suggested in Davidson and MacKinnon (1998, “Graphical Methods for Investigating the Size and Power of Hypothesis Tests,” The Manchester School 66(1): 1–26) that can be used to investigate size and power properties of hypothesis tests for undergraduate and graduate econometrics courses. These methods can be used to assess finite sample properties of various hypothesis tests through simulation studies. In addition, these methods can be effectively used in classrooms to reinforce students’ understanding of basic hypothesis testing concepts such as Type I error, Type II error, …
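A short sketch of the kind of simulation exercise these graphical methods summarize: repeat a test many times under the null and under an alternative, then plot the empirical distribution functions of the p-values (the "p-value plot" style of display); the one-sample t-test and the particular alternative below are illustrative choices, not the examples used in the article.

# Simulate the size and power of a one-sample t-test and plot the empirical
# distribution functions of the p-values (illustrative example).
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
reps, n = 2000, 25

def pvalues(mu):
    return np.array([stats.ttest_1samp(rng.normal(mu, 1.0, n), 0.0).pvalue
                     for _ in range(reps)])

p_null = pvalues(0.0)    # H0 true: the EDF should track the 45-degree line (size)
p_alt = pvalues(0.5)     # H0 false: the EDF above the line reflects power

grid = np.linspace(0, 1, 200)
plt.plot(grid, [np.mean(p_null <= g) for g in grid], label="null (size)")
plt.plot(grid, [np.mean(p_alt <= g) for g in grid], label="alternative (power)")
plt.plot([0, 1], [0, 1], "k--", label="45-degree line")
plt.xlabel("nominal level")
plt.ylabel("rejection frequency")
plt.legend()
plt.show()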


Inference Using Bhattacharyya Distance To Model Interaction Effects When The Number Of Predictors Far Exceeds The Sample Size, Sarah A. Janse Jan 2017

Theses and Dissertations--Statistics

In recent years, statistical analyses, algorithms, and modeling of big data have been constrained due to computational complexity. Further, the added complexity of relationships among response and explanatory variables, such as higher-order interaction effects, make identifying predictors using standard statistical techniques difficult. These difficulties are only exacerbated in the case of small sample sizes in some studies. Recent analyses have targeted the identification of interaction effects in big data, but the development of methods to identify higher-order interaction effects has been limited by computational concerns. One recently studied method is the Feasible Solutions Algorithm (FSA), a fast, flexible method that …
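For reference, the Bhattacharyya distance between two distributions p and q, and its closed form for two multivariate normals, is the kind of distributional-difference measure named in the title; the precise way the dissertation applies it to interaction screening is not reproduced here.

\[
D_B(p, q) \;=\; -\ln \int \sqrt{p(x)\,q(x)}\, dx ,
\]

and, for \(\mathrm{N}(\mu_1, \Sigma_1)\) and \(\mathrm{N}(\mu_2, \Sigma_2)\) with \(\Sigma = (\Sigma_1 + \Sigma_2)/2\),

\[
D_B \;=\; \tfrac{1}{8}\, (\mu_1 - \mu_2)^{\mathsf{T}} \Sigma^{-1} (\mu_1 - \mu_2)
\;+\; \tfrac{1}{2} \ln \frac{\det \Sigma}{\sqrt{\det \Sigma_1 \det \Sigma_2}} .
\]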


Informational Index And Its Applications In High Dimensional Data, Qingcong Yuan Jan 2017

Theses and Dissertations--Statistics

We introduce a new class of measures for testing independence between two random vectors, which uses the expected difference of conditional and marginal characteristic functions. By choosing a particular weight function in the class, we propose a new index for measuring independence and study its properties. Two empirical versions are developed; their properties, asymptotics, connections with existing measures, and applications are discussed. Implementation and Monte Carlo results are also presented.

We propose a two-stage sufficient variable selection method based on the new index to deal with large-p-small-n data. The method does not require model specification and especially focuses …
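Schematically, and only as a reading aid (the precise weight function and definition are given in the dissertation), a measure of this type contrasts the conditional and marginal characteristic functions of Y in an expected weighted-L2 sense:

\[
I(Y \mid X) \;=\; \mathrm{E}_X \int \bigl| \varphi_{Y \mid X}(t) - \varphi_{Y}(t) \bigr|^2 \, w(t)\, dt ,
\]

which, for a suitable positive weight \(w\), is zero if and only if Y is independent of X, so empirical versions can be used both to test independence and to rank predictors for variable selection.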


Nonparametric Compound Estimation, Derivative Estimation, And Change Point Detection, Sisheng Liu Jan 2017

Theses and Dissertations--Statistics

Firstly, we reviewed some popular nonparametric regression methods from the past several decades. Then we extended compound estimation (Charnigo and Srinivasan [2011]) to accommodate random design points and heteroskedasticity and proposed a modified Cp criterion for tuning parameter selection. Moreover, we developed a DCp criterion for the tuning parameter selection problem in general nonparametric derivative estimation, which extends the GCp criterion of Charnigo, Hall and Srinivasan [2011] to random design points and heteroskedasticity. Next, we proposed a change point detection method via compound estimation for both the fixed design and random design cases; adaptation to heteroskedasticity was also considered for the method. …
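As background, the classical Mallows' Cp criterion on which such modified Cp and DCp criteria build is, for a candidate fit with p effective parameters,

\[
C_p \;=\; \frac{\mathrm{RSS}_p}{\hat{\sigma}^2} \;-\; n \;+\; 2p ,
\]

where \(\mathrm{RSS}_p\) is the residual sum of squares of the candidate fit, \(\hat{\sigma}^2\) is an estimate of the error variance, and n is the sample size; the criteria developed in the dissertation adapt this idea to random design points, heteroskedastic errors, and derivative estimation.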


Penalized Nonparametric Scalar-On-Function Regression Via Principal Coordinates, Philip T. Reiss, David L. Miller, Pei-Shien Wu, Wen-Yu Hua Dec 2016

Philip T. Reiss

A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. The core idea is to regress the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, the proposed …
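A condensed sketch of the core idea as described: compute a distance matrix among the functional predictors, extract leading principal coordinates by classical multidimensional scaling, and ridge-regress the scalar response on them. The L2 distance between curves, the number of coordinates, and the fixed ridge penalty below are illustrative simplifications; the authors' implementation performs automatic tuning within generalized additive modeling software.

# Sketch: scalar-on-function regression via principal coordinates (illustrative).
import numpy as np

rng = np.random.default_rng(6)
n, grid = 100, np.linspace(0, 1, 50)
curves = np.array([np.sin(2 * np.pi * (grid + rng.uniform())) + 0.1 * rng.normal(size=grid.size)
                   for _ in range(n)])                       # functional predictors
y = curves[:, :10].mean(axis=1) + 0.1 * rng.normal(size=n)   # scalar responses

# 1. Pairwise L2 distances between the predictor curves.
D = np.sqrt(((curves[:, None, :] - curves[None, :, :]) ** 2).mean(axis=2))

# 2. Classical MDS (principal coordinates): double-center -D^2/2 and take the
#    leading eigenvectors scaled by the square roots of their eigenvalues.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
eigval, eigvec = np.linalg.eigh(B)
order = np.argsort(eigval)[::-1][:5]                          # keep 5 coordinates
Z = eigvec[:, order] * np.sqrt(np.maximum(eigval[order], 0))

# 3. Ridge regression of the response on the principal coordinates.
lam = 1.0
Zc, yc = Z - Z.mean(axis=0), y - y.mean()
beta = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z.shape[1]), Zc.T @ yc)
fitted = y.mean() + Zc @ beta
print("in-sample R^2:", round(1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2), 3))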