Open Access. Powered by Scholars. Published by Universities.®



Statistics and Probability


2012


Articles 1 - 30 of 209

Full-Text Articles in Physical Sciences and Mathematics

Multiple Subject Barycentric Discriminant Analysis (Musubada): How To Assign Scans To Categories Without Using Spatial Normalization, Hervé Abdi, Lynne J. Williams, Andrew C. Connolly, M. Ida Gobbini Dec 2012


Dartmouth Scholarship

We present a new discriminant analysis (DA) method called Multiple Subject Barycentric Discriminant Analysis (MUSUBADA), suited for analyzing fMRI data because it handles datasets with multiple participants, each of whom provides a different number of variables (i.e., voxels) that are themselves grouped into regions of interest (ROIs). Like DA, MUSUBADA (1) assigns observations to predefined categories, (2) gives factorial maps displaying observations and categories, and (3) optimally assigns observations to categories. MUSUBADA handles cases with more variables than observations and can project portions of the data table (e.g., subtables, which can represent participants or ROIs) onto the factorial maps. Therefore MUSUBADA can …


Optimal Spatial Prediction Using Ensemble Machine Learning, Molly M. Davies, Mark J. Van Der Laan Dec 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

Spatial prediction is an important problem in many scientific disciplines. Super Learner is an ensemble prediction approach related to stacked generalization that uses cross-validation to search for the optimal predictor amongst all convex combinations of a heterogeneous candidate set. It has been applied to non-spatial data, where theoretical results demonstrate it will perform asymptotically at least as well as the best candidate under consideration. We review these optimality properties and discuss the assumptions required in order for them to hold for spatial prediction problems. We present results of a simulation study confirming Super Learner works well in practice under a …
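The convex-combination idea behind Super Learner can be illustrated with a minimal sketch. This is a hypothetical two-candidate, one-dimensional grid search over held-out predictions, not the authors' implementation (the real method uses V-fold cross-validation over an arbitrary candidate library):

```python
import numpy as np

def super_learner_weights(preds, y, n_grid=101):
    """Grid-search the convex combination of two candidate predictors that
    minimizes squared error on held-out predictions (illustrative sketch)."""
    best_w, best_mse = 0.0, np.inf
    for w in np.linspace(0.0, 1.0, n_grid):
        blend = w * preds[0] + (1.0 - w) * preds[1]
        mse = np.mean((y - blend) ** 2)
        if mse < best_mse:
            best_w, best_mse = w, mse
    return best_w, best_mse

# toy example: held-out predictions from two hypothetical candidates
rng = np.random.default_rng(0)
y = rng.normal(size=200)
p1 = y + rng.normal(scale=1.0, size=200)   # noisier candidate
p2 = y + rng.normal(scale=0.5, size=200)   # better candidate
w, mse = super_learner_weights(np.stack([p1, p2]), y)
```

By construction the blend can never do worse than the best single candidate on the held-out data, since pure candidates sit at the endpoints of the weight grid.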


A Modular Mind? A Test Using Individual Data From Seven Primate Species, Federica Amici, Bradley Barney, Valen E. Johnson, Josep Call, Filippo Aureli Dec 2012


Faculty and Research Publications

It has long been debated whether the mind consists of specialized and independently evolving modules, or whether and to what extent a general factor accounts for the variance in performance across different cognitive domains. In this study, we used a hierarchical Bayesian model to re-analyse individual level data collected on seven primate species (chimpanzees, bonobos, orangutans, gorillas, spider monkeys, brown capuchin monkeys and long-tailed macaques) across 17 tasks within four domains (inhibition, memory, transposition and support). Our modelling approach evidenced the existence of both a domain-specific factor and a species factor, each accounting for the same amount (17%) of the …


Relating Nanoparticle Properties To Biological Outcomes In Exposure Escalation Experiments, Trina Patel, Cecile Low-Kam, Zhaoxia Ji, Haiyuan Zhang, Tian Xia, Andre E. Nel, Jeffrey I. Zink, Donatello Telesca Dec 2012


COBRA Preprint Series

A fundamental goal in nano-toxicology is identifying the particle physical and chemical properties that are likely to explain biological hazard. The first line of screening for potentially adverse outcomes often consists of exposure escalation experiments, involving the exposure of micro-organisms or cell lines to a battery of nanomaterials. We discuss a modeling strategy that relates the outcome of an exposure escalation experiment to nanoparticle properties. Our approach makes use of a hierarchical decision process, where we jointly identify particles that initiate adverse biological outcomes and explain the probability of this event in terms of the particles' physico-chemical descriptors. The …


A Regionalized National Universal Kriging Model Using Partial Least Squares Regression For Estimating Annual Pm2.5 Concentrations In Epidemiology, Paul D. Sampson, Mark Richards, Adam A. Szpiro, Silas Bergen, Lianne Sheppard, Timothy V. Larson, Joel Kaufman Dec 2012


UW Biostatistics Working Paper Series

Many cohort studies in environmental epidemiology require accurate modeling and prediction of fine scale spatial variation in ambient air quality across the U.S. This modeling requires the use of small spatial scale geographic or “land use” regression covariates and some degree of spatial smoothing. Furthermore, the details of the prediction of air quality by land use regression and the spatial variation in ambient air quality not explained by this regression should be allowed to vary across the continent due to the large scale heterogeneity in topography, climate, and sources of air pollution. This paper introduces a regionalized national universal kriging …


Sensitivity Analysis For Causal Inference Under Unmeasured Confounding And Measurement Error Problems, Iván Díaz, Mark J. Van Der Laan Dec 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper we present a sensitivity analysis for drawing inferences about parameters that are not estimable from observed data without additional assumptions. We present the methodology using two different examples: a causal parameter that is not identifiable due to violations of the randomization assumption, and a parameter that is not estimable in the nonparametric model due to measurement error. Existing methods for tackling these problems assume a parametric model for the type of violation to the identifiability assumption, and require the development of new estimators and inference for every new model. The method we present can be used in …


Computationally Efficient Confidence Intervals For Cross-Validated Area Under The Roc Curve Estimates, Erin Ledell, Maya L. Petersen, Mark J. Van Der Laan Dec 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

In binary classification problems, the area under the ROC curve (AUC) is an effective means of measuring the performance of a model. Cross-validation is most often used as well, in order to assess how the results will generalize to an independent data set. To evaluate the quality of an estimate of cross-validated AUC, we must obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, calculating the cross-validated AUC on even a relatively small data set can still require a …
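The quantity being cross-validated can be sketched via the Mann-Whitney rank identity for AUC. This is illustrative only, and is not the authors' proposed variance estimator or confidence-interval construction:

```python
import numpy as np

def auc(labels, scores):
    """AUC via the Mann-Whitney identity: the fraction of
    (positive, negative) pairs ranked correctly, ties counting 1/2."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def cv_auc(labels, scores, fold_id):
    """Cross-validated AUC as the average of validation-fold AUCs;
    `scores` are assumed to be out-of-fold predictions."""
    return float(np.mean([auc(labels[fold_id == v], scores[fold_id == v])
                          for v in np.unique(fold_id)]))
```

The pairwise comparison makes the O(n_pos * n_neg) cost explicit, which hints at why repeated evaluation on massive data sets becomes expensive.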


A National Model Built With Partial Least Squares And Universal Kriging And Bootstrap-Based Measurement Error Correction Techniques: An Application To The Multi-Ethnic Study Of Atherosclerosis, Silas Bergen, Lianne Sheppard, Paul D. Sampson, Sun-Young Kim, Mark Richards, Sverre Vedal, Joel Kaufman, Adam A. Szpiro Dec 2012


UW Biostatistics Working Paper Series

Studies estimating health effects of long-term air pollution exposure often use a two-stage approach, building exposure models to assign individual-level exposures which are then used in regression analyses. This requires accurate exposure modeling and careful treatment of exposure measurement error. To illustrate the importance of carefully accounting for exposure model characteristics in two-stage air pollution studies, we consider a case study based on data from the Multi-Ethnic Study of Atherosclerosis (MESA). We present national spatial exposure models that use partial least squares and universal kriging to estimate annual average concentrations of four PM2.5 components: elemental carbon (EC), organic carbon (OC), …


Testing The Predictive Performance Of Distribution Models, Volker Bahn, Brian Mcgill Dec 2012


Publications

Distribution models are used to predict the likelihood of occurrence or abundance of a species at locations where census data are not available. An integral part of modelling is the testing of model performance. We compared different schemes and measures for testing model performance using 79 species from the North American Breeding Bird Survey. The four testing schemes we compared featured increasing independence between test and training data: resubstitution, random data hold-out and two spatially segregated data hold-out designs. The different testing measures also addressed different levels of information content in the dependent variable: regression R2 for absolute abundance, squared …
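The contrast between the hold-out schemes can be sketched with a toy split. The longitude cutoff here is a hypothetical, simplified stand-in for the paper's spatially segregated designs:

```python
import numpy as np

def random_holdout(n, test_frac, rng):
    """Random hold-out: test locations are interleaved with training ones."""
    idx = rng.permutation(n)
    k = int(round(n * test_frac))
    return idx[k:], idx[:k]          # train indices, test indices

def spatial_holdout(lon, cutoff):
    """Spatially segregated hold-out: all test sites lie on one side of a
    coordinate cutoff, increasing independence from the training data."""
    test = np.flatnonzero(lon >= cutoff)
    train = np.flatnonzero(lon < cutoff)
    return train, test
```

In the spatial scheme every test site is geographically separated from every training site, which is what makes the test a harder, more independent check of predictive performance.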


Limited Sampling Estimates Of Epigallocatechin Gallate Exposures In Cirrhotic And Noncirrhotic Patients With Hepatitis C After Single Oral Doses Of Green Tea Extract., Dina Halegoua-De Marzio, Walter K. Kraft, Constantine Daskalakis, Xie Ying, Roy L Hawke, Victor J. Navarro Dec 2012


Division of Gastroenterology and Hepatology Faculty Papers

BACKGROUND: Epigallocatechin-3-gallate (EGCG) has antiangiogenic, antioxidant, and antifibrotic properties that may have therapeutic potential for the treatment of cirrhosis induced by hepatitis C virus (HCV). However, cirrhosis might affect EGCG disposition and augment its reported dose-dependent hepatotoxic potential.

OBJECTIVE: The safety, tolerability, and disposition of a single oral dose of EGCG in cirrhotic patients with HCV were examined in an exploratory fashion.

METHODS: Eleven patients with hepatitis C and detectable viremia were enrolled. Four had Child-Pugh (CP) class A cirrhosis, 4 had Child-Pugh class B cirrhosis, and 3 were noncirrhotic. After a single oral dose of green tea extract 400 …


Stress-Lifetime Joint Distribution Model For Performance Degradation Failure, Quan Sun, Yanzhen Tang, Jing Feng, Paul Kvam Dec 2012


Department of Math & Statistics Faculty Publications

The high energy density self-healing metallized film pulse capacitor is used in the power conditioning systems of many kinds of laser facilities, operating at several stress levels such as 23 kV, 30 kV, and 35 kV; the reliability and maintenance costs of these systems are affected by the reliability of the capacitors. Given cost and time restrictions, assessing the reliability of highly reliable capacitors at a given stress level as quickly as possible is a challenge. Accelerated degradation testing provides a way to predict lifetime and reliability effectively. A model called the stress-lifetime joint distribution model and an analysis method based on …


Nonparametric Inference For Meta Analysis With Fixed Unknown, Study-Specific Parameters, Brian Claggett, Minge Xie, Lu Tian Nov 2012


Harvard University Biostatistics Working Paper Series

No abstract provided.


Statistical Inference When Using Data Adaptive Estimators Of Nuisance Parameters, Mark J. Van Der Laan Nov 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

In order to be concrete, we focus on estimation of the treatment-specific mean, controlling for all measured baseline covariates, based on observing n independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment given the covariates, the so-called propensity score. Estimators of the treatment-specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically …
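The simplest estimator in this family can be illustrated with a minimal inverse-probability-of-treatment-weighted (IPTW) sketch of the treatment-specific mean. This is a toy version with a known propensity score, not the paper's data-adaptive estimator:

```python
import numpy as np

def iptw_mean(y, a, g1):
    """IPTW estimator of E[Y(1)]: reweight treated outcomes by the
    inverse of the propensity score g1(W) = P(A = 1 | W)."""
    y, a, g1 = map(np.asarray, (y, a, g1))
    return float(np.mean(a * y / g1))

# toy randomized design: propensity known to be 0.5 for everyone
est = iptw_mean(y=[2.0, 5.0, 4.0, 7.0], a=[1, 0, 1, 0], g1=[0.5] * 4)
```

With the propensity fixed at 0.5, the estimator simply doubles the treated outcomes and averages over everyone, recovering the treated-arm mean.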


Treatment Selections Using Risk-Benefit Profiles Based On Data From Comparative Randomized Clinical Trials With Multiple Endpoints, Brian Claggett, Lu Tian, Davide Castagno, L. J. Wei Nov 2012


Harvard University Biostatistics Working Paper Series

No abstract provided.


Likelihood Ratio Tests For The Mean Structure Of Correlated Functional Processes, Ana-Maria Staicu, Yingxing Li, Ciprian Crainiceanu, David M. Ruppert Nov 2012


Johns Hopkins University, Dept. of Biostatistics Working Papers

The paper introduces a general framework for testing hypotheses about the structure of the mean function of complex functional processes. Important particular cases of the proposed framework are: 1) testing the null hypotheses that the mean of a functional process is parametric against a nonparametric alternative; and 2) testing the null hypothesis that the means of two possibly correlated functional processes are equal or differ by only a simple parametric function. A global pseudo likelihood ratio test is proposed and its asymptotic distribution is derived. The size and power properties of the test are confirmed in realistic simulation scenarios. Finite …


Longitudinal Functional Models With Structured Penalties, Madan G. Kundu, Jaroslaw Harezlak, Timothy W. Randolph Nov 2012


Johns Hopkins University, Dept. of Biostatistics Working Papers

Collection of functional data, including longitudinal observations, is becoming increasingly common in many studies. For example, we use magnetic resonance (MR) spectra collected over a period of time from late-stage HIV patients. MR spectroscopy (MRS) produces a spectrum that is a mixture of metabolite spectra, instrument noise, and a baseline profile. Analysis of such data typically proceeds in two separate steps: feature extraction and regression modeling. In contrast, a recently proposed approach, called partially empirical eigenvectors for regression (PEER) (Randolph, Harezlak and Feng, 2012), for functional linear models incorporates a priori knowledge via a scientifically-informed penalty operator in the regression function …


Group Testing Regression Models, Boan Zhang Nov 2012


Department of Statistics: Dissertations, Theses, and Student Work

Group testing, where groups of individual specimens are composited to test for the presence or absence of a disease (or some other binary characteristic), is a procedure commonly used to reduce the costs of screening a large number of individuals. Statistical research in group testing has traditionally focused on a homogeneous population, where individuals are assumed to have the same probability of having a disease. However, individuals often have different risks of positivity, so recent research has examined regression models that allow for heterogeneity among individuals within the population. This dissertation focuses on two problems involving group testing regression models. …
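The cost-saving logic of pooling can be sketched in a few lines, allowing heterogeneous individual risks. A perfect assay is assumed here, which is a simplification the dissertation's models would relax:

```python
from math import prod

def pool_positive_prob(p):
    """Probability a pooled test is positive when individual i is positive
    independently with probability p[i]; assumes a perfect assay."""
    return 1.0 - prod(1.0 - pi for pi in p)

def dorfman_tests_per_specimen(p):
    """Expected tests per specimen under two-stage Dorfman pooling: one
    pooled test shared by k specimens, plus k retests if the pool is positive."""
    k = len(p)
    return 1.0 / k + pool_positive_prob(p)
```

For example, pooling five specimens that each have a 2% chance of positivity costs roughly 0.296 tests per specimen in expectation, versus 1 under individual testing; heterogeneity in the p[i] is exactly what regression models for group testing aim to capture.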


G-Cimp Status Prediction Of Glioblastoma Samples Using Mrna Expression Data, Mehmet Baysan, Serdar Bozdag, Margaret C. Cam, Svetlana Kotliarova, Susie Ahn, Jennifer Walling, Jonathan K. Killian, Holly Stevenson, Paul Meltzer, Howard A. Fine Nov 2012


Mathematics, Statistics and Computer Science Faculty Research and Publications

Glioblastoma Multiforme (GBM) is a tumor with high mortality and no known cure. The dramatic molecular and clinical heterogeneity seen in this tumor has led to attempts to define genetically similar subgroups of GBM with the hope of developing tumor specific therapies targeted to the unique biology within each of these subgroups. Recently, a subset of relatively favorable prognosis GBMs has been identified. These glioma CpG island methylator phenotype, or G-CIMP tumors, have distinct genomic copy number aberrations, DNA methylation patterns, and (mRNA) expression profiles compared to other GBMs. While the standard method for identifying G-CIMP tumors is based on …


Quest For Continuous Improvement: Gathering Feedback And Data Through Multiple Methods To Evaluate And Improve A Library’S Discovery Tool, Jeanne M. Brown Oct 2012


Library Faculty Presentations

Summon at UNLV

  • Implemented fall 2011: a web-scale discovery tool
  • Expectations for Summon
  • Continuous Summon Improvement (CSI) Group

The environment

  • User changes
  • Library changes
  • Vendor changes
  • Product changes
  • Complex information environment
  • Change + complexity = need to assess using multiple streams of feedback


Pls-Rog: Partial Least Squares With Rank Order Of Groups, Hiroyuki Yamamoto Oct 2012


COBRA Preprint Series

Partial least squares (PLS), a supervised dimensionality reduction method, has been widely used in metabolomics. PLS can separate scores by group in a low-dimensional subspace. However, it cannot use information about the rank order of groups. Such information is often available, for example when the concentration of a drug administered to animals is gradually varied. In this study, we propose partial least squares with rank order of groups (PLS-ROG), which can consider both separation and rank order of groups.


Statistical Hypothesis Test Of Factor Loading In Principal Component Analysis And Its Application To Metabolite Set Enrichment Analysis, Hiroyuki Yamamoto, Tamaki Fujimori, Hajime Sato, Gen Ishikawa, Kenjiro Kami, Yoshiaki Ohashi Oct 2012


COBRA Preprint Series

Principal component analysis (PCA) has been widely used to visualize high-dimensional metabolomic data in a two- or three-dimensional subspace. In metabolomics, some metabolites (e.g., the top 10 metabolites) have been subjectively selected on the basis of factor loading in PCA, and biological inferences are made for these metabolites. However, this approach may lead to biased biological inferences because these metabolites are not objectively selected by a statistical criterion. We propose a statistical procedure that selects metabolites by a statistical hypothesis test of factor loading in PCA and makes biological inferences by metabolite set enrichment analysis (MSEA) for the significant metabolites. This procedure depends …
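One common way to make a factor loading testable is to define it as the correlation between each variable and a principal component score, then form a t statistic on n - 2 degrees of freedom. The sketch below illustrates that general idea and is not necessarily the authors' exact procedure:

```python
import numpy as np

def pc1_loadings(X):
    """Loadings defined as correlations between each column of X and the
    first principal component score."""
    Xc = X - X.mean(axis=0)
    # rows of vt are principal component directions
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    score = Xc @ vt[0]
    return np.array([np.corrcoef(Xc[:, j], score)[0, 1]
                     for j in range(X.shape[1])])

def loading_t(r, n):
    """t statistic for H0: loading (correlation) = 0, df = n - 2."""
    r = np.asarray(r)
    return r * np.sqrt((n - 2) / (1.0 - r ** 2))
```

Variables whose |t| exceeds the t critical value on n - 2 degrees of freedom would then be passed on to the enrichment step, replacing a subjective "top 10" cutoff with a statistical one.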


Spectral Cross Correlation As A Supervised Approach For The Analysis Of Complex Raman Datasets: The Case Of Nanoparticles In Biological Cells, Mark Keating, Franck Bonnier, Hugh Byrne Oct 2012


Articles

Spectral cross-correlation is introduced as a methodology to identify the presence and subcellular distribution of nanoparticles in cells. Raman microscopy is employed to spectroscopically image biological cells previously exposed to polystyrene nanoparticles, as a model for the study of nano-bio interactions. The limitations of previously deployed strategies of K-means clustering analysis and principal component analysis are discussed, and a novel methodology of spectral cross-correlation analysis is introduced and compared with the performance of Classical Least Squares Analysis, in both unsupervised and supervised modes. The previous study demonstrated the feasibility of using Raman spectroscopy to map cells and identify polystyrene …


Decline In Health For Older Adults: 5-Year Change In 13 Key Measures Of Standardized Health, Paula H. Diehr, Stephen M. Thielke, Anne B. Newman, Calvin H. Hirsch, Russell Tracy Oct 2012


UW Biostatistics Working Paper Series

Introduction

The health of older adults declines over time, but there are many ways of measuring health. We examined whether all measures declined at the same rate, or whether some aspects of health were less sensitive to aging than others.

Methods

We compared the decline in 13 measures of physical, mental, and functional health from the Cardiovascular Health Study: hospitalization, bed days, cognition, extremity strength, feelings about life as a whole, satisfaction with the purpose of life, self-rated health, depression, digit symbol substitution test, grip strength, ADLs, IADLs, and gait speed. Each measure was standardized against self-rated health. We compared …


Methods For Evaluating Prediction Performance Of Biomarkers And Tests, Margaret Pepe, Holly Janes Oct 2012


UW Biostatistics Working Paper Series

This chapter describes and critiques methods for evaluating the performance of markers to predict risk of a current or future clinical outcome. We consider three criteria that are important for evaluating a risk model: calibration, benefit for decision making and accurate classification. We also describe and discuss a variety of summary measures in common use for quantifying predictive information such as the area under the ROC curve and R-squared. The roles and problems with recently proposed risk reclassification approaches are discussed in detail.


The Impact Of Covariance Misspecification In Multivariate Gaussian Mixtures On Estimation And Inference: An Application To Longitudinal Modeling, Brianna C. Heggeseth, Nicholas P. Jewell Oct 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence---that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little …


Borrowing Information Across Populations In Estimating Positive And Negative Predictive Values, Ying Huang, Youyi Fong, John Wei, Ziding Feng Oct 2012


UW Biostatistics Working Paper Series

A marker's capacity to predict risk of a disease depends on disease prevalence in the target population and its classification accuracy, i.e. its ability to discriminate diseased subjects from non-diseased subjects. The latter is often considered an intrinsic property of the marker; it is independent of disease prevalence and hence more likely to be similar across populations than risk prediction measures. In this paper, we are interested in evaluating the population-specific performance of a risk prediction marker in terms of positive predictive value (PPV) and negative predictive value (NPV) at given thresholds, when samples are available from the target population …


Causal Inference For Networks, Mark J. Van Der Laan Oct 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose that we observe a population of causally connected units according to a network. On each unit we observe a set of potentially connected units that contains the true connections, and a longitudinal data structure, which includes time-dependent exposure or treatment, time-dependent covariates, and a final outcome of interest. The target quantity of interest is defined as the mean outcome for this group of units if the exposures of the units would be probabilistically assigned according to a known specified mechanism, where the latter is called a stochastic intervention. Causal effects of interest are defined as contrasts of the mean of …


Targeted Learning Of The Probability Of Success Of An In Vitro Fertilization Program Controlling For Time-Dependent Confounders, Antoine Chambaz, Sherri Rose, Jean Bouyer, Mark J. Van Der Laan Oct 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

Infertility is a global public health issue and various treatments are available. In vitro fertilization (IVF) is an increasingly common treatment method, but accurately assessing the success of IVF programs has proven challenging since they consist of multiple cycles. We present a double robust semiparametric method that incorporates machine learning to estimate the probability of success (i.e., delivery resulting from embryo transfer) of a program of at most four IVF cycles in the French Devenir Après Interruption de la FIV (DAIFI) study and several simulation studies, controlling for time-dependent confounders. We find that the probability of success in the DAIFI …


Assessing The Causal Effect Of Policies: An Approach Based On Stochastic Interventions, Iván Díaz, Mark J. Van Der Laan Oct 2012


U.C. Berkeley Division of Biostatistics Working Paper Series

Stochastic interventions are a powerful tool for defining parameters that measure the causal effect of a realistic intervention that intends to alter the population distribution of an exposure. In this paper we follow the approach described in Díaz and van der Laan (2011) to define and estimate the effect of an intervention that is expected to cause a truncation in the population distribution of the exposure. The observed data parameter that identifies the causal parameter of interest is established, as well as its efficient influence function under the nonparametric model. Inverse probability of treatment weighted (IPTW), augmented IPTW and …


Reconstructability Of Epistatic Functions, Martin Zwick, Joe Fusion, Beth Wilmot Oct 2012


Systems Science Faculty Publications and Presentations

Background: Reconstructability Analysis (RA) has been used to detect epistasis in genomic data; in that work, even the simplest RA models (variable-based models without loops) gave performance superior to two other methods. A follow-on theoretical study showed that RA also offers higher-resolution models, namely variable-based models with loops and state-based models, likely to be even more effective in modeling epistasis, and also described several mathematical approaches to classifying types of epistasis.

Methods: The present paper extends this second study by discussing a non-standard use of RA: the analysis of epistasis in quantitative as opposed to nominal variables; such quantitative variables …