Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Discipline
- Statistics and Probability (62)
- Biostatistics (36)
- Statistical Methodology (22)
- Statistical Theory (22)
- Statistical Models (10)
- Medicine and Health Sciences (7)
- Multivariate Analysis (7)
- Survival Analysis (7)
- Epidemiology (6)
- Genetics and Genomics (6)
- Life Sciences (6)
- Public Health (6)
- Applied Mathematics (5)
- Numerical Analysis and Computation (5)
- Bioinformatics (4)
- Clinical Trials (4)
- Computational Biology (4)
- Genetics (2)
- Longitudinal Data Analysis and Time Series (2)
- Social and Behavioral Sciences (2)
- Categorical Data Analysis (1)
- Design of Experiments and Sample Surveys (1)
- Microarrays (1)
- Keyword
- Genetics (2)
- Mediation analysis; natural direct effects; natural indirect effects; multiple robustness; local efficiency (2)
- ANCOVA; cross validation; efficiency augmentation; Mayo PBC data; semi-parametric efficiency (1)
- Accelerometer; Matching; Time series (1)
- Adaptive design; asymptotic normality; canonical distribution; clinical trial; group-sequential testing; targeted maximum likelihood methodology (1)
- Adaptive designs; Clinical trials; Efficiency; Group sequential tests; Sample size modification; Sufficiency (1)
- Additive hazards model; multiple robustness (1)
- Alzheimer's disease; covariate-specific ROC curve; ignorable missingness; verification bias; weighted estimating equations (1)
- Targeted maximum likelihood estimation; sequential randomized controlled trials; efficient influence curve; numerical methods solutions; secant method; efficient semi-parametric estimation (1)
- Asymptotic linearity of an estimator; causal effect; efficient influence curve; confounding; G-computation formula; influence curve (1)
- Asymptotic linearity of an estimator; causal effect; efficient influence curve; empirical efficiency maximization; confounding; G-computation formula (1)
- Bayesian methods (1)
- Biomarker; restricted survival time; selection impact curve; statistical interaction (1)
- Bivariate survival data; Copula model; Interval sampling; Semi-stationarity. (1)
- Bootstrap; Functional principal components analysis; Iterated expectation and variance; Simultaneous bands (1)
- Case-control study; cross-validation; threshold regression model (1)
- Causal Effect; Counterfactual Outcome; Double Robustness; Stochastic Intervention; Targeted Maximum Likelihood Estimation (1)
- Causal Inference; Causal Effect; Targeted Maximum Likelihood Estimation; Point Treatment; Controlled Direct Effect; tmle; R (1)
- Causal inference; Hypothesis test; Randomized clinical trial; Robustness; Superefficiency (1)
- Censored data; collaborative double robustness; collaborative targeted maximum likelihood estimation; double robust; estimator selection; inverse probability of censoring weighting (1)
- Clinical trial; Cox model; nonparametric estimation; personalized medicine; perturbation-resampling method; stratified medicine; subgroup analysis; survival analysis (1)
- Complex bio signal (1)
- Contingency exchanges; expected utility; integer programming; kidney paired donation; microsimulation models; organ exchange (1)
- Continuous exposure; robustness; targeted minimum loss estimation; variable importance measure (1)
- Cross-training-evaluation; Personalized medicine; Prediction; Stratified medicine; Subgroup analysis; Variable selection. (1)
- Diagnosis; Marker combination; Prognosis; ROC curve; Sensitivity; Specificity (1)
- Distribution-free (1)
- Doubly robust; estimating equation; missing at random; missing covariate; missing response (1)
- Efficiency (1)
- Efficiency; gene expression analysis; microarray; t-test (1)
Articles 1 - 30 of 65
Full-Text Articles in Physical Sciences and Mathematics
Identification And Efficient Estimation Of The Natural Direct Effect Among The Untreated, Samuel D. Lendle, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
The natural direct effect (NDE), or the effect of an exposure on an outcome if an intermediate variable was set to the level it would have been in the absence of the exposure, is often of interest to investigators. In general, the statistical parameter associated with the NDE is difficult to estimate in the non-parametric model, particularly when the intermediate variable is continuous or high dimensional. In this paper we introduce a new causal parameter called the natural direct effect among the untreated, discuss identifiability assumptions, and show that this new parameter is equivalent to the NDE in a randomized …
Flexible Distributed Lag Models Using Random Functions With Application To Estimating Mortality Displacement From Heat-Related Deaths, Roger D. Peng
Johns Hopkins University, Dept. of Biostatistics Working Papers
No abstract provided.
Modeling Criminal Careers As Departures From A Unimodal Population Age-Crime Curve: The Case Of Marijuana Use, Donatello Telesca, Elena Erosheva, Derek Kreager, Ross Matsueda
COBRA Preprint Series
A major aim of longitudinal analyses of life course data is to describe the within- and between-individual variability in a behavioral outcome, such as crime. Statistical analyses of such data typically draw on mixture and mixed-effects growth models. In this work, we present a functional analytic point of view and develop an alternative method that models individual crime trajectories as departures from a population age-crime curve. Drawing on empirical and theoretical claims in criminology, we assume a unimodal population age-crime curve and allow individual expected crime trajectories to differ by their levels of offending and patterns of temporal misalignment. We …
Toxicity Profiling Of Engineered Nanomaterials Via Multivariate Dose Response Surface Modeling, Trina Patel, Donatello Telesca, Saji George, Andre Nel
COBRA Preprint Series
New generation in-vitro high throughput screening (HTS) assays for the assessment of engineered nanomaterials provide an opportunity to learn how these particles interact at the cellular level, particularly in relation to injury pathways. These types of assays are often characterized by small sample sizes, high measurement error and high dimensionality as multiple cytotoxicity outcomes are measured across an array of doses and durations of exposure. In this article we propose a probability model for toxicity profiling of engineered nanomaterials. A hierarchical framework is used to account for the multivariate nature of the data by modeling dependence between outcomes and thereby …
Longitudinal High-Dimensional Data Analysis, Vadim Zipunnikov, Sonja Greven, Brian Caffo, Daniel S. Reich, Ciprian Crainiceanu
Johns Hopkins University, Dept. of Biostatistics Working Papers
We develop a flexible framework for modeling high-dimensional functional and imaging data observed longitudinally. The approach decomposes the observed variability of high-dimensional observations measured at multiple visits into three additive components: a subject-specific functional random intercept that quantifies the cross-sectional variability, a subject-specific functional slope that quantifies the dynamic irreversible deformation over multiple visits, and a subject-visit specific functional deviation that quantifies exchangeable or reversible visit-to-visit changes. The proposed method is very fast, scalable to studies including ultra-high dimensional data, and can easily be adapted to and executed on modest computing infrastructures. The method is applied to the longitudinal analysis …
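The three additive components described above can be written as a schematic working model; the notation below is an illustrative assumption, not the authors' own:

```latex
% Sketch of the three-component longitudinal decomposition (assumed notation):
% Y_{ij}(v) is the high-dimensional observation for subject i at visit j,
% evaluated at location v; T_{ij} is the time of visit j.
\[
  Y_{ij}(v) \;=\; \mu(v)
  \;+\; \underbrace{X_i(v)}_{\substack{\text{functional random intercept} \\ \text{(cross-sectional variability)}}}
  \;+\; \underbrace{U_i(v)\,T_{ij}}_{\substack{\text{functional slope} \\ \text{(irreversible deformation)}}}
  \;+\; \underbrace{W_{ij}(v)}_{\substack{\text{subject-visit deviation} \\ \text{(reversible changes)}}}
\]
```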
Assessing Association For Bivariate Survival Data With Interval Sampling: A Copula Model Approach With Application To Aids Study, Hong Zhu, Mei-Cheng Wang
Johns Hopkins University, Dept. of Biostatistics Working Papers
In disease surveillance systems or registries, bivariate survival data are typically collected under interval sampling. This refers to a situation when entry into a registry is at the time of the first failure event (e.g., HIV infection) within a calendar time interval, the time of the initiating event (e.g., birth) is retrospectively identified for all the cases in the registry, and subsequently the second failure event (e.g., death) is observed during the follow-up. Sampling bias is induced because the data are collected conditional on the first failure event occurring within a time interval. Consequently, the …
Likelihood Based Population Independent Component Analysis, Ani Eloyan, Ciprian M. Crainiceanu, Brian S. Caffo
Johns Hopkins University, Dept. of Biostatistics Working Papers
Independent component analysis (ICA) is a widely used technique for blind source separation, used heavily in several scientific research areas including acoustics, electrophysiology and functional neuroimaging. We propose a scalable two-stage iterative true group ICA methodology for analyzing population level fMRI data where the number of subjects is very large. The method is based on likelihood estimators of the underlying source densities and the mixing matrix. As opposed to many commonly used group ICA algorithms the proposed method does not require significant data reduction by a twofold singular value decomposition. In addition, the method can be applied to a large …
Corrected Confidence Bands For Functional Data Using Principal Components, Jeff Goldsmith, Sonja Greven, Ciprian M. Crainiceanu
Johns Hopkins University, Dept. of Biostatistics Working Papers
Functional principal components (FPC) analysis is widely used to decompose and express functional observations. Curve estimates implicitly condition on basis functions and other quantities derived from FPC decompositions; however, these objects are unknown in practice. In this paper, we propose a method for obtaining correct curve estimates by accounting for uncertainty in FPC decompositions. Additionally, pointwise and simultaneous confidence intervals that account for both model-based and decomposition-based variability are constructed. Standard mixed-model representations of functional expansions are used to construct curve estimates and variances conditional on a specific decomposition. A bootstrap procedure is implemented to understand the uncertainty in …
Proxy Pattern-Mixture Analysis For A Binary Variable Subject To Nonresponse., Rebecca H. Andridge, Roderick J. Little
The University of Michigan Department of Biostatistics Working Paper Series
We consider assessment of the impact of nonresponse for a binary survey variable Y subject to nonresponse, when there is a set of covariates observed for nonrespondents and respondents. To reduce dimensionality, and for simplicity, we reduce the covariates to a continuous proxy variable X that has the highest correlation with Y, estimated from a probit regression analysis of respondent data. We extend our previously proposed proxy pattern-mixture analysis (PPMA) for continuous outcomes to the binary outcome using a latent variable approach. The method does not assume data are missing at random, and creates a framework for sensitivity analyses. Maximum …
Estimation Of A Non-Parametric Variable Importance Measure Of A Continuous Exposure, Antoine Chambaz, Pierre Neuvial, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
We define a new measure of variable importance of an exposure on a continuous outcome, accounting for potential confounders. The exposure features a reference level x0 with positive mass and a continuum of other levels. For the purpose of estimating it, we fully develop the semi-parametric estimation methodology called targeted minimum loss estimation methodology (TMLE) [van der Laan & Rubin, 2006; van der Laan & Rose, 2011]. We cover the whole spectrum of its theoretical study (convergence of the iterative procedure which is at the core of the TMLE methodology; consistency and asymptotic normality of the estimator), practical implementation, simulation …
Building A Nomogram For Survey-Weighted Cox Models Using R, Marinela Capanu, Mithat Gonen
Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series
Nomograms have become a very useful tool among clinicians as they provide individualized predictions based on the characteristics of the patient. For complex design survey data with survival outcome, Binder (1992) proposed methods for fitting survey-weighted Cox models, but to the best of our knowledge there is no available software to build a nomogram based on such models. This paper introduces R software to accomplish this goal and illustrates its use on a gastric cancer dataset. Validation and calibration routines are also included.
Bland-Altman Plots For Evaluating Agreement Between Solid Tumor Measurements, Chaya S. Moskowitz, Mithat Gonen
Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series
Rationale and Objectives. Solid tumor measurements are regularly used in clinical trials of anticancer therapeutic agents and in clinical practice managing patients' care. Consequently, studies evaluating the reproducibility of solid tumor measurements are important, as a lack of reproducibility may directly affect patient management. The authors propose utilizing a modified Bland-Altman plot with a difference metric that lends itself naturally to this situation and facilitates interpretation. Materials and Methods. The modification to the Bland-Altman plot involves replacing the difference plotted on the vertical axis with the relative percent change (RC) between the two measurements. This quantity is the same one used …
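As a rough illustration of the modified plot's coordinates, each pair of measurements yields an average (horizontal axis) and a relative percent change (vertical axis). This is a minimal Python sketch with made-up measurements; the abstract does not specify the RC denominator, so computing it relative to the pairwise mean is an assumption, not the authors' exact definition.

```python
import numpy as np

def bland_altman_rc(m1, m2):
    """Return (average, RC%) pairs for a modified Bland-Altman plot.

    RC is the relative percent change between the two measurements.
    Assumption: RC is taken relative to the pairwise mean, since the
    abstract does not state the denominator.
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    avg = (m1 + m2) / 2.0            # horizontal axis: average measurement
    rc = 100.0 * (m2 - m1) / avg     # vertical axis: relative percent change
    return avg, rc

# Hypothetical repeated tumor measurements (cm) by two readers
reader1 = [2.0, 3.5, 1.2, 4.0]
reader2 = [2.2, 3.3, 1.2, 4.4]
avg, rc = bland_altman_rc(reader1, reader2)
# A pair with identical readings (1.2, 1.2) contributes RC = 0.
```

Plotting `rc` against `avg` (e.g., with matplotlib) then gives the modified Bland-Altman plot described above.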
A Regularization Corrected Score Method For Nonlinear Regression Models With Covariate Error, David M. Zucker, Malka Gorfine, Yi Li, Donna Spiegelman
Harvard University Biostatistics Working Paper Series
No abstract provided.
A Hybrid Bayesian Laplacian Approach For Generalized Linear Mixed Models, Marinela Capanu, Mithat Gonen, Colin B. Begg
Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series
The analytical intractability of generalized linear mixed models (GLMMs) has generated a lot of research in the past two decades. Applied statisticians routinely face the frustrating prospect of widely disparate results produced by the methods that are currently implemented in commercially available software. This article is motivated by this frustration and develops guidance as well as new methods that are computationally efficient and statistically reliable. Two main classes of approximations have been developed: likelihood-based methods and Bayesian methods. Likelihood-based methods such as the penalized quasi-likelihood approach of Breslow and Clayton (1993) have been shown to produce biased estimates especially for …
A Proof Of Bell's Inequality In Quantum Mechanics Using Causal Interactions, James M. Robins, Tyler J. Vanderweele, Richard D. Gill
COBRA Preprint Series
We give a simple proof of Bell's inequality in quantum mechanics which, in conjunction with experiments, demonstrates that the local hidden variables assumption is false. The proof sheds light on relationships between the notion of causal interaction and interference between particles.
Longitudinal Analysis Of Spatiotemporal Processes: A Case Study Of Dynamic Contrast-Enhanced Magnetic Resonance Imaging In Multiple Sclerosis, Russell T. Shinohara, Ciprian M. Crainiceanu, Brian S. Caffo, Daniel S. Reich
Johns Hopkins University, Dept. of Biostatistics Working Papers
Multiple sclerosis (MS) is an immune-mediated disease in which inflammatory lesions form in the brain. In many active MS lesions, the blood-brain barrier (BBB) is disrupted and blood flows into white matter; this disruption may be related to morbidity and disability. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) allows quantitative study of blood flow and permeability dynamics throughout the brain. This technique involves a subject being imaged sequentially during a study visit as an intravenously administered contrast agent flows into the brain. In regions where flow is abnormal, such as white matter lesions, this allows the quantification of the BBB damage. …
Movelets: A Dictionary Of Movement, Jiawei Bai, Jeff Goldsmith, Brian Caffo, Thomas A. Glass, Ciprian M. Crainiceanu
Johns Hopkins University, Dept. of Biostatistics Working Papers
Recent technological advances provide researchers a way of gathering real-time information on an individual’s movement through the use of wearable devices that record acceleration. In this paper, we propose a method for identifying activity types, like walking, standing, and resting, from acceleration data. Our approach decomposes movements into short components called “movelets”, and builds a reference for each activity type. Unknown activities are predicted by matching new movelets to the reference. We apply our method to data collected from a single, three-axis accelerometer and focus on activities of interest in studying physical function in elderly populations. An important technical advantage …
Some Observations On The Wilcoxon Rank Sum Test, Scott S. Emerson
UW Biostatistics Working Paper Series
This manuscript presents some general comments about the Wilcoxon rank sum test. Even the most casual reader will gather that I am not too impressed with the scientific usefulness of the Wilcoxon test. However, the actual motivation is more to illustrate differences between parametric, semiparametric, and nonparametric (distribution-free) inference, and to use this example to illustrate how many misconceptions have been propagated through a focus on (semi)parametric probability models as the basis for evaluating commonly used statistical analysis models. The document itself arose as a teaching tool for courses aimed at graduate students in biostatistics and statistics, with parts of …
The Importance Of Statistical Theory In Outlier Detection, Sarah C. Emerson, Scott S. Emerson
UW Biostatistics Working Paper Series
We explore the performance of the outlier-sum statistic (Tibshirani and Hastie, Biostatistics 2007 8:2-8), a proposed method for identifying genes for which only a subset of a group of samples or patients exhibits differential expression levels. Our discussion focuses on this method as an example of how inattention to standard statistical theory can lead to approaches that exhibit some serious drawbacks. In contrast to the results presented by those authors, when comparing this method to several variations of the t-test, we find that the proposed method offers little benefit even in the most idealized scenarios, and suffers from a number …
Effectively Selecting A Target Population For A Future Comparative Study, Lihui Zhao, Lu Tian, Tianxi Cai, Brian Claggett, L. J. Wei
Harvard University Biostatistics Working Paper Series
When comparing a new treatment with a control in a randomized clinical study, the treatment effect is generally assessed by evaluating a summary measure over a specific study population. The success of the trial heavily depends on the choice of such a population. In this paper, we show a systematic, effective way to identify a promising population, for which the new treatment is expected to have a desired benefit, using the data from a current study involving similar comparator treatments. Specifically, with the existing data we first create a parametric scoring system using multiple covariates to estimate subject-specific treatment differences. …
Targeted Minimum Loss Based Estimation Of An Intervention Specific Mean Outcome, Mark J. Van Der Laan, Susan Gruber
U.C. Berkeley Division of Biostatistics Working Paper Series
Targeted minimum loss based estimation (TMLE) provides a template for the construction of semiparametric locally efficient double robust substitution estimators of the target parameter of the data generating distribution in a semiparametric censored data or causal inference model based on a sample of independent and identically distributed copies from this data generating distribution (van der Laan and Rubin (2006), van der Laan (2008), van der Laan and Rose (2011)). TMLE requires 1) writing the target parameter as a particular mapping from a typically infinite dimensional parameter of the probability distribution of the unit data structure into the parameter space, 2) …
Population Intervention Causal Effects Based On Stochastic Interventions, Ivan Diaz Munoz, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Estimating the causal effect of an intervention on a population typically involves defining parameters in a nonparametric structural equation model (Pearl, 2000, NPSEM) in which the treatment or exposure is deterministically assigned in a static or dynamic way. We define a new causal parameter that takes into account the fact that intervention policies can result in stochastically assigned exposures. The statistical parameter that identifies the causal parameter of interest is established. Inverse probability of treatment weighting (IPTW), augmented IPTW (A-IPTW), and targeted maximum likelihood estimators (TMLE) are developed. A simulation study is performed to demonstrate the properties of these …
Multiple Testing Of Local Maxima For Detection Of Peaks In Chip-Seq Data, Armin Schwartzman, Andrew Jaffe, Yulia Gavrilov, Clifford A. Meyer
Harvard University Biostatistics Working Paper Series
No abstract provided.
Targeted Maximum Likelihood Estimation Of Natural Direct Effect, Wenjing Zheng, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
In many causal inference problems, one is interested in the direct causal effect of an exposure on an outcome of interest that is not mediated by certain intermediate variables. Robins and Greenland (1992) and Pearl (2000) formalized the definition of two types of direct effects (natural and controlled) under the counterfactual framework. Since then, identifiability conditions for these effects have been studied extensively. By contrast, considerably fewer efforts have been invested in the estimation problem of the natural direct effect. In this article, we propose a semiparametric efficient, multiply robust estimator for the natural direct effect of a binary treatment …
On The Covariate-Adjusted Estimation For An Overall Treatment Difference With Data From A Randomized Comparative Clinical Trial, Lu Tian, Tianxi Cai, Lihui Zhao, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Targeted Minimum Loss Based Estimation Based On Directly Solving The Efficient Influence Curve Equation, Paul Chaffee, Mark J. Van Der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Applying targeted maximum likelihood estimation to longitudinal data can be computationally intensive. As the number of time points and/or number of intermediate factors grows, the computation resources consumed by these algorithms likewise increases. Different TMLE algorithms have different computational speeds and implementation challenges; there may also be efficiency differences of the corresponding estimators. The algorithm we describe here proceeds by solving the empirical efficient influence curve equation directly using numerical computation methods, rather than indirectly (by solving a score equation), which is the usual route. We believe that this estimator is the simplest of the TMLE procedures to implement in …
Variable Importance Analysis With The Multipim R Package, Stephan J. Ritter, Nicholas P. Jewell, Alan E. Hubbard
U.C. Berkeley Division of Biostatistics Working Paper Series
We describe the R package multiPIM, including statistical background, functionality and user options. The package is for variable importance analysis, and is meant primarily for analyzing data from exploratory epidemiological studies, though it could certainly be applied in other areas as well. The approach taken to variable importance comes from the causal inference field, and is different from approaches taken in other R packages. By default, multiPIM uses a double robust targeted maximum likelihood estimator (TMLE) of a parameter akin to the attributable risk. Several regression methods/machine learning algorithms are available for estimating the nuisance parameters of the models, including …
Reduced Bayesian Hierarchical Models: Estimating Health Effects Of Simultaneous Exposure To Multiple Pollutants, Jennifer F. Bobb, Francesca Dominici, Roger D. Peng
Johns Hopkins University, Dept. of Biostatistics Working Papers
Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an …
A Unified Approach To Non-Negative Matrix Factorization And Probabilistic Latent Semantic Indexing, Karthik Devarajan, Guoli Wang, Nader Ebrahimi
COBRA Preprint Series
Non-negative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two matrices, W and H, each with nonnegative entries, V ~ WH. NMF has been shown to have a unique parts-based, sparse representation of the data. The nonnegativity constraints in NMF allow only additive combinations of the data which enables it to learn parts that have distinct physical representations in reality. In the last few years, NMF has been successfully applied in a variety of areas such as natural language processing, information retrieval, image processing, speech recognition …
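The multiplicative-updates algorithm mentioned above has a compact, well-known form (Lee and Seung's updates minimizing the squared Frobenius error of V − WH). The sketch below is a generic NumPy illustration of that technique, not code from the paper; the function name and defaults are my own.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V ~ WH via Lee-Seung multiplicative updates.

    The updates keep W and H elementwise nonnegative and monotonically
    decrease ||V - WH||_F^2; eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # nonnegative random initialization
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# Example: a 3x3 nonnegative matrix of exact rank 2
V = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 1.0, 1.0]])
W, H = nmf_multiplicative(V, rank=2)
# W @ H approximates V; both factors have only nonnegative entries.
```

Because the updates are purely multiplicative, entries initialized positive stay positive, which is what yields the additive, parts-based representation described above.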
Multiple Testing Of Local Maxima For Detection Of Unimodal Peaks In 1d, Armin Schwartzman, Yulia Gavrilov, Robert J. Adler
Harvard University Biostatistics Working Paper Series
No abstract provided.