Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2010

Biostatistics

Articles 1 - 30 of 142

Full-Text Articles in Physical Sciences and Mathematics

Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel Dec 2010

COBRA Preprint Series

In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value as the measure of evidence using a statistical test such as the binomial test, Fisher's exact test or the hypergeometric test. However, the p-value is not interpretable as a measure of evidence apart from adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over the other, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …
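The hypergeometric test mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' minimum description length method; it only shows the conventional p-value the abstract contrasts against. The function name and all argument names are hypothetical, chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, selected, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    population: total number of features (e.g. genes) measured
    annotated:  features in the population tied to the biological process
    selected:   number of discovered (e.g. differentially expressed) features
    overlap:    discovered features that are also annotated
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probabilities of seeing `overlap` or more annotated features
    # when `selected` features are drawn without replacement.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small p-value suggests overrepresentation, although, as the abstract notes, the p-value by itself is not a sample-size-independent measure of evidence.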


Predicting Treatment Efficacy Via Quantitative MRI: A Bayesian Joint Model, Jincao Wu, Tim Johnson Dec 2010

The University of Michigan Department of Biostatistics Working Paper Series

The prognosis for patients with high-grade gliomas is poor, with a median survival of one year. Treatment efficacy assessment is typically unavailable until 5-6 months post diagnosis. Investigators hypothesize that quantitative MRI (qMRI) can assess treatment efficacy three weeks after therapy starts, thereby allowing salvage treatments to begin earlier. The purpose of this work is to build a predictive model of treatment efficacy using qMRI data and to assess its performance. The outcome is one-year survival status. We propose a joint, two-stage Bayesian model. In stage I, we smooth the image data with a multivariate spatio-temporal pairwise difference prior. We …


Coronary Heart Disease Mortality And Long-Term Exposure To Ambient Particulate Air Pollutants In Elderly Nonsmoking California Residents, Lie Hong Chen Dec 2010

Loma Linda University Electronic Theses, Dissertations & Projects

The purpose of this study is to assess the effect of long-term concentrations of ambient particulate matter (PM) on the risk of mortality from all causes, cardiopulmonary disease, coronary heart disease (CHD), total cancer, and any mention of nonmalignant respiratory disease (NMRD).

The health effects of long-term ambient air pollution have been studied with up to 30 years of follow-up in the AHSMOG cohort, a cohort of 6,338 nonsmoking white California adults. Monthly concentrations of ambient air pollutants [particulate matter (PM10), ozone (O3), sulfur dioxide (SO2), nitrogen dioxide (NO2) or particulate matter

In the AHSMOG cohort, each increment of 10 µg/m3 in PM10 in two-pollutant models …


Spatial Epidemiology Of Birth Defects In The United States And The State Of Utah Using Geographic Information Systems And Spatial Statistics, Samson Y. Gebreab Dec 2010

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Oral clefts are the most common form of birth defect in the United States (US), and the State of Utah has among the highest prevalence of oral clefts in the nation. The overall objective of this dissertation was to examine the spatial distribution of oral clefts and their linkage with a broad range of demographic, behavioral, social, economic, and environmental risk factors through the application of Geographic Information Systems (GIS) and spatial statistics. Using innovative linked micromap plots, we investigated the geographic patterns of oral cleft occurrence from 1998 to 2002 and their relationships with maternal smoking rates and proportion …


The Determinants Of Colorectal Cancer Survival Disparities In Nevada, Lucas N. Wassira Dec 2010

UNLV Theses, Dissertations, Professional Papers, and Capstones

Different population groups across Nevada and throughout the United States suffer disproportionately from colorectal cancer and its after-effects. Overcoming cancer health disparities is important for lessening the burden of cancer. There has been an overall decline in the incidence of and mortality from colorectal cancer (CRC). This is likely due, in part, to the increasing use of screening procedures such as Fecal Occult Blood Test (FOBT) and/or endoscopy, which can reduce the risk of CRC mortality by fifty percent. Nevertheless, screening procedures are routinely used by only fifty percent of Americans aged fifty years and older. Despite overall mortality decreasing …


Asymptotic Theory For Cross-Validated Targeted Maximum Likelihood Estimation, Wenjing Zheng, Mark J. Van Der Laan Nov 2010

U.C. Berkeley Division of Biostatistics Working Paper Series

We consider a targeted maximum likelihood estimator of a path-wise differentiable parameter of the data generating distribution in a semi-parametric model based on observing n independent and identically distributed observations. The targeted maximum likelihood estimator (TMLE) uses V-fold sample splitting for the initial estimator in order to make the TMLE maximally robust in its bias reduction step. We prove a general theorem that states asymptotic efficiency (and thereby regularity) of the targeted maximum likelihood estimator when the initial estimator is consistent and a second order term converges to zero in probability at a rate faster than the square root of …


Cost-Efficient Variable Selection Using Branching LARS, Li Hua Yue Nov 2010

Electronic Thesis and Dissertation Repository

Variable selection is a difficult problem in statistical model building. Identification of cost-efficient diagnostic factors is very important to health researchers, but most variable selection methods do not take into account the cost of collecting data for the predictors. Our focus is the trade-off between statistical significance and the cost of collecting data for the statistical model. A Branching LARS (BLARS) procedure has been developed that can select and estimate the important predictors to build a model that is not only good at prediction but also cost-efficient. The BLARS method is an extension of the LARS variable selection method to incorporate …


Inferential Methods For High-Throughput Methylation Data, Maria Capparuccini Nov 2010

Theses and Dissertations

The role of abnormal DNA methylation in the progression of disease is a growing area of research that relies upon the establishment of sound statistical methods. The common method for declaring there is differential methylation between two groups at a given CpG site, as summarized by the difference between proportions methylated db=b1-b2, has been through use of a Filtered Two Sample t-test, using the recommended filter of 0.17 (Bibikova et al., 2006b). In this dissertation, we performed a re-analysis of the data used in recommending the threshold by fitting a mixed-effects ANOVA model. It was determined that the 0.17 filter …


Observational Study And Individualized Antiretroviral Therapy Initiation Rules For Reducing Cancer Incidence In HIV-Infected Patients, Romain Neugebauer, Michael J. Silverberg, Mark J. Van Der Laan Nov 2010

U.C. Berkeley Division of Biostatistics Working Paper Series

Targeted Maximum Likelihood Learning (TMLL) has been proposed as a general estimation methodology that can, in particular, be applied to draw causal inferences based on marginal structural modeling with observational data using either a point treatment approach (all confounders are assumed not to be affected by the exposure(s) of interest) or a longitudinal data approach (some confounders may be affected by one of the exposures of interest). While formal development of TMLL has included road maps for applications in longitudinal data approaches, real-life implementations have been restricted to studies based on a point treatment approach. In this article, we illustrate …


Survival Analysis Of Microarray Data With Microarray Measurement Subject To Measurement Error, Juan Xiong Nov 2010

Electronic Thesis and Dissertation Repository

Microarray technology is essentially a tool for measuring gene expression, and this measurement is subject to error. Gene expression could be employed as a predictor of patient survival, yet the measurement error involved is often ignored in the analysis of microarray data in the literature. Efforts are needed to establish statistical methods for analyzing microarray data without ignoring the error in gene expression. A typical microarray data set has a large number of genes far exceeding the sample size. Proper selection of survival-relevant genes contributes to an accurate prediction model. We study the …


Power And Sample Size For Three-Level Cluster Designs, Tina Cunningham Nov 2010

Theses and Dissertations

Over the past few decades, Cluster Randomized Trials (CRT) have become a design of choice in many research areas. One of the most critical issues in planning a CRT is to ensure that the study design is sensitive enough to capture the intervention effect. The assessment of power and sample size in such studies is often faced with many challenges due to several methodological difficulties. While studies on power and sample size for cluster designs with one and two levels are abundant, the evaluation of required sample size for three-level designs has been generally overlooked. First, the nesting effect introduces …


A Novel Totivirus And Piscine Reovirus (PRV) In Atlantic Salmon (Salmo salar) With Cardiomyopathy Syndrome (CMS), Torstein Tengs Nov 2010

Dr. Torstein Tengs

BACKGROUND: Cardiomyopathy syndrome (CMS) is a severe disease affecting large farmed Atlantic salmon. Mortality often appears without prior clinical signs, typically shortly prior to slaughter. We recently reported the finding and the complete genomic sequence of a novel piscine reovirus (PRV), which is associated with another cardiac disease in Atlantic salmon: heart and skeletal muscle inflammation (HSMI). In the present work we have studied whether PRV or other infectious agents may be involved in the etiology of CMS. RESULTS: Using high throughput sequencing on heart samples from natural outbreaks of CMS and from fish experimentally challenged with material from fish diagnosed with CMS …


Stereotype Logit Models For High Dimensional Data, Andre Williams Oct 2010

Theses and Dissertations

Gene expression studies are of growing importance in the field of medicine. In fact, subtypes within the same disease have been shown to have differing gene expression profiles (Golub et al., 1999). Often, researchers are interested in differentiating a disease by a categorical classification indicative of disease progression. For example, it may be of interest to identify genes that are associated with progression and to accurately predict the state of progression using gene expression data. One challenge when modeling microarray gene expression data is that there are more genes (variables) than there are observations. In addition, the genes usually demonstrate …


Mengukur Kualitas Hidup Anak [Measuring Children's Quality Of Life], Toha Muhaimin Oct 2010

Kesmas

The term quality of life is often linked to development, particularly human development, and is often associated with a person's condition, whether healthy or ill, to indicate physical activity or that person's condition in daily life. Some people associate the term with the extent to which a person's basic needs for living, such as clothing, food, shelter, and education, are met. Consequently, many studies measure quality of life with differing instruments, including for children's quality of life, and many instruments have been developed. This article discusses the meaning of quality of life and how to measure it, particularly in children. There is not yet a consensus on measuring or describing the conceptual definition of quality …


Curriculum Vitae, Tatiyana V. Apanasovich Oct 2010

Tatiyana V Apanasovich

No abstract provided.


Modeling Functional Data With Spatially Heterogeneous Shape Characteristics, Ana-Maria Staicu, Ciprian M. Crainiceanu, Daniel S. Reich, David Ruppert Oct 2010

Johns Hopkins University, Dept. of Biostatistics Working Papers

We propose a novel class of models for functional data exhibiting skewness or other shape characteristics that vary with spatial or temporal location. We use copulas so that the marginal distributions and the dependence structure can be modeled independently. Dependence is modeled with a Gaussian or t-copula, so that there is an underlying latent Gaussian process. We model the marginal distributions using the skew t family. The mean, variance, and shape parameters are modeled nonparametrically as functions of location. A computationally tractable inferential framework for estimating heterogeneous asymmetric or heavy-tailed marginal distributions is introduced. This framework provides a new set …


A Maximum Pseudo-Likelihood Approach For Estimating Species Trees Under The Coalescent Model, Liang Liu, Lili Yu, Scott V. Edwards Oct 2010

Biostatistics Faculty Publications

Background

Several phylogenetic approaches have been developed to estimate species trees from collections of gene trees. However, maximum likelihood approaches for estimating species trees under the coalescent model are limited. Although the likelihood of a species tree under the multispecies coalescent model has already been derived by Rannala and Yang, it can be shown that the maximum likelihood estimate (MLE) of the species tree (topology, branch lengths, and population sizes) from gene trees under this formula does not exist. In this paper, we develop a pseudo-likelihood function of the species tree to obtain maximum pseudo-likelihood estimates (MPE) of species trees, …


Population Value Decomposition, A Framework For The Analysis Of Image Populations, Ciprian M. Crainiceanu, Brian S. Caffo, Sheng Luo, Vadim Zipunnikov Oct 2010

Johns Hopkins University, Dept. of Biostatistics Working Papers

Images, often stored in multidimensional arrays, are fast becoming ubiquitous in medical and public health research. Analyzing populations of images is a statistical problem that raises a host of daunting challenges. The most severe challenge is that data sets incorporating images recorded for hundreds or thousands of subjects at multiple visits are massive. We introduce the population value decomposition (PVD), a general method for simultaneous dimensionality reduction of large populations of massive images. We show how PVD can seamlessly be incorporated into statistical modeling and lead to a new, transparent and fast inferential framework. Our methodology was motivated by and …


On Nonparametric Comparison Of Images And Regression Surfaces, Xiao-Feng Wang, Deping Ye Oct 2010

Xiaofeng Wang

Multivariate local regression is an important tool for image processing and analysis. In many practical biomedical problems, one is often interested in comparing a group of images or regression surfaces. In this paper, we extend the existing method of testing the equality of nonparametric curves by Dette and Neumeyer (2001) and consider a test statistic based on an L2-distance in the multi-dimensional case under a completely heteroscedastic nonparametric model. The test statistic is also extended for use in the case of spatially correlated errors. Two bootstrap procedures are described in order to approximate the critical values of the …


The Pathways To Mental Health Care Of First-Episode Psychosis Patients: A Systematic Review., Kelly K. Anderson, Rebecca Fuhrer, Ashok K. Malla Oct 2010

Epidemiology and Biostatistics Publications

BACKGROUND: Although there is agreement on the association between delay in treatment of psychosis and outcome, less is known regarding the pathways to care of patients suffering from a first psychotic episode. Pathways are complex, involve a diverse range of contacts, and are likely to influence delay in treatment. We conducted a systematic review on the nature and determinants of the pathway to care of patients experiencing a first psychotic episode.

METHOD: We searched four databases (Medline, HealthStar, EMBASE, PsycINFO) to identify articles published between 1985 and 2009. We manually searched reference lists and relevant journals and used forward citation …


Targeted Bayesian Learning, Ivan Diaz Munoz, Alan E. Hubbard, Mark J. Van Der Laan Oct 2010

U.C. Berkeley Division of Biostatistics Working Paper Series

Targeted maximum likelihood estimation (van der Laan & Rubin 2006) is a loss-based semi-parametric estimation method that yields a substitution estimator of a target parameter of the probability distribution of the data that solves the efficient influence curve estimating equation, and thereby yields a double robust locally efficient estimator of the parameter of interest, under regularity conditions. The Bayesian paradigm is concerned with including the researcher’s prior uncertainty about the parameter through a prior distribution, which combined with the likelihood yields a posterior distribution for the parameter that reflects the researcher’s posterior uncertainty. In this paper, we present a way …


Fast, Flexible Function-On-Scalar Regression, With An Application To Brain Development, Philip T. Reiss, Lei Huang Sep 2010

Philip T. Reiss

No abstract provided.


Landmark Prediction Of Survival, Layla Parast, Tianxi Cai Sep 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Diagnosing And Responding To Violations In The Positivity Assumption, Maya L. Petersen, Kristin Porter, Susan Gruber, Yue Wang, Mark J. Van Der Laan Sep 2010

U.C. Berkeley Division of Biostatistics Working Paper Series

The assumption of positivity or experimental treatment assignment requires that observed treatment levels vary within confounder strata. This article discusses the positivity assumption in the context of assessing model and parameter-specific identifiability of causal effects. Positivity violations occur when certain subgroups in a sample rarely or never receive some treatments of interest. The resulting sparsity in the data may increase bias with or without an increase in variance and can threaten valid inference. The parametric bootstrap is presented as a tool to assess the severity of such threats and its utility as a diagnostic is explored using simulated data. Several …
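The requirement that observed treatment levels vary within confounder strata can be checked empirically in its simplest form. The sketch below is an assumption-laden illustration, not the parametric bootstrap diagnostic the article presents: it only flags strata in which some treatment level is never observed, and the function name and record layout are hypothetical.

```python
from collections import defaultdict


def positivity_violations(records, treatment_levels):
    """Return strata in which at least one treatment level never occurs.

    records:          iterable of (stratum, treatment) pairs
    treatment_levels: all treatment levels of interest
    Result maps each violating stratum to its sorted missing levels.
    """
    seen = defaultdict(set)
    for stratum, treatment in records:
        seen[stratum].add(treatment)
    wanted = set(treatment_levels)
    # Keep only strata that are missing one or more treatment levels.
    return {
        stratum: sorted(wanted - observed)
        for stratum, observed in seen.items()
        if wanted - observed
    }


# Hypothetical data: the "old" stratum never receives treatment 1.
example = [("young", 0), ("young", 1), ("old", 0), ("old", 0)]
print(positivity_violations(example, [0, 1]))
```

This check catches only structural zeros. Subgroups that rarely (rather than never) receive a treatment would need a threshold on the observed proportions, which is left out here.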


Stratifying Subjects For Treatment Selection With Censored Event Time Data From A Comparative Study, Lihui Zhao, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, L. J. Wei Sep 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Sample Size And Statistical Power Considerations In High-Dimensionality Data Settings: A Comparative Study Of Classification Algorithms, Yu Guo, Armin Garber, Raji Balasubramanian Sep 2010

Raji Balasubramanian

Background: Data generated using ‘omics’ technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of subjects in the study. In this paper, we consider issues relevant in the design of biomedical studies in which the goal is the discovery of a subset of features and an associated algorithm that can predict a binary outcome, such as disease status. We compare the performance of four commonly used classifiers (K-Nearest Neighbors, Prediction Analysis for Microarrays, Random Forests and Support Vector Machines) in high-dimensionality data settings. We evaluate the effects of varying levels of …


Health Benefits Of Increased Walking For Sedentary, Generally Healthy Older Adults: Using Longitudinal Data To Approximate An Intervention Trial, Paula Diehr Sep 2010

Paula Diehr

BACKGROUND: Older adults are often advised to walk more, but randomized trials have not conclusively established the benefits of walking in this age group. Typical analyses based on observational data may have biased results. Here, we propose a "limited-bias," more interpretable estimate of the health benefits to sedentary healthy older adults of walking more, using longitudinal data from the Cardiovascular Health Study. METHODS: The number of city blocks walked per week, collected annually, was classified as sedentary (<7 blocks per week), somewhat active, or active (≥28). Analysis was restricted to persons sedentary and healthy in the first 2 years. In Year …
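The activity classification described in the METHODS sentence can be written as a small function. This is a sketch under one assumption: the abstract states cutoffs only for the sedentary and active groups, so the "somewhat active" range of 7 to fewer than 28 blocks per week is inferred here. The function name is hypothetical.

```python
def walking_category(blocks_per_week):
    """Classify weekly walking by city blocks, using the abstract's cutoffs."""
    if blocks_per_week < 0:
        raise ValueError("blocks_per_week must be non-negative")
    if blocks_per_week < 7:
        return "sedentary"
    if blocks_per_week >= 28:
        return "active"
    # Assumed middle band: at least 7 but fewer than 28 blocks per week.
    return "somewhat active"
```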


Spousal Concordance In Academic Achievements And Intelligence And Family-Based Association Studies Identified Novel Loci Associated With Intelligence., Yue Pan Aug 2010

Electronic Theses and Dissertations

Assortative mating, the tendency for mate selection to occur on the basis of similar traits, plays an essential role in understanding the genetic variation in academic achievement and intelligence (IQ). It is an important mechanism explaining spousal concordance. We used principal component analysis (PCA) to assess spousal correlation. There is a significant positive correlation between spouses on the new variable PC1 (correlation coefficient=0.515, p<0.0001). We further investigated the genetic factors that affect IQ using the same data. We performed a low-density genome-wide association (GWA) analysis with a family-based association test to identify genetic variants that are associated with intelligence as measured by WAIS full-score IQ (FSIQ). NTM at 11q25 (rs411280, p=0.000764) and NR3C2 at 4q31.23 (rs3846329, p=0.000675) were two novel genes that have not been associated with IQ in other studies. This study may serve as a resource for replication in other populations and a foundation for future investigations.


Distribution Of Health Care Expenditures For HIV-Infected Patients, Ray Y. Chen, Neil A. Accortt, Andrew O. Westfall, Michael J. Mugavero, James L. Raper, Gretchen A. Cloud, Beth K. Stone, Jerome Carter, Stephanie Call, Maria Pisu, Jeroan J. Allison, Michael S. Saag Aug 2010

Jeroan J. Allison

BACKGROUND: Health care expenditures for persons infected with human immunodeficiency virus (HIV) in the United States, determined on the basis of actual health care use, have not been reported in the era of highly active antiretroviral therapy.

METHODS: Patients receiving primary care at the University of Alabama at Birmingham HIV clinic were included in the study. All encounters (except emergency room visits) that occurred within the University of Alabama at Birmingham Hospital System from 1 March 2000 to 1 March 2001 were analyzed. Medication expenditures were determined on the basis of 2001 average wholesale price. Hospitalization expenditures were determined on …


Trends In AIDS-Defining And Non-AIDS-Defining Malignancies Among HIV-Infected Patients: 1989-2002, Roger Bedimo, Ray Y. Chen, Neil A. Accortt, James L. Raper, Carol Linn, Jeroan J. Allison, John Dubay, Michael S. Saag, Craig J. Hoesley Aug 2010

Jeroan J. Allison

In a comparison of rates of acquired immunodeficiency syndrome (AIDS)-defining malignancies (ADMs) for 1989-1996 versus 1997-2002, we found a decrease in ADMs (rate ratio, 0.31; P<.0001) and a significant increase in non-AIDS-defining malignancies (non-ADMs; rate ratio, 10.87; P<.0002). The mean CD4 cell count was lower among patients with ADMs than among those with non-ADMs. A longer duration of survival during highly active antiretroviral therapy might explain the increasing incidence of non-ADMs.
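The rate ratios reported here compare an incidence rate in the later period with that in the earlier period. A minimal sketch of that arithmetic follows. The abstract gives no event counts or person-time, so the function name, its arguments, and the numbers in the usage line are all hypothetical.

```python
def rate_ratio(events_later, person_time_later, events_earlier, person_time_earlier):
    """Incidence rate in the later period divided by the rate in the earlier period."""
    if person_time_later <= 0 or person_time_earlier <= 0:
        raise ValueError("person-time must be positive")
    if events_earlier == 0:
        raise ValueError("rate ratio is undefined when the earlier rate is zero")
    rate_later = events_later / person_time_later
    rate_earlier = events_earlier / person_time_earlier
    return rate_later / rate_earlier


# Invented counts for illustration only: 5 events in 1,000 person-years
# versus 10 events in 1,000 person-years.
print(rate_ratio(5, 1000, 10, 1000))
```

A ratio below 1 indicates a decrease between periods and a ratio above 1 an increase, which matches the direction of the two findings in the abstract.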