Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

University of Louisville

Articles 31 - 43 of 43

Full-Text Articles in Biostatistics

Propensity Score Based Methods For Estimating The Treatment Effects Based On Observational Studies., Younathan Abdia Aug 2016

Electronic Theses and Dissertations

This dissertation consists of two interconnected research projects. The first project was a study of propensity score based statistical methods for estimating the average treatment effect (ATE) and the average treatment effect among the treated (ATT) when there are two treatment groups. The ATE is defined as the mean of the individual causal effects in the whole population, while the ATT is defined as the treatment effect for the treated population. Propensity score based statistical methods, such as matching, regression, stratification, inverse probability weighting (IPW), and doubly robust (DR) methods, were used to estimate the ATE and ATT. Simulation studies and case …


Inference For A Zero-Inflated Conway-Maxwell-Poisson Regression For Clustered Count Data., Hyoyoung Choo-Wosoba May 2016

Electronic Theses and Dissertations

This dissertation is directed toward developing a statistical methodology with applications of the Conway-Maxwell-Poisson (CMP) distribution (Conway, R. W., and Maxwell, W. L., 1962) to count data. The count data for this dissertation exhibit three different characteristics: clustering, zero inflation, and dispersion. Clustering suggests that observations within clusters are correlated, and the zero inflation phenomenon occurs when the data exhibit excessive zero counts. Dispersion implies that the mean is greater or smaller than the variance, unlike in a Poisson distribution. The dissertation starts with an introduction to inference for zero-inflated clustered count data in the first chapter. Then, it presents novel methodologies …


A Log Rank Test For Clustered Data Under Informative Within-Cluster Group Size., Mary Elizabeth Gregg May 2016

Electronic Theses and Dissertations

The log rank test is a popular nonparametric test for comparing the marginal survival distributions of two groups. When data are organized within clusters and the size of clusters or the distribution of group membership within a cluster is related to an outcome of interest, traditional methods of data analysis can be biased. In this thesis, we develop a within-cluster group weighted log rank test to compare marginal survival time distributions between groups from clustered data, correcting for cluster size and intra-cluster group size informativeness. The performance of this new test is compared with the unweighted and cluster-weighted log rank …


Some Contributions To Nonparametric And Semiparametric Inference For Clustered And Multistate Data., Sandipan Dutta May 2016

Electronic Theses and Dissertations

This dissertation is composed of research projects that involve methods which can be broadly classified as either nonparametric or semiparametric. Chapter 1 provides an introduction to the problems addressed in these projects, a brief review of the related work that has been done so far, and an outline of the methods developed in this dissertation. Chapter 2 describes in detail the first project, which aims at developing a rank-sum test for clustered data where an outcome from a group in a cluster is associated with the number of observations belonging to that group in that cluster. Chapter 3 proposes the use of …


Semi-Parametric Methods For Personalized Treatment Selection And Multi-State Models., Chathura K. Siriwardhana May 2016

Electronic Theses and Dissertations

This dissertation contains three research projects on personalized medicine and a project on multi-state modelling. The idea behind personalized medicine is selecting the treatment that maximizes the clinical outcomes of interest for an individual based on his or her genetic and genomic information. We propose a method for treatment assignment based on individual covariate information for a patient. Our method covers more than two treatments, can be applied with a broad set of models, and has very desirable large sample properties. An empirical study using simulations and a real data analysis show the applicability of the proposed procedure. …


Propensity Score Methods: A Simulation And Case Study Involving Breast Cancer Patients., John Craycroft May 2016

Electronic Theses and Dissertations

Observational data presents unique challenges for analysis that are not encountered with experimental data resulting from carefully designed randomized controlled trials. Selection bias and unbalanced treatment assignments can obscure estimations of treatment effects, making the process of causal inference from observational data highly problematic. In 1983, Paul Rosenbaum and Donald Rubin formalized an approach for analyzing observational data that adjusts treatment effect estimates for the set of non-treatment variables that are measured at baseline. The propensity score is the conditional probability of assignment to a treatment group given the covariates. Using this score, one may balance the covariates across treatment …


Novel Applications Of And Extensions To Linear Regression Methods For The Biomedical And Materials Sciences., Joe Bible May 2015

Electronic Theses and Dissertations

In this work we present three topics, each of which is centered on either the application or modification of various linear regression methods. Our work with respect to the “Materials Genome” project, while undermined by oversimplification and data integrity issues in its early stages, provides a sound platform from which the project can proceed successfully. Building upon a growing body of knowledge around the use of Weighted Generalized Estimating Equations (WGEE), our second investigation proposes an extension to that framework intended to address the inherent bias present in the analysis of clustered longitudinal data with potentially informative cluster sizes and temporal …


optCluster: An R Package For Determining The Optimal Clustering Algorithm And Optimal Number Of Clusters., Michael N. Sekula May 2015

Electronic Theses and Dissertations

Determining the best clustering algorithm and ideal number of clusters for a particular dataset is a fundamental difficulty in unsupervised clustering analysis. In biological research, data generated from Next Generation Sequencing technology and microarray gene expression data are becoming more and more common, so new tools and resources are needed to group such high dimensional data using clustering analysis. Different clustering algorithms can group data very differently. Therefore, there is a need to determine the best groupings in a given dataset using the most suitable clustering algorithm for that data. This paper presents the R package optCluster as an efficient …


Summary Of Survival Analysis With SAS Procedures., Derek Duane Childers 1990- May 2015

Electronic Theses and Dissertations

The research conducted for this thesis was performed to summarize some of the most commonly used survival analysis techniques as well as to create one macro that will provide the solutions for these techniques. Some of the techniques that this thesis focuses on are survival and hazard functions, mean and median survival times, life table, log rank test, proportional hazards/model building, and competing risk. To further analyze these survival analysis techniques, I will use the Bone Marrow Transplantation for Leukemia dataset. This trial consists of patients with either acute myelocytic leukemia (AML, 99 patients) or acute lymphoblastic leukemia (ALL, 38 patients). There …


Penalized Regressions For Variable Selection Model, Single Index Model And An Analysis Of Mass Spectrometry Data., Yubing Wan Aug 2014

Electronic Theses and Dissertations

The focus of this dissertation is to develop statistical methods, under the framework of penalized regressions, to handle three different problems. The first research topic addresses the missing data problem for variable selection models, including the elastic net (ENet) method and sparse partial least squares (SPLS). I proposed a multiple imputation (MI) based weighted ENet (MI-WENet) method based on the stacked MI data and a weighting scheme for each observation. Numerical simulations were implemented to examine the performance of the MI-WENet method and compare it with competing alternatives. I then applied the MI-WENet method to examine the predictors for the …


Statistical Methods For Assessing Treatment Effects For Observational Studies., Kristopher C. Gardner 1984- May 2014

Electronic Theses and Dissertations

Though randomized clinical trials (RCTs) are the gold standard for comparing treatments, they are often infeasible, exclude clinically important subjects, or generally represent an idealized medical setting rather than real practice. Observational data provide an opportunity to study practice-based evidence, but also present challenges for analysis. Traditional statistical methods that are suitable for RCTs may be inadequate for observational studies. In this project, four of the most popular statistical methods for observational studies, namely ANCOVA, propensity score matching, regression with the propensity score as a covariate, and instrumental variables (IV), are investigated through application to MarketScan insurance claims data. …


Patient Rule Induction Method For Subgroup Identification Given Censored Data., Patrick James Trainor May 2014

Electronic Theses and Dissertations

The identification of subgroups in clinical studies is an important aspect of personalized medicine. In order to develop tailored therapeutics, the factors that characterize subgroups with differential prognosis, response to treatment, and incidence of adverse events or toxicities must be elucidated. We present a generalization of a statistical learning algorithm, Patient Rule Induction Method (PRIM), that is well suited for this task given a right-censored time-to-event outcome measure. This algorithm works to recursively partition a covariate space into mutually exclusive boxes that can be utilized to define subgroups. Conceptually the algorithm is similar to classification and regression trees but rather …


Compound Identification Using Penalized Linear Regression., Ruiqi Liu May 2013

Electronic Theses and Dissertations

In this study, we propose a new method for compound identification using penalized linear regression. Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. In the context of linear regression, the response variable is an experimental mass spectrum (i.e., the query) and all the compounds in the reference library are the independent variables. However, the number of compounds in the reference library is much larger than the range of m/z values, so the data become high dimensional and suffer from singularity. For …