Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

Articles 1 - 29 of 29

Full-Text Articles in Biostatistics

Fuzzy Kc Clustering Imputation For Missing Not At Random Data, Markku A. Malmi Jr. Mar 2023

USF Tampa Graduate Theses and Dissertations

Research has a variety of difficulties, especially when involving human subjects, and one of the most prevalent is the issue of missing data. Missing data will always be present in research because there is no perfect method for collecting data and protecting against human error or mechanical failure. This requires researchers to be able to mitigate the problems that come along with missing data: reduction in the power of an analysis and bias introduced by the missing pattern. This research investigated a non-parametric method using a nested approach of fuzzy K-Modes and fuzzy C-Means clustering to impute missing …


Nonparametric Estimation Of Transition Probabilities In Illness-Death Model Based On Ranked Set Sampling, Ying Ma Jun 2022

USF Tampa Graduate Theses and Dissertations

The ranked set sampling (RSS) design is applied widely in agriculture, environmental science, and medical research where exact measurement of sampling units is costly, but sampling units can be ranked by a correlated concomitant variable. RSS is usually a cost-efficient alternative to simple random sampling (SRS) for selecting more representative samples. This study presents a novel methodology to investigate the nonparametric estimation of transition probabilities in the illness-death model using the RSS design. We study the Aalen–Johansen estimator of transition probabilities in the illness-death Markov model based on the RSS design under random right censoring time and propose nonparametric estimators of the …


Using Fine-Scale Aquatic Habitat Data To Construct Dreissenid SDMs In The Laurentian Great Lakes, Grace C. Henderson Mar 2022

USF Tampa Graduate Theses and Dissertations

The invasion of the Laurentian Great Lakes by aquatic invasive species (AIS) has been the subject of investigation for decades, due to their dramatic alterations to the ecosystem and high economic costs. Two AIS with the largest impacts are dreissenid zebra and quagga mussels, and though these species have been studied extensively, questions remain about what factors control their distributions, and whether lake warming will alter these distributions. Species distribution models (SDMs) offer a powerful tool to examine the relationship between species presences and environmental variables, which are typically bioclimatic data. The creation of the Aquatic Habitat (AqHab) dataset containing …


Measurements Of Generalizability And Adjustment For Bias In Clinical Trials, Yuanyuan Lu Mar 2022

USF Tampa Graduate Theses and Dissertations

While randomized controlled trials (RCTs) are widely used as a gold standard in clinical research and public health, they are criticized because of a potential lack of generalizability, as the trial patients may be unrepresentative of the target patient population. Little research addresses how to assess and evaluate the generalizability of RCTs. Patients are rarely selected on a random basis from a well-defined patient population of interest into a clinical trial. Generalizing findings from the RCT samples to the patient population has begun to receive increasing attention. We simulate a patient population with treatment effect size of …


Differential Privacy For Regression Modeling In Health: An Evaluation Of Algorithms, Joseph Ficek Nov 2021

USF Tampa Graduate Theses and Dissertations

Background: There is a need for rigorous and standardized methods of privacy protection for shared data in the health sciences. Differential privacy is one such method that has gained much popularity due to its versatility and robustness. This study evaluates differential privacy for explanatory regression modeling in the context of health research.

Methods: Surveyed and newly proposed algorithms were evaluated with respect to the accuracy (bias and RMSE) of coefficient estimates, the empirical coverage probability of confidence intervals, and the power and type I error rates of hypothesis tests. Evaluations took place in both simulated and real data from a …


Bayesian Multivariate Joint Modeling For Skewed-Longitudinal And Time-To-Event Data, Lan Xu Jun 2021

USF Tampa Graduate Theses and Dissertations

In epidemiologic and clinical studies, a relatively large number of biomarkers are repeatedly measured in patients over time, often together with data on events of epidemiologic and clinical interest. Much attention is therefore focused on characterizing the specific patterns of the longitudinal measurements, and the associations between those patterns and the time to a certain event, such as heart attack, diagnosis of disease, transplantation, or death. In the last two decades, research into joint modeling of longitudinal and time-to-event data has received a tremendous amount of attention.

Numerous researchers have proposed joint modeling approaches for a single longitudinal …


Exploration Of Factors Associated With Perceptions Of Community Safety Among Youth In Hillsborough County, Florida: A Convergent Parallel Mixed-Methods Approach, Yingwei Yang Feb 2020

USF Tampa Graduate Theses and Dissertations

Introduction: Youth perceived safety is not only linked to crime and violence in a neighborhood but is also associated with health risk behaviors and certain neighborhood characteristics. The purpose of this mixed-methods study was to measure the co-occurring effects of individual and community risk factors by conducting a secondary data analysis using structural equation modeling (SEM) and to explore reasons for youth feeling safe/unsafe in their community using photovoice methodology.

Methods: Syndemic theory/model served as the theoretical framework to guide this mixed-methods study with a convergent parallel design. The quantitative strand (first manuscript) utilized an existing dataset collected from middle …


Probabilistic Modeling Of Democracy, Corruption, Hemophilia A And Prediabetes Data, A. K. M. Raquibul Bashar Sep 2019

USF Tampa Graduate Theses and Dissertations

Parametric analysis of any real-world data is the most powerful tool to characterize the probabilistic behavior in social, economic, medical, epidemiological, and other areas of study. In the present study, we identify the theoretical Probability Distribution Function (PDF) for Democracy Index Scores (DIS) from the Economist Intelligence Unit (EIU) database and obtain the maximum likelihood estimates for the theoretical PDFs. We also identify the individual PDFs for each of the clusters defined by the EIU: Full Democracy, Flawed Democracy, Hybrid Regime, and Authoritarian Regime.

A statistical model is a convenient instrument to predict the future value of any …


Flowgraph Models For Clustered Multistate Time To Event Data, Kristin Hall Nov 2018

USF Tampa Graduate Theses and Dissertations

Healthcare systems have multistate processes. Such processes may be modeled using flowgraphs, which are directed graphs. Flowgraph models support a variety of transition time distributions, easily handle reversibility between states and allow alternate paths to the event or state of interest to be taken. However, estimation of flowgraph and first passage time distribution parameters can lead to incorrect inferences when interdependent data are treated as independent.

In this dissertation, we expand the flowgraph model to accommodate nested and correlated data structures. We develop a framework to incorporate random effects into transition probability and transition time components of a flowgraph model. …


Angiostrongylus Cantonensis: Epidemiologic Review, Location-Specific Habitat Modelling, And Surveillance In Hillsborough County, Florida, U.S.A., Brad Christian Perich Mar 2018

USF Tampa Graduate Theses and Dissertations

Angiostrongylus cantonensis is a parasitic nematode endemic to tropical and subtropical regions and is the leading cause of human eosinophilic meningitis. The parasite is commonly known as rat lungworm because the primary host in its lifecycle is the rat. A clinical overview of rat lungworm infection is presented, followed by a literature review of rat lungworm epidemiology, risk factors, and surveillance projects. Data collected from previous snail surveys in Florida was considered alongside elevation, population per square kilometer, median household income by zip code territory, and normalized difference vegetation index specific to the geographic coordinates from which the snail samples …


Strategies To Adjust For Response Bias In Clinical Trials: A Simulation Study, Victoria R. Swaidan Feb 2018

USF Tampa Graduate Theses and Dissertations

Background: Response bias can distort treatment effect estimates and inferences in clinical trials. Although methods for prevention, quantification, and adjustment have been developed, current methods are not applicable when subject-level reliability is used as the measure of response bias. Thus, the objective of the current study is to develop, test, and recommend a series of bias correction strategies for use in these cases. Methods: Monte Carlo simulation and logistic regression modeling were used to develop the strategies, examining the collective impact of sample size (N), effect size (ES), reliability distribution, and response style on estimating the treatment effect size in a series …


Bayesian Inference On Quantile Regression-Based Mixed-Effects Joint Models For Longitudinal-Survival Data From AIDS Studies, Hanze Zhang Nov 2017

USF Tampa Graduate Theses and Dissertations

In HIV/AIDS studies, viral load (the number of copies of HIV-1 RNA) and CD4 cell counts are important biomarkers of the severity of viral infection, disease progression, and treatment evaluation. Recently, joint models, which can reduce bias and improve the efficiency of estimates, have been developed to assess the longitudinal process, the survival process, and the relationship between them simultaneously. However, the majority of joint models are based on mean regression, which concentrates only on the mean effect of the outcome variable conditional on certain covariates. In fact, in HIV/AIDS research, the mean effect may not always be of …


Efficiency Of An Unbalanced Design In Collecting Time To Event Data With Interval Censoring, Peiyao Cheng Nov 2016

USF Tampa Graduate Theses and Dissertations

In longitudinal studies, the exact timing of an event often cannot be observed, and is usually detected at a subsequent visit, which is called interval censoring. Spacing of the visits is important when designing a study with interval-censored data. In a typical longitudinal study, the spacing of visits is usually the same across all subjects (balanced design). In this dissertation, I propose an unbalanced design: subjects at baseline are divided into a high risk group and a low risk group based on a risk factor, and the subjects in the high risk group are followed more frequently than those in …


Hidden Markov Chain Analysis: Impact Of Misclassification On Effect Of Covariates In Disease Progression And Regression, Haritha Polisetti Nov 2016

USF Tampa Graduate Theses and Dissertations

Most chronic diseases have a well-known natural staging system through which disease progression is interpreted. It is well established that the transition rates from one stage of disease to another can be modeled by multi-state Markov models. However, it is also well known that the screening systems used to diagnose disease states may sometimes be subject to error. In this study, a simulation study is conducted to illustrate the importance of accounting for misclassification in multi-state Markov models by evaluating and comparing the estimates for the disease progression Markov model with misclassification as opposed to disease …


Modeling And Survival Analysis Of Breast Cancer: A Statistical, Artificial Neural Network, And Decision Tree Approach, Venkateswara Rao Mudunuru Mar 2016

USF Tampa Graduate Theses and Dissertations

Survival analysis today is widely implemented in the fields of medical and biological sciences, social sciences, econometrics, and engineering. Survival analysis is a statistical approach designed to take into account the amount of time utilized for a study period, or the study of time between entry into observation and a subsequent event. Here the event of interest is death, and the analysis consists of following the subject until death. Events or outcomes are defined by a transition from one discrete state to another at an instantaneous moment in time. In recent years, research …


Bayesian Inference On Longitudinal Semi-Continuous Substance Abuse/Dependence Symptoms Data, Dongyuan Xing Sep 2015

USF Tampa Graduate Theses and Dissertations

Substance use data such as alcohol drinking often contain a high proportion of zeros. In studies examining alcohol consumption among college students, for instance, many students may not drink in the studied period, resulting in many zeros. Zero-inflated continuous data, also called semi-continuous data, typically consist of a mixture of a degenerate distribution at the origin (zero) and a right-skewed, continuous distribution for the positive values. Ignoring the extreme non-normality in semi-continuous data may lead to substantially biased estimates and inference. Longitudinal or repeated measures of semi-continuous data present special challenges in statistical inference because of …


Statistical Modeling And Prediction Of HIV/AIDS Prognosis: Bayesian Analyses Of Nonlinear Dynamic Mixtures, Xiaosun Lu Jul 2014

USF Tampa Graduate Theses and Dissertations

Statistical analyses and modeling have contributed greatly to our understanding of the pathogenesis of HIV-1 infection; they also provide guidance for the treatment of AIDS patients and evaluation of antiretroviral (ARV) therapies. Various statistical methods, nonlinear mixed-effects models in particular, have been applied to model the CD4 and viral load trajectories. A common assumption in these methods is that all patients come from a homogeneous population following one mean trajectory. This assumption unfortunately obscures important characteristic differences between subgroups of patients whose response to treatment and whose disease trajectories are biologically different. It also may lack the robustness against population heterogeneity …


Age Dependent Analysis And Modeling Of Prostate Cancer Data, Nana Osei Mensa Bonsu Jan 2013

USF Tampa Graduate Theses and Dissertations

The growth rate of prostate cancer tumors is an important aspect of understanding the natural history of prostate cancer. Using real prostate cancer data from the SEER database with tumor size as a response variable, we have clustered the cancerous tumor sizes into age groups to enhance its analytical behavior. The rate of change of the response variable as a function of age is given for each cluster. Residual analysis attests to the quality of the analytical model and the subject estimates. In addition, we have identified the probability distribution that characterizes the behavior of the response variable and proceeded with …


Multiple Calibrations In Integrative Data Analysis: A Simulation Study And Application To Multidimensional Family Therapy, Kristin Wynn Hall Jan 2013

USF Tampa Graduate Theses and Dissertations

A recent advancement in statistical methodology, Integrative Data Analysis (IDA; Curran & Hussong, 2009), has led researchers to employ a calibration technique so as not to violate an independence assumption. This technique uses a randomly selected subset, or calibration, of a whole data set, with a simplified correlational structure, in a preliminary stage of analysis. However, a single calibration estimator suffers from instability, low precision and loss of power. To overcome this limitation, a multiple calibration (MC; Greenbaum et al., 2013; Wang et al., 2013) approach has been developed to produce better estimators, while still removing a level of dependency in the data …


Uncontrolled Hypertension And Associated Factors In Hypertensive Patients At The Primary Healthcare Center Luis H. Moreno, Panama: A Feasibility Study, Roderick Ramon Chen Camano Jan 2013

USF Tampa Graduate Theses and Dissertations

Background: According to the World Health Organization (WHO), hypertension is a major risk factor for cardiovascular disease (CVD), renal impairment, peripheral vascular disease, and blindness. In Panama, a recent study estimated the prevalence of hypertension at 38.5% in the two main provinces of the country, with a rate of uncontrolled hypertension of 47.2%.

Objectives: The aims of this study were to assess the feasibility of the study design and to describe the characteristics of the hypertensive population, physicians' adherence to Panamanian antihypertensive protocols, and their relationship with uncontrolled hypertension.

Methods: This is a cross-sectional study of adult hypertensive …


A Latent Mixture Approach To Modeling Zero-Inflated Bivariate Ordinal Data, Rajendra Kadel Jan 2013

USF Tampa Graduate Theses and Dissertations

Multivariate ordinal response data, such as severity of pain, degree of disability, and satisfaction with a healthcare provider, are prevalent in many areas of research including public health, biomedical, and social science research. Ignoring the multivariate features of the response variables, that is, by not taking the correlation between the errors across models into account, may lead to substantially biased estimates and inference. In addition, such multivariate ordinal outcomes frequently exhibit a high percentage of zeros (zero inflation) at the lower end of the ordinal scales, as compared to what is expected under a multivariate ordinal distribution. Thus, zero inflation …


A Monte Carlo Approach To Change Point Detection In A Liver Transplant, Alexia Melissa Makris Jan 2013

USF Tampa Graduate Theses and Dissertations

Patient survival post liver transplant (LT) is important to both the patient and the center's accreditation, but over the years physicians have noticed that distant patients struggle with post LT care. I hypothesized that patients' distance from the transplant center had a detrimental effect on post LT survival. I suspected Hepatitis C (HCV) and Hepatocellular Carcinoma (HCC) patients would deteriorate due to their recurrent disease and their need for close monitoring post LT. From the current literature it was not clear whether patients' distance from a transplant center affects outcomes post LT. Firozvi et al. (Firozvi AA, 2008) …


Evaluation Of Repeated Biomarkers: Non-Parametric Comparison Of Areas Under The Receiver Operating Curve Between Correlated Groups Using An Optimal Weighting Scheme, Ping Xu Jan 2012

USF Tampa Graduate Theses and Dissertations

Receiver Operating Characteristic (ROC) curves are often used to evaluate the prognostic performance of a continuous biomarker. In previous research, a non-parametric ROC approach was introduced to compare two biomarkers with repeated measurements. An asymptotically normal statistic, which contains the subject-specific weights, was developed to estimate the areas under the ROC curve of biomarkers. Although the previous study suggested two weighting schemes as optimal when the within-subject correlation is 1 or 0, the universal optimal weight was not determined. We modify this asymptotic statistic to compare AUCs between two correlated groups and propose a solution …


Linear Mixed-Effects Models: Applications To The Behavioral Sciences And Adolescent Community Health, Lizmarie Gabriela Maldonado Jan 2012

USF Tampa Graduate Theses and Dissertations

Linear mixed-effects (LME) modeling is a widely used statistical method for analyzing repeated measures or longitudinal data. Such longitudinal studies typically aim to investigate and describe the trajectory of a desired outcome. Longitudinal data have an advantage over cross-sectional data in that they provide more accuracy for the model. LME models allow researchers to account for random variation within individuals and between individuals.

In this project, adolescent health was chosen as a topic of research due to the many changes that occur during this crucial time period as a precursor to overall well-being in adult life. Understanding the factors that influence how …


Bayesian Inference On Mixed-Effects Models With Skewed Distributions For HIV Longitudinal Data, Ren Chen Jan 2012

USF Tampa Graduate Theses and Dissertations

Statistical models have greatly improved our understanding of the pathogenesis of HIV-1 infection and guided the treatment of AIDS patients and evaluation of antiretroviral (ARV) therapies. Although various statistical modeling and analysis methods have been applied for estimating the parameters of HIV dynamics via mixed-effects models, a common assumption is that random errors and random effects are normally distributed. This assumption may lack robustness against departures from normality and so may lead to misleading or biased inference. Moreover, some covariates such as CD4 cell count may often be measured with substantial errors. Bivariate clustered (correlated) data are also commonly encountered in …


Statistical Estimation Of Physiologically-Based Pharmacokinetic Models: Identifiability, Variation, And Uncertainty With An Illustration Of Chronic Exposure To Dioxin And Dioxin-Like-Compounds., Zachary John Thompson Jan 2012

USF Tampa Graduate Theses and Dissertations

Assessment of human exposure to environmental chemicals is inherently subject to uncertainty and variability. There are data gaps concerning the inventory, source, duration, and intensity of exposure as well as knowledge gaps regarding pharmacokinetics in general. These gaps result in uncertainties in exposure assessment. The uncertainties compound further with variabilities due to population variations regarding stage of life, lifestyle, susceptibility, etc. Use of physiologically-based pharmacokinetic (PBPK) models promises to reduce the uncertainties and enhance extrapolation between species, between routes, from high to low dose, and from acute to chronic exposure. However, fitting PBPK models is challenging because of …


Gender Differences In Lung Cancer Treatment And Survival, Margaret Anne Kowski Jan 2011

USF Tampa Graduate Theses and Dissertations

The objectives of this research were to test treatment and survival differences between women and men with lung cancer as there is minimal investigation in the literature. Three research questions were developed with statistical testing for gender differences based on similar cancer type, stage, treatment assignment and survival. Data for 44,863 primary lung cancer cases were collected from eight U.S. state-based cancer registries to investigate the research questions. The lung cancer incidence data included the morphological cell-types of adenocarcinoma (AC); squamous cell carcinoma (SCC); large cell carcinoma (LCC) and small cell carcinoma (SCC). Stage, grade, treatment type, as well as, …


A Novel Device For Cell-Cell Electrofusion, Justin T. Stewart Jan 2011

USF Tampa Graduate Theses and Dissertations

Cell transplantation therapy is a potentially powerful tool and can be used to replace defective cells with healthy cells. This offers the possibility of alleviating the destructive symptoms of many diseases such as Parkinson's disease, Alzheimer's disease, stroke, spinal cord trauma, Type I diabetes and many more. While there are many diseases that could be positively impacted by cell transplantation therapy, the focus of this research is insulin-dependent Type I diabetes.

The Islets of Langerhans are composed of various types of cells located in the pancreas and are responsible for a variety of biochemical functions. Specifically, the beta Islet …


Statistical Models For Environmental And Health Sciences, Yong Xu Jan 2011

USF Tampa Graduate Theses and Dissertations

Statistical analysis and modeling are useful for understanding the behavior of different phenomena. In this study we will focus on two areas of application: global warming and cancer research. Global warming is one of the major environmental challenges people face nowadays, and cancer is one of the major health problems that people need to solve.

For global warming, we are interested in two major contributing variables: carbon dioxide (CO2) and atmospheric temperature. We will model atmospheric carbon dioxide data with a system of differential equations. We will develop a differential equation for each of six …