Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

Articles 1 - 12 of 12

Full-Text Articles in Biostatistics

A Network-Based Approach For Computational Drug Repurposing On Cancer Data, Ann Reba, Thomas Alexander Oct 2021

Electronic Theses and Dissertations

In this thesis, we are interested in finding the best FDA-approved drugs that can be repurposed for a disease and in identifying the adverse effects of such drugs. Developing an effective drug can be a time-consuming and expensive process. Network-based machine learning methods are used to predict whether a drug developed for disease A can be used for disease B. Drug repurposing aims at finding new indications for already existing drugs and therefore increases the available therapeutic choices at a fraction of the cost of new drug development. The perturbation gene expression data corresponding to the MCF7 cell line was …


The Classification Of Basket Neural Cells In The Mammalian Neocortex, Sreya Pudi Oct 2021

Senior Theses

Basket neuronal cells of the mammalian neocortex have been classically categorized into two or more groups. Originally, it was thought that the large and small types are the naturally occurring groups, emerging for reasons related to neurobiological function and anatomical position. Later, a study based on anatomical and physiological features of these neurons introduced a third type, the net basket cell, which is intermediate in size compared to the large and small types. In this study, multivariate analysis was used to test the hypothesis that the large and small types are morphologically distinct groups. The results of …


The Hybridizing Ions Treatment (HIT) Method Development And Computational Study On SARS-CoV-2 E Protein, Shengjie Sun May 2021

Open Access Theses & Dissertations

Fast and accurate calculations of the electrostatic features of highly charged biomolecules, such as DNA, RNA, and highly charged proteins, are crucial but challenging tasks. Traditional implicit solvent methods calculate the electrostatic features fast, but they are not able to balance the high net charges in the biomolecules effectively. Explicit solvent methods add unbalanced ions to neutralize the highly charged biomolecules in molecular dynamics simulations, which require more expensive computing resources. Here we developed a novel method, the Hybridizing Ions Treatment (HIT) method, which hybridizes the implicit solvent method with the explicit method to realistically calculate the electrostatic potential for highly …


Gene Selection And Classification In High-Throughput Biological Data With Integrated Machine Learning Algorithms And Bioinformatics Approaches, Abhijeet R Patil May 2021

Open Access Theses & Dissertations

With the rise of high-throughput technologies in biomedical research, large volumes of expression profiling, methylation profiling, and RNA-sequencing data are being generated. These high-dimensional data have a large number of features with a small number of samples, a characteristic called the "curse of dimensionality." The selection of optimal features, which largely affects the performance of classification algorithms in machine learning models, has led to challenging problems in bioinformatics analyses of such high-dimensional datasets. In this work, I focus on the design of two-stage frameworks of feature selection and classification and their applications in multiple sets of colorectal cancer data. The first …
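
As a rough illustration of what a two-stage "select features, then classify" pipeline can look like, the sketch below may help. It is not the method from this dissertation, whose details are truncated above. The scoring rule (difference of class means divided by the summed spreads), the nearest-centroid classifier, and all function names are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean, pstdev

def select_features(X, y, k):
    """Stage 1 (assumed rule): score each feature by two-class separation, keep the top k."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b) + 1e-12  # small constant avoids division by zero
        scores.append((abs(mean(a) - mean(b)) / spread, j))
    top = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in top)

def fit_centroids(X, y, features):
    """Stage 2 (assumed classifier): per-class mean vector over the selected features."""
    return {
        label: [mean(row[j] for row, l in zip(X, y) if l == label) for j in features]
        for label in set(y)
    }

def predict(row, centroids, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((row[j] - c) ** 2 for j, c in zip(features, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Toy data (made up): 4 samples, 3 features; only feature 0 separates the classes.
X = [[1.0, 5.0, 0.1], [1.2, 5.1, 0.9], [3.0, 5.0, 0.2], [3.1, 4.9, 0.8]]
y = [0, 0, 1, 1]
features = select_features(X, y, k=1)
centroids = fit_centroids(X, y, features)
```

In practice the selection stage would need to be repeated inside each cross-validation fold; selecting features on the full dataset before evaluating the classifier tends to give optimistic accuracy estimates when samples are few.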


Mixture Model Approaches To Integrative Analysis Of Multi-Omics Data And Spatially Correlated Genomic Data, Ziqiao Wang May 2021

Dissertations & Theses (Open Access)

Integrative genomic data analysis is a powerful tool to study the complex biological processes behind a disease. Statistical methods can model the interrelationships of the involved gene activities through jointly analyzing multiple types of genomic data from different platforms (vertical integration), or improve the power of a study through aggregating the same type of genomic data across studies (horizontal integration). In this dissertation, we propose statistical methods and strategies for integrative multi-omics data in association analysis of disease phenotypes, with an emphasis on cancer applications.

We develop a new strategy based on horizontal integration by leveraging publicly available datasets into …


Impact Of Case Management On Childhood Lead Exposure In Marion County, Indiana, Maliki Yacouba Jan 2021

Walden Dissertations and Doctoral Studies

The Centers for Disease Control and Prevention recently declared that no childhood blood lead level (BLL) is safe. The purpose of this quantitative study with a retrospective cohort design was to evaluate the effectiveness of case management intervention on children diagnosed with elevated BLL (EBLL; ≥ 5 μg/dL) in Marion County, Indiana. The health belief model was used as the theoretical foundation for the study. A data set of 160 lead exposure case management records was analyzed to find whether: (a) BLL at post-case-management time significantly differs from BLL at baseline, (b) BLL at post-case-management time is affected …


The Causes And Control Measures Of Extended Spectrum Beta-Lactamase Producing Enterobacteriaceae In Long-Term Care Facilities, Ismaila Olatunji Sule Jan 2021

Walden Dissertations and Doctoral Studies

Infections due to extended-spectrum beta-lactamase-producing Enterobacteriaceae (ESBL-PE) are increasing among residents of long-term care facilities (LTCFs), resulting in high rates of morbidity and high healthcare costs. ESBL-PE resists empirical antibiotics and reduces treatment options, and a designated infection control team is unavailable to curb the prevalence of the disease. Ecological theory guided this study. A systematic review and meta-analysis were conducted to characterize the causes of ESBL-PE and evaluate the infection control strategies within LTCFs. Multiple regression analysis (MRA) was included as a supplementary statistical analysis to identify relationships between LTCFs, geographical locations, infection control measures (ICMs), and ESBL-PE. A systematic search …


Maternal Proximity To Mountaintop Removal Mining And Birth Defects In Appalachian Kentucky, 1997-2003, Daniel B. Cooper Jan 2021

Theses and Dissertations--Public Health (M.P.H. & Dr.P.H.)

Background: Extraction of coal through mountaintop removal mining (MTR) alters many dimensions of the landscape, and explosive blasts, exposed rock, and coal washing have the potential to pollute air and water with substances known to increase the risk of developmental and birth anomalies. Previous research suggests that infants born to mothers living in MTR coal mining counties have a higher prevalence of most types of birth defects.

Objectives: This study seeks to examine further the relationship between MTR activity and birth defects by employing individual-level exposure estimation through precise satellite data of MTR activity in the Appalachian region and maternal residence …


A Bayesian Hierarchical Mixture Model With Continuous-Time Markov Chains To Capture Bumblebee Foraging Behavior, Max Thrush Hukill Jan 2021

Honors Projects

The standard statistical methodology for analyzing complex case-control studies in ethology is often limited by approaches that force researchers to model distinct aspects of biological processes in a piecemeal, disjointed fashion. By developing a hierarchical Bayesian model, this work demonstrates that statistical inference in this context can be done using a single coherent framework. To do this, we construct a continuous-time Markov chain (CTMC) to model bumblebee foraging behavior. To connect the experimental design with the CTMC, we employ a mixture model controlled by a logistic regression on the two-factor design matrix. We then show how to infer these model …
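
For readers unfamiliar with continuous-time Markov chains, the following minimal sketch simulates one path of a generic CTMC: the chain waits an exponentially distributed time in each state, then jumps to a new state with probability proportional to the outgoing rates. This is not the model from the project above. The state names (`"flower"`, `"nest"`), the rate values, and the function name `simulate_ctmc` are hypothetical, chosen only for illustration.

```python
import random

def simulate_ctmc(rates, start, t_end, rng):
    """Simulate one CTMC path up to time t_end.

    rates[i][j] is the transition rate from state i to state j.
    Returns a list of (jump_time, state) pairs, starting with (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        out = {j: r for j, r in rates[state].items() if r > 0}
        total = sum(out.values())
        if total == 0:
            break  # absorbing state: no outgoing transitions
        t += rng.expovariate(total)  # exponential holding time with rate `total`
        if t >= t_end:
            break
        # Pick the next state with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for j, r in out.items():
            acc += r
            if u <= acc:
                state = j
                break
        path.append((t, state))
    return path

# Hypothetical two-state chain: a forager alternates between a flower and the nest.
rates = {"flower": {"nest": 1.0}, "nest": {"flower": 2.0}}
path = simulate_ctmc(rates, "flower", 10.0, random.Random(0))
```

Passing the random number generator in explicitly keeps runs reproducible, which is convenient when simulated paths are later compared against a fitted model.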


Construction And Analysis Of Genetic Regulatory Networks With RNA-Seq Data From Arabidopsis Thaliana, Tessa Kriz Jan 2021

Dissertations, Master's Theses and Master's Reports

Reconstruction of gene regulatory networks (GRNs) is a fundamental aspect of genetic engineering and provides a deeper understanding of the biological processes of an organism. Two methods were implemented to reconstruct the gene regulatory networks of Arabidopsis thaliana under two treatments: methyl jasmonate (MeJa) and salicylic acid (SA). The Joint Reconstruction of multiple Gene Regulatory Networks (JRmGRN) method was utilized to construct a joint network for identifying hub genes common to both conditions in addition to networks specific to each condition. The Differential Network Analysis with False Discovery Rate Control method constructed a network of connections unique to only one …


Statistical Methods In Genetic Studies, Cheng Gao Jan 2021

Dissertations, Master's Theses and Master's Reports

This dissertation includes three chapters. A brief description of each chapter follows.

In Chapter 1, we proposed a new method, called MF-TOWmuT, for genome-wide association studies with multiple genetic variants and multiple phenotypes using family samples. MF-TOWmuT uses a kinship matrix to account for sample relatedness. It is worth mentioning that in simulations, we considered hidden polygenic effects and varied the proportion of variance contributed by them to generate phenotypes. Simulation studies show that MF-TOWmuT can preserve the type I error rates and is more powerful than several existing methods in different simulation scenarios. MF-TOWmuT is also quite …


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and the analysis of complex protein samples. In bottom-up shotgun proteomics experiments, the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …