Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

University of Louisville

Electronic Theses and Dissertations

Articles 1 - 24 of 24

Full-Text Articles in Bioinformatics

Novel Insights Into Oligodendrocyte Biology From Developmental Myelination Studies In Autophagy Deficient Mice And Analysis Of Oligodendrocyte Translatome Response To Contusive Spinal Cord Injury., Michael David Forston Aug 2023

Loss of myelin causes severe neurological disorders and functional deficits in white matter injuries (WMI) such as traumatic spinal cord injury (SCI). This dissertation is focused on autophagy in oligodendrocyte (OL) development and the OL translatome after SCI. Chapter I describes the history of myelin, OL development, and their involvement in neurodegenerative diseases and SCI. The proteostasis network, in particular autophagy, and its contributions to white matter pathology are discussed. It concludes by examining advantages and disadvantages of unbiased omics tools, like RiboTag, to study transcriptional/translational landscapes after SCI. Chapter II focuses on autophagy in OPC/OL differentiation, survival, and proper myelination in …


Clustering And Analysis Of G Quadruplex Sequences., Aryan Neupane May 2023

G quadruplex structures are secondary structures located throughout the genomes of various organisms, with involvement in regulatory functions including transcription, translation, genome stability, epigenetic regulation, and cell division. Although G4 structures are widely acknowledged in vivo, there are currently no search tools for G quadruplexes based on already identified G quadruplexes and identified families across different genomes based on sequence diversity. Construction of families of G4 sequences and identification of their polymorphisms within diseases and disorders will lead to a better understanding of their functional roles and will further research into the biophysical modeling of interactions …


Discovering The Pathways And GO Terms Associated With Mettl3 Modified Circular RNAs In The Embryonic Cerebral Cortex Of Mice., Dunia Zedan May 2022

Circular RNAs (cirRNAs) are a class of RNA molecules that result from alternative back-splicing events that join the 3’ and 5’ ends normally present in linear RNA molecules. It has been reported that cirRNAs can function as gene regulators and as “microRNA sponges” to negatively control the functions of microRNAs. While many studies have been conducted to understand the regulatory roles of Mettl3 in linear messenger RNAs, fewer have sought to understand the impact of Mettl3 modified cirRNAs on gene expression and on the regulation of different KEGG biological pathways and GO terms. This thesis was conducted …


Genomic Tools And Models For Investigating The Role Of Germline Diversity In Mouse Antibody Repertoire Development., Justin T. Kos May 2022

Given the diversity and complexity within immunoglobulin (IG) loci, effective mouse models first require characterization of intra-strain differences and construction of high-quality reference assemblies for IG loci in several representative strains. To understand light chain germline diversity across biomedically significant mouse strains, we profiled the expressed IGK and IGL repertoires of 18 commonly used laboratory mouse strains using AIRR-seq. Across strains, we observed germline IGKV sequences shared by three different IGK haplotypes and a more conserved IGLV germline repertoire among common laboratory strains. Pacific Biosciences (PacBio) Single-Molecule Real-Time (SMRT) sequencing was used to sequence and assemble bacterial artificial chromosomes (BAC) …


Computational And Biochemical Characterizations Of Anhydrobiosis-Related Intrinsically Disordered Proteins., Brett R. Janis Dec 2021

Anhydrobiosis is the remarkable phenomenon of “life without water”. It is a common technique found in plant seeds, and a rare technique utilized by some animals to temporarily stop the clock of life and enter a stasis for up to several millennia by removing all of their cellular water. If this phenomenon can be replicated, then biological and medical materials could be stored at ambient temperatures for centuries, which would address research challenges as well as enhance the availability of medicine in areas of the world where refrigeration, freezing, and cold-chain infrastructure are not developed or infeasible. Furthermore, modifying crop …


A Method For Identifying Ancient Introgression Between Caballine And Non-Caballine Equids Using Whole Genome High Throughput Data., Kalpani De Silva Dec 2021

Introgression is one of the main mechanisms that transfer adapted alleles between species. The advantageous variants are positively selected and retained in the recipient population, while the rest of the variants undergo negative selection. When analyzing the horse genome, two alleles were found in the CXCL16 gene, one associated with susceptibility and one with resistance to developing persistent shedding of the Equine Arteritis Virus. The two alleles differ by 4 non-synonymous variants in exon 1 of the gene. Comparison with 3 non-caballine equids (zebras, asses and hemiones) revealed that one haplotype was almost identical to the haplotype found in non-caballines while the …


Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies., Samarendra Das Dec 2020

Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as Microarrays, bulk RNA-Sequencing, single cell RNA-Sequencing, etc. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Further, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …


Modified-Half-Normal Distribution And Different Methods To Estimate Average Treatment Effect., Jingchao Sun Dec 2020

This dissertation consists of three projects related to Modified-Half-Normal distribution and causal inference. In my first project, a new distribution called Modified-Half-Normal distribution was introduced. I explored a few of its distributional properties, the procedures for generating random samples based on Bayesian approaches, and the parameter estimation based on the method of moments. The second project deals with the problem of selection bias of average treatment effect (ATE) if we use the observational data. I combined the propensity score based inverse probability of treatment weighting (IPTW) method and the directed acyclic graph (DAG) to solve this problem. The third project …


The Identification Of Long Non-Coding RNA ZFAS1 Through An Exploratory RNA-Sequencing Analysis And Its Association With Epithelial-To-Mesenchymal Transition In Colon Cancer Adenocarcinoma., Stephen J. O'Brien Dec 2019

Colorectal adenocarcinoma is the fourth most common cancer diagnosed worldwide and is a significant cause of morbidity and mortality. This dissertation performed an exploratory RNA-sequencing analysis comparing gene expression between colon adenocarcinoma tissue and paired normal colon epithelium. After identification of a number of lncRNAs that were increased in expression in colon adenocarcinoma compared to normal colon epithelium, we aimed to validate the expression and investigate their function in vitro. Specifically, we focused on the lncRNA ZFAS1 and its association with epithelial-to-mesenchymal transition. These studies found the following: 1. Seven candidate lncRNAs were identified from the exploratory RNA-sequencing analysis to …


Designing And Sample Size Calculation In Presence Of Heterogeneity In Biological Studies Involving High-Throughput Data., Sudhir Srivastava Aug 2019

Study design and determination of sample size are important for conducting high-throughput biological experiments such as proteomics experiments and RNA-Seq expression studies, thus leading to better understanding of complex mechanisms underlying various biological processes. The variations in the biological data or technical approaches to data collection lead to heterogeneity for the samples under study. We critically examined the issues of technical and biological heterogeneity. The quantitative measurements based on liquid chromatography (LC) coupled with mass spectrometry (MS) often suffer from the problem of missing values (MVs) and data heterogeneity. We considered a proteomics data set generated from human kidney …


Functional Consequence Of PSAT1 Association On PKM2's Inherent Enzymatic Activity., Alexis Avidan Vega Dec 2018

Pyruvate kinase M2 (PKM2) is predominantly found in tumors, where it allows the cancer cell to adapt to metabolic conditions through allosteric regulation of its activity. We recently discovered that phosphoserine aminotransferase 1 (PSAT1) associates with and activates PKM2. Here, I sought to affirm PSAT1's ability to increase PKM2 activity through kinetic and association studies of wild-type or mutant PKM2 enzymes. I demonstrate that His-tagged WT and mutant PKM2 enzymes are active, exhibit different kinetics, yet cannot be activated by PSAT1. Comparison studies using untagged WT PKM2 suggest that inclusion of the His-tag disrupts PSAT1 association. In support, pull-down strategies …


Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor Aug 2018

Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery of biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve as a basis for making higher order inferences in metabolomics, …


Deciphering The Role Of Human Arylamine N-Acetyltransferase 1 (NAT1) In Breast Cancer Cell Metabolism Using A Systems Biology Approach., Samantha Marie Carlisle Aug 2018

Background: Human arylamine N-acetyltransferase 1 (NAT1) is a phase II xenobiotic metabolizing enzyme found in almost all tissues. NAT1 can additionally hydrolyze acetyl-coenzyme A (acetyl-CoA) in the absence of an arylamine substrate. NAT1 expression varies inter-individually and is elevated in several cancers including estrogen receptor positive (ER+) breast cancers. Additionally, multiple studies have shown the knockdown of NAT1, by both small molecule inhibition and siRNA methods, in breast cancer cells leads to decreased invasive ability and proliferation and decreased anchorage-independent colony formation. However, the exact mechanism by which NAT1 expression affects cancer risk and progression remains unclear. Additionally, consequences …


Region Based Gene Expression Via Reanalysis Of Publicly Available Microarray Data Sets., Ernur Saka May 2018

A DNA microarray is a high-throughput technology used to identify relative gene expression. One of the most widely used platforms is the Affymetrix® GeneChip® technology which detects gene expression levels based on probe sets composed of a set of twenty-five nucleotide probes designed to hybridize with specific gene targets. Given a particular Affymetrix® GeneChip® platform, the design of the probes is fixed. However, the method of analysis is dynamic in nature due to the ability to annotate and group probes into uniquely defined groupings. This is particularly important since publicly available repositories of microarray datasets, such as ArrayExpress and NCBI’s …


Functional Data Analysis Methods For Predicting Disease Status., Sarah Kendrick Dec 2017

Introduction: Differential scanning calorimetry (DSC) is used to determine thermally-induced conformational changes of biomolecules within a blood plasma sample. Recent research has indicated that DSC curves (or thermograms) may have different characteristics based on disease status and, thus, may be useful as a monitoring and diagnostic tool for some diseases. Since thermograms are curves measured over a range of temperature values, they are often considered as functional data. In this dissertation we propose and apply functional data analysis (FDA) techniques to analyze DSC data from the Lupus Family Registry and Repository (LFRR). The aim is to develop FDA methods to …


Algorithms For Automated Assignment Of Solution-State And Solid-State Protein NMR Spectra., Andrey Smelter Aug 2017

Protein nuclear magnetic resonance spectroscopy (Protein NMR) is an invaluable analytical technique for studying protein structure, function, and dynamics. There are two major types of NMR spectroscopy that are used for investigation of protein structure: solution-state and solid-state NMR. Solution-based NMR spectroscopy is typically applied to proteins of small and medium size that are soluble in water. Solid-state NMR spectroscopy is amenable to proteins that are insoluble in water. In the vast majority of NMR-based protein studies, the first step after experiment optimization is the assignment of protein resonances via the association of chemical shift values to specific atoms in …


Novel Statistical Approaches For Missing Values In Truncated High-Dimensional Metabolomics Data With A Detection Threshold., Jasmit Sureshkumar Shah May 2017

Despite considerable advances in high throughput technology over the last decade, new challenges have emerged related to the analysis, interpretation, and integration of high-dimensional data. The arrival of omics datasets has contributed to the rapid improvement of systems biology, which seeks the understanding of complex biological systems. Metabolomics is an emerging omics field, where mass spectrometry technologies generate high dimensional datasets. As this area advances, better analysis methods are required to provide correct and adequate results. While in other omics sectors such as genomics or proteomics there has been and continues to be critical understanding …


Structure-Function Analysis And Characterization Of Metalloproteins., Sen Yao Aug 2016

Metalloproteins are proteins that can bind at least one metal ion as a cofactor. They utilize metal ions for a variety of biological purposes, and are essential for all domains of life. Because metalloproteins are involved in biological processes across all domains of life, how proteins coordinate metal ions for different biochemical functions is of great relevance to understanding the implementation of these biological processes. One of the most important aspects of metal binding is its coordination geometry (CG), which often implies functional activities. Most of the current studies are based on the assumption of previously reported …


Integrated Analysis Of miRNA/mRNA Expression And Gene Methylation Using Sparse Canonical Correlation Analysis., Dake Yang May 2016

MicroRNAs (miRNAs) are a large class of small endogenous non-coding RNA molecules (18-25 nucleotides in length) which regulate expression of genes post-transcriptionally. While a variety of algorithms exist for determining the targets of miRNAs, they are generally based on sequence information and frequently produce lists consisting of thousands of genes. Canonical correlation analysis (CCA) is a multivariate statistical method that can be used to find linear relationships between two data sets, and here we apply CCA to find the linear combination of differentially expressed miRNAs and their corresponding target genes having maximal negative correlation. Due to the high dimensionality, sparse …


Expression Of Genes For Peptide/Protein Hormones And Their Cognate Receptors In Breast Carcinomas As Biomarkers Predicting Risk Of Recurrence., Michael Wesley Daniels May 2016

Certain hormones and/or receptors influencing normal cellular pathways were detected in breast cancers. The hypothesis is that gene subsets predict risk of breast carcinoma recurrence in patients with primary disease. Gene expression of 55 hormones and 73 receptors was determined by microarray with LCM-procured carcinoma cells of 247 de-identified biopsies. Univariate and multivariate Cox regressions were determined using expression levels of each hormone/receptor gene, individually or as a pair. Significant genes derived for each subset were analyzed to predict risk of cancer recurrence with 1000 LASSO training/test sets. A 14-gene molecular signature was identified for predicting clinical outcome without regard …


optCluster : An R Package For Determining The Optimal Clustering Algorithm And Optimal Number Of Clusters., Michael N. Sekula May 2015

Determining the best clustering algorithm and ideal number of clusters for a particular dataset is a fundamental difficulty in unsupervised clustering analysis. In biological research, data generated from Next Generation Sequencing technology and microarray gene expression data are becoming more and more common, so new tools and resources are needed to group such high dimensional data using clustering analysis. Different clustering algorithms can group data very differently. Therefore, there is a need to determine the best groupings in a given dataset using the most suitable clustering algorithm for that data. This paper presents the R package optCluster as an efficient …


Summary Of Survival Analysis With SAS Procedures., Derek Duane Childers 1990- May 2015

The research conducted for this thesis was performed to summarize some of the most commonly used survival analysis techniques as well as to create one macro that will provide the solutions for these techniques. Some of the techniques that this thesis focuses on are survival and hazard functions, mean and median survival times, life table, log rank test, proportional hazards/model building, and competing risk. To further analyze these survival analysis techniques, I will use the Bone Marrow Transplantation for Leukemia dataset. This trial consists of patients with either acute myelocytic leukemia (AML, 99 patients) or acute lymphoblastic leukemia (ALL, 38 patients). There …


Statistical Methods For Assessing Treatment Effects For Observational Studies., Kristopher C. Gardner 1984- May 2014

Though randomized clinical trials (RCTs) are the gold standard for comparing treatments, they are often infeasible or exclude clinically important subjects, or generally represent an idealized medical setting rather than real practice. Observational data provide an opportunity to study practice-based evidence, but also present challenges for analysis. Traditional statistical methods which are suitable for RCTs may be inadequate for observational studies. In this project, four of the most popular statistical methods for observational studies (ANCOVA, propensity score matching, regression with the propensity score as a covariate, and instrumental variables (IV)) are investigated through application to MarketScan insurance claims data. …


Compound Identification Using Penalized Linear Regression., Ruiqi Liu May 2013

In this study, we propose a new method for compound identification using penalized linear regression. Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. In the context of linear regression, the response variable is an experimental mass spectrum (i.e., query) and all the compounds in the reference library are the independent variables. However, the number of compounds in the reference library is much larger than the range of m/z values, so the data become high dimensional and suffer from singularity. For …