Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

Bioinformatics

Articles 1 - 30 of 76

Full-Text Articles in Computational Biology

Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone Nov 2023

Complex Biosystems PhD Program: Dissertations

Gene prediction has seen little algorithmic improvement since the first gene-finding algorithms were developed thirty years ago. Rather than iteratively improving the underlying algorithms in gene prediction tools by utilizing better performing models, most current approaches update existing tools by incorporating increasing amounts of extrinsic data to improve gene prediction performance. Genes are traditionally predicted using Hidden Markov Models (HMMs). These HMMs are constrained by strict assumptions about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) …


Predicting Marine Teleost Responses To Ocean Warming And Pollution, Akila Harishchandra Aug 2023

Electronic Theses and Dissertations

Ocean warming and pollution are two detrimental anthropogenic factors that have caused rapid marine ecosystem degradation in recent decades. These factors render the marine environment intolerable for many marine species, forcing them to either adapt or shift their contemporary habitat ranges to reduce the extinction risk associated with environmental degradation. Estimating marine species’ habitat range shifts and their potential for developing adaptive mechanisms is critical for ecosystem conservation and management, human health risk assessment, and climate change vulnerability assessments. Given that, for the first chapter of this thesis, we focused on developing a species distribution model (SDM) integrating marine species …


Exploration Of The Immune Landscape Of Ebv-Associated Gastric Cancers, Mikhail Salnikov Jun 2023

Electronic Thesis and Dissertation Repository

Epstein–Barr virus (EBV) is a gammaherpesvirus associated with 9% of all gastric cancers (GCs). EBV-associated GCs (EBVaGCs) are pathologically and clinically distinct entities from EBV-negative GCs (EBVnGCs), with EBVaGCs exhibiting differential molecular pathology and patient prognosis. The purpose of this thesis is to investigate the tumor microenvironment (TME) of EBVaGCs, which has not been explored in-depth. We hypothesize that EBVaGCs and EBVnGCs are also distinct in terms of the molecular immune landscape. We employed over 400 stomach adenocarcinoma (STAD) samples from The Cancer Genome Atlas (TCGA), as well as a single cell dataset, for the construction of a web suite …


Mining Sars-Cov-2 Phylogenetic Trees To Estimate Circulating Infections And Patterns Of Migration, Erin V. Brintnell Jun 2023

Electronic Thesis and Dissertation Repository

The SARS-CoV-2 pandemic led to the formation of very large databases of genomic viral data. These databases contain information on transmission dynamics, emergence and evolution of SARS-CoV-2. However, extracting this information from sequences is difficult, as most methods of analyzing viral genomes were developed for smaller data sets. Therefore, my objective was to develop new fast estimators of the number of infections (I) and the rate of migration based on simple features of SARS-CoV-2 phylogenies.

I simulated pathogen evolution using a susceptible-exposed-infectious-recovered (SEIR) model of pathogen spread, reconstructing evolution using CoVizu. For simulations of I, I varied the total number …


Integrating Omim And Intact Data For The Analysis Of Gene-Phenotype Interactions In Complex Diseases: A Linux-Based Computational Tool For Network Analysis, Devin Keane May 2023

All Theses

The field of genetics is constantly evolving. New advances in bioinformatics and computational approaches are leading to exciting new developments in our ability to treat and prevent diseases. Computational genetics provides valuable insights into the complex mechanisms and layers of biological communication that shape an organism's phenotype. Understanding these mechanisms is critical to advancing human health.

The study of diseases in genetics requires a comprehensive understanding of the interactions between various biological processes, including gene expression, protein synthesis, RNA, metabolism, and cell-cell communication. To effectively address the root causes of such diseases, multi-disciplinary approaches that integrate information from different levels …


The Genomics Of Autism-Related Genes Il1rapl1 And Il1rapl2: Insights Into Their Cortical Distribution, Cell-Type Specificity, And Developmental Trajectories, Jacob Weaver Apr 2023

MUSC Theses and Dissertations

Neuropsychiatric disorders have a significant impact on modern society. These disorders affect a large percentage of the population: schizophrenia has a worldwide prevalence of 1% and autism spectrum disorders (ASD) affect 1 in 59 school-aged children in the US. There is substantial evidence that most neuropsychiatric disorders have a genetic component. Thus, with the advent of high-throughput sequencing, much effort has gone into identifying genetic variants associated with these disorders. The emerging picture from these studies is a complex one where hundreds of genes with small effects interact with a varied landscape of common variants to result in disease. …


Towards More Complete Metagenomic Analyses Through Circularized Genomes And Conjugative Elements, Benjamin R. Joris Aug 2022

Electronic Thesis and Dissertation Repository

Advancements in sequencing technologies have revolutionized biological sciences and led to the emergence of a number of fields of research. One such field is metagenomics, the study of the genomic content of complex communities of bacteria. The goal of this thesis was to contribute computational methodology that can maximize the data generated in these studies and to apply these protocols to human and environmental metagenomic samples.

Standard metagenomic analyses include a step for binning of assembled contigs, which has previously been shown to exclude mobile genetic elements, and I demonstrated that this phenomenon extends to all conjugative …


Methods And Tools To Improve Performance Of Plant Genome Analysis, Drew Ferrell Aug 2022

Theses and Dissertations

Multi-omics data analysis and integration facilitates hypothesis building toward an understanding of genes and pathway responses driven by environments. Methods designed to estimate and analyze gene expression, with regard to treatments or conditions, can be leveraged to understand gene-level responses in the cell. However, genes often interact and signal within larger structures such as pathways and networks. Complex studies guided toward describing dynamic genetic pathways and networks require algorithms or methods designed for inference based on gene interactions and related topologies. Classes of algorithms and methods may be integrated into generalized workflows for comparative genomics studies, as multi-omics …


Modeling Electrostatics In Molecular Biology And Its Relevance With Molecular Mechanisms Of Diseases, Mahesh Koirala Aug 2022

All Dissertations

Electrostatics plays an essential role in molecular biology. Modeling electrostatics in molecular biology is complicated due to the water phase, mobile ions, and irregularly shaped inhomogeneous biological macromolecules. This dissertation presents the popular DelPhi package, which solves the Poisson-Boltzmann equation (PBE) and delivers the electrostatic potential distribution of biomolecules. We used the newly developed DelPhiForce steered Molecular Dynamics (DFMD) approach to model the binding of barstar to barnase and demonstrated that the first-principles method could also model the binding. This dissertation also reflects the use of existing computational approaches to model the effects of Single Amino Acid Variations (SAVs) to reveal molecular mechanisms …


In Silico Characterization Of Protein-Protein Interactions Mediated By Short Linear Motifs, Heidy Elkhaligy Jun 2022

FIU Electronic Theses and Dissertations

Short linear motifs (SLiMs), often found in intrinsically disordered regions (IDPs), can initiate protein-protein interactions in eukaryotes. Although pathogens tend to have less disorder than eukaryotes, their proteins alter host cellular function through molecular mimicry of SLiMs. The first objective was to study sequence-based structure properties of viral SLiMs in the ELM database and the conservation of selected viral motifs involved in the virus life cycle. The second objective was to compare the structural features of SLiMs in pathogens and eukaryotes in the ELM database. Our analysis showed that many viral SLiMs are not found in IDPs, particularly glycosylation motifs. …


Characterizing Endogenous Dicer Products To Unravel Novel Rnai Biogenesis Pathways, Jacob Oche Peter Jun 2022

Dissertations

RNA interference (RNAi) is a pervasive gene regulatory mechanism in eukaryotes based on the action of multiple classes of small RNA (sRNA). Exploiting RNAi pathways in non-model systems has great potential for creating potent RNAi technologies. Here, we assessed RNAi-mediated control of gene expression in the two-spotted spider mite, Tetranychus urticae (T. urticae), using engineered dsRNA designed to modulate the host RNAi pathway and increase RNAi efficacy. Analysis of Dicer (Dcr)-generated fragments revealed how exogenous RNAs access the host RNAi pathway in this animal, opening avenues for designing RNAi technology for their control. Further, some organisms …


Comparative Analyses Of De Novo Transcriptome Assembly Pipelines For Diploid Wheat, Natasha Pavlovikj May 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Gene expression and transcriptome analysis are currently among the main research focuses of a great number of scientists. However, the assembly of raw sequence data to obtain a draft transcriptome of an organism is a complex multi-stage process usually composed of pre-processing, assembling, and post-processing. Each of these stages includes multiple steps such as data cleaning, error correction, and assembly validation. Different combinations of steps, as well as different computational methods for the same step, generate transcriptome assemblies with different accuracy. Thus, using a combination that generates more accurate assemblies is crucial for any novel biological discoveries. Implementing …


An Investigation Of Epigenetic Mechanisms Driving The Biology Of Head And Neck Squamous Cell Carcinoma, Scot Carson Callahan May 2022

Dissertations & Theses (Open Access)

Head and neck squamous cell carcinoma (HNSCC) is the 6th most common cancer worldwide and is associated with significant morbidity and mortality. To date, the majority of work in the field has focused on genomic alterations such as mutations and copy number alterations. However, the clinical success of targeted therapies that exploit known genomic alterations, such as EGFR mutations, has remained mixed. Over the past decade, the importance of epigenetic regulators has come to the forefront, with the realization that many of these genes are mutated in cancer. Despite this realization, the role of epigenetics in regulating tumorigenesis, progression and …


Alterations Of The Gut Mycobiome In Patients With Ms - A Bioinformatic Approach, Saumya Shah May 2022

Honors Scholar Theses

The mycobiome is the fungal component of the gut microbiome and is implicated in several autoimmune diseases. However, its role in multiple sclerosis (MS) has not been studied. We performed descriptive and formal statistical tests using the R language to characterize the gut mycobiome in people with MS (pwMS) and healthy controls. We found that the microbiome composition of multiple sclerosis patients is different from healthy people. The mycobiome had significantly higher alpha diversity and inter-subject variation in pwMS than controls. Additionally, Saccharomyces and Aspergillus were over-represented in pwMS. Different mycobiome profiles, defined as mycotypes, were associated with different bacterial …


Unveiling Global Roles Of G-Quadruplexes And G4-22 In Human Genetics, Ruth Barros De Paula Aug 2021

Dissertations & Theses (Open Access)

G-quadruplexes are non-B DNA structures formed by four or more runs of repeated guanines that confer unique features to living organisms’ genomes. These sequences are enriched in regulatory regions, such as promoters and 5’ UTRs, and have distinct regulatory roles in both health and disease states. Even though previous studies showed the impact of G4 on gene expression, none of them summarized the location-specific effect of G4. Also, there is no broad understanding of the most common G4 repeat in the human genome, named here as G4-22, and how it links to the evolution of mammals and their biology. In …


Comparative Genomics Methods And Applications, Emily N. Alden Jul 2021

Biomedical Sciences ETDs

Virtually all fields of biology have benefited from the advancements in comparative genomics technologies, specifically in the study of evolution. In this dissertation I develop and use comparative genomic technologies to investigate the novel SARS-CoV-2 virus, assemble the first genome of the black lace domestic angelfish, and identify germline genetic variants associated with altered breast cancer-specific survival. Our genome tiling array for the novel coronavirus presents a rapid and cost-effective method to sequence the entire viral genome and can be used to track the rapid evolution of viral variants in the population. The domestic angelfish is a member of the …


Distribution And Diversity Of Heliothine And Other Lepidopteran Nudiviruses, Emrah Ozel Jan 2021

Theses and Dissertations--Entomology

Helicoverpa zea nudivirus 2 (HzNV-2) is the only known sterilizing and sexually transmitted insect virus and causes pathological symptoms in H. zea reproductive tissues. HzNV-2 has features that make it a candidate as an H. zea (corn earworm) control agent, such as the ability to cause asymptomatic (latent) and symptomatic (lytic) infections and the ability to influence mating behavior of its host to favor virus spread. HzNV pathology has been studied and its genome sequenced; however, its prevalence in natural populations is largely unknown. In this study, we developed and used a low-cost PCR-based molecular survey to investigate HzNV-2 prevalence and …


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and analysis of complex protein samples. In bottom-up shotgun proteomics experiments, the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …


Analysis Of Subtelomeric Rextal Assemblies Using Quast, Tunazzina Islam, Desh Ranjan, Mohammad Zubair, Eleanor Young, Ming Xiao, Harold Riethman Jan 2021

Computer Science Faculty Publications

Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variations, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads data sets partitioned and barcoded using the …


Composition And Homology In The Taxonomic Classification Of Escherichia Coli, Tanya Irani Jan 2021

Theses and Dissertations (Comprehensive)

As new techniques have been introduced, specifically the possibility of complete genome sequencing, better methods of defining bacterial species have also been proposed. One of the most recently proposed methods, using bioinformatic techniques, is to calculate the average nucleotide identity (ANI) between the homologous genome segments of different isolates. Another method for species discrimination that has been tested successfully is the similarity of DNA compositional signatures. However, in a recent update, DNA signatures split the available Escherichia coli complete genomes into three groups. To check if this result was consistent with such genomes belonging to different species, we tested methods …


Investigation Of Proliferation Suppressors In Genetic Fitness Screens, Walter Frank Lenoir Iv Dec 2020

Dissertations & Theses (Open Access)

Innovation of CRISPR gene-editing technology has provided scientists genome manipulation tools that allowed rapid advancement of scientific capabilities and thus improved our ability to systematically study mammalian genetic functional profiles. Genome-wide CRISPR knockout screens conducted in collections of human cell lines can knock out genes at multiple loci, and have provided new insights into functional roles for independent genes. This method has launched massive efforts in looking across genetic backgrounds for context specific genetic vulnerabilities within cancer. Much of the research effort thus far has been spent on optimizing phenotype distinctions between essential, genes required for cell fitness, and non-essential, …


Decoding The Evolutionary Response To Prostate Cancer Therapy Using Plasma Genome Sequencing, Naveen Ramesh Dec 2020

Dissertations & Theses (Open Access)

Investigating genome evolution in response to therapy is difficult in human tissue samples due to the difficulty in accessing metastatic tumor sites and logistical challenges of collecting longitudinal samples. To overcome these issues, we developed an unbiased whole-genome plasma DNA sequencing approach called PEGASUS that concurrently measures genomic copy number and exome mutations from archival cryostored plasma samples. This approach was applied to study longitudinal blood plasma samples from prostate cancer patients. A molecular characterization of archival plasma DNA from 233 patients and genomic profiling of 101 patients identified clinical correlations of aneuploid plasma DNA profiles with poor survival, increased …


Polerovirus Genomic Variation And Mechanisms Of Silencing Suppression By P0 Protein, Natalie Holste Nov 2020

School of Biological Sciences: Dissertations, Theses, and Student Research

The family Luteoviridae consists of three genera: Luteovirus, Enamovirus, and Polerovirus. The genus Polerovirus contains 32 virus species. All are transmitted by aphids and can infect a wide variety of crops from cereals and wheat to cucurbits and peppers. However, little is known about how this wide range of hosts and vectors developed. In poleroviruses, aphid transmission and virion formation are mediated by the coat protein read-through domain (CPRT), while silencing suppression and phloem limitation are mediated by Protein 0 (P0), a protein unique to poleroviruses. P0 gives poleroviruses a great advantage amongst plant viruses and diversifies polerovirus species, but the …


Deciphering The Ck2-Dependent Phosphoproteome And Its Integration With Regulatory Ptm Networks, Teresa Nunez De Villavicencio Diaz Nov 2020

Electronic Thesis and Dissertation Repository

Protein functions are regulated by the post-translational addition of covalent modifications on certain amino acids. Depending on their distance within the 3-dimensional structure, addition/removal of individual post-translational modifications (PTMs) can be impacted by others. This PTM interplay constitutes an essential regulatory mechanism that interconnects the molecular networks in the cell. Protein CK2, a clinically relevant acidophilic Ser/Thr kinase, may be responsible for 10-20% of the human phosphoproteome. Such estimates agree with the number of known substrates, which continues to expand. Furthermore, the demonstration that CK2 participates in hierarchical phosphorylation and has similar sequence determinants to caspases suggests extensive PTM …


Machine Learning With Digital Signal Processing For Rapid And Accurate Alignment-Free Genome Analysis: From Methodological Design To A Covid-19 Case Study, Gurjit Singh Randhawa Jun 2020

Electronic Thesis and Dissertation Repository

In the field of bioinformatics, taxonomic classification is the scientific practice of identifying, naming, and grouping organisms based on their similarities and differences. The problem of taxonomic classification is of immense importance considering that nearly 86% of existing species on Earth and 91% of marine species remain unclassified. Due to the magnitude of the datasets, the need exists for an approach and software tool that is scalable enough to handle large datasets and can be used for rapid sequence comparison and analysis. We propose ML-DSP, a stand-alone alignment-free software tool that uses Machine Learning and Digital Signal Processing to …


The Evolution Of Bivalve Shell Matrix Proteins, Mark Ira Duhon Ii May 2020

LSU Doctoral Dissertations

This dissertation focuses on the molecular underpinnings surrounding the evolution of the biomineralized shells of marine bivalves. Bivalve molluscs synthesize remarkably complex shells from calcium carbonate and an organic matrix of proteins secreted from the dorsal edge of the mantle. Molecular analyses of shell matrix proteins (SMPs) have suggested high rates of gene turnover despite the conserved nature of the shell itself. Here, I used proteomic and transcriptomic data to identify the SMPs and other biomineralization proteins from seven bivalve species that diverged 3-513 Mya. Contrary to previous studies that identified only a few shared biomineralization transcripts across the Bivalvia, …


Simplicity Diffexpress: A Bespoke Cloud-Based Interface For Rna-Seq Differential Expression Modeling And Analysis, Cintia C. Palu, Marcelo Ribeiro-Alves, Yanxin Wu, Brendan Lawlor, Pavel V. Baranov, Brian Kelly, Paul Walsh May 2019

Department of Computer Science Publications

One of the key challenges for transcriptomics-based research is not only the processing of large data but also modeling the complexity of features that are sources of variation across samples, which is required for an accurate statistical analysis. Therefore, our goal is to foster access for wet lab researchers to bioinformatics tools, in order to enhance their ability to explore biological aspects and validate hypotheses with robust analysis. In this context, user-friendly interfaces can enable researchers to apply computational biology methods without requiring bioinformatics expertise. Such bespoke platforms can improve the quality of the findings by allowing the researcher to …


Computational Genomic Models For Spatio-Temporal Investigation Of Early Lung Cancer Pathology, Smruthy Sivakumar May 2019

Dissertations & Theses (Open Access)

Lung cancer, of which non-small cell lung cancer (NSCLC) is the most common form, is the second most prevalent cancer and the leading cause of cancer-related deaths. NSCLCs primarily comprise adenocarcinomas (LUAD) and squamous cell carcinomas (LUSC). Advances in early detection and prevention have been limited by the lack of early-stage biomarkers and targets. A comprehensive molecular characterization of premalignant lesions and tumor-adjacent normal tissue can aid in better understanding NSCLC pathogenesis. However, these investigations are further challenged by limited tissue availability and low cellular fractions of detectable somatic mutations.

Therefore, there is a dearth of knowledge about the pathogenesis …


Mrub_3019 Casa Gene Is An Ortholog To E. Coli B2760, Kelsey Heiland, Dr. Lori Scott Feb 2019

Meiothermus ruber Genome Analysis Project

This research is part of the Meiothermus ruber genome annotation project, which aims to predict gene function with various bioinformatics tools. We investigated the function of Mrub_3019, which encodes the CasA protein involved in the multi-subunit effector complex for the CRISPR-Cas immunity system, and predicted it to be an ortholog of E. coli K12 MG1655 b2760 (casA). We predicted that Mrub_3019 encodes the protein CasA, which is involved in PAM recognition in the CRISPR interference pathway. Foreign DNA will bind to CasA, which signals Cas3 for helicase-mediated DNA degradation. Our hypothesis is supported by low E-values for pairwise alignment in NCBI …


Mrub_3015 Is Orthologous To The B2757 Gene Found In Escherichia Coli Coding For Casd, Ramona Collins, Dr. Lori Scott Feb 2019

Meiothermus ruber Genome Analysis Project

This project is part of the Meiothermus ruber genome analysis project, which uses a collection of online bioinformatics tools to predict gene function. We investigated the biological function of the gene Mrub_3015, which we hypothesize is a component of the CRISPR-Cas prokaryotic defense system. We predict that Mrub_3015 (DNA coordinates 3055550...3056245) encodes the CRISPR-associated protein cas5, which is integral in maintaining the crRNA-DNA structure, keeping the complex from base pairing with the target phage DNA. Our hypothesis is supported by identical hits for Mrub_3015 and b2527 to the KEGG, Pfam, TIGRfam, CDD and PDB databases as well as a …