Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

Articles 1 - 14 of 14

Full-Text Articles in Bioinformatics

Structural Analysis Of Predicted Proteins Using Alphafold, Brydon P. Wall Jan 2023

Undergraduate Research Posters

The function of around 67% of predicted proteins from genes in Mycobacteriophage CheetoDust cannot be confidently predicted using traditional techniques and can only be functionally labeled “hypothetical proteins”. However, a new approach using AlphaFold, an artificial intelligence tool to generate a structural prediction from a sequence, can take advantage of structurally conserved regions that were previously obfuscated to gain new insights and visualize data in new ways.

Since amino acid sequences are more conserved than their corresponding DNA sequences, amino acid sequences are used when predicting the function of the corresponding translated protein. Until recently, predicting structure from an …


Modeling Electrostatics In Molecular Biology And Its Relevance With Molecular Mechanisms Of Diseases, Mahesh Koirala Aug 2022

All Dissertations

Electrostatics plays an essential role in molecular biology. Modeling electrostatics in molecular biology is complicated due to the water phase, mobile ions, and irregularly shaped inhomogeneous biological macromolecules. This dissertation presents the popular DelPhi package that solves the Poisson-Boltzmann equation (PBE) and delivers the electrostatic potential distribution of biomolecules. We used the newly developed DelPhiForce steered Molecular Dynamics (DFMD) approach to model the binding of barstar to barnase and demonstrated that the first-principles method could also model the binding. This dissertation also reflects the use of existing computational approaches to model the effects of Single Amino Acid Variations (SAVs) to reveal molecular mechanisms …


In Silico Characterization Of Protein-Protein Interactions Mediated By Short Linear Motifs, Heidy Elkhaligy Jun 2022

FIU Electronic Theses and Dissertations

Short linear motifs (SLiMs), often found in intrinsically disordered regions (IDPs), can initiate protein-protein interactions in eukaryotes. Although pathogens tend to have less disorder than eukaryotes, their proteins alter host cellular function through molecular mimicry of SLiMs. The first objective was to study sequence-based structure properties of viral SLiMs in the ELM database and the conservation of selected viral motifs involved in the virus life cycle. The second objective was to compare the structural features for SLiMs in pathogens and eukaryotes in the ELM database. Our analysis showed that many viral SLiMs are not found in IDPs, particularly glycosylation motifs. …


Unveiling Global Roles Of G-Quadruplexes And G4-22 In Human Genetics, Ruth Barros De Paula Aug 2021

Dissertations & Theses (Open Access)

G-quadruplexes are non-B DNA structures formed by four or more runs of repeated guanines that confer unique features to living organisms’ genomes. These sequences are enriched in regulatory regions, such as promoters and 5’ UTRs, and have distinct regulatory roles in both health and disease states. Even though previous studies showed the impact of G4 on gene expression, none of them summarized the location-specific effect of G4. Also, there is no broad understanding of the most common G4 repeat in the human genome, named here as G4-22, and how it links to the evolution of mammals and their biology. In …
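The description of G-quadruplexes as "four or more runs of repeated guanines" can be illustrated with a small sequence scan. This is a minimal sketch, not the method used in the dissertation: the pattern (runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used heuristic assumed here for illustration, and the names `G4_PATTERN` and `find_g4_candidates` are hypothetical.

```python
import re

# Assumed heuristic motif (not taken from the abstract): four runs of >= 3
# guanines, separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_g4_candidates(sequence):
    """Return (start, end, motif) for each non-overlapping candidate on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence, made up for the example.
    for start, end, motif in find_g4_candidates("ttGGGAGGGTGGGAGGGtt"):
        print(start, end, motif)
```

The scan covers only the strand it is given; a G4 on the opposite strand appears as runs of cytosines, so a fuller search would also scan the reverse complement.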


Computational Analysis And Prediction Of Intrinsic Disorder And Intrinsic Disorder Functions In Proteins, Akila I. Katuwawala Jan 2021

Theses and Dissertations

By Akila Imesha Katuwawala

A dissertation submitted in partial fulfillment of the requirements for the degree of Engineering, Doctor of Philosophy with a concentration in Computer Science at Virginia Commonwealth University.

Virginia Commonwealth University, 2021

Director: Lukasz Kurgan, Professor, Department of Computer Science

Proteins, as a fundamental class of biomolecules, have been studied from various perspectives over the past two centuries. The traditional notion is that proteins require fixed and stable three-dimensional structures to carry out biological functions. However, there is mounting evidence regarding a “special” class …


Development Of Computational Tools To Target Microrna, Luo Song Dec 2020

Dissertations & Theses (Open Access)

MicroRNAs (a.k.a. miRNAs) play an important role in disease development. However, few of their structures have been determined and structure-based computational methods remain challenging in accurately predicting their interactions with small molecules. To address this issue, the aim of my thesis is to develop integrated approaches to screening for novel inhibitors by targeting specific structure motifs in miRNAs. The project starts with implementing a tool to find potential miRNA targets with desired motifs. I combined both sequence information of miRNAs and known RNA structure data from the Protein Data Bank (PDB) to predict the miRNA structure and identify the motif to target, then I …


New Methods For Deep Learning Based Real-Valued Inter-Residue Distance Prediction, Jacob Barger Nov 2020

Theses

Background: Much of the recent success in protein structure prediction has been a result of accurate protein contact prediction--a binary classification problem. Dozens of methods, built from various types of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As an alternative approach, we recently proposed real-valued distance predictions, formulating the problem as a regression problem. The nuances of protein 3D structures make this formulation appropriate, allowing predictions …
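The three problem formulations named in this abstract (binary contact classification, multi-class distance classification, and real-valued regression) can be contrasted on a single inter-residue distance. This is a minimal sketch under stated assumptions: the 8 Å contact threshold and the bin edges are illustrative values chosen here, not taken from the thesis, and all function names are hypothetical.

```python
from bisect import bisect_right

CONTACT_THRESHOLD = 8.0              # angstroms; assumed cutoff for a "contact"
BIN_EDGES = [4.0, 8.0, 12.0, 16.0]   # hypothetical distance-class boundaries


def as_contact(distance, threshold=CONTACT_THRESHOLD):
    """Binary formulation: 1 if the residue pair is closer than the threshold."""
    return 1 if distance < threshold else 0


def as_distance_class(distance, edges=BIN_EDGES):
    """Multi-class formulation: index of the distance bin (0 .. len(edges))."""
    return bisect_right(edges, distance)


def mean_absolute_error(predicted, observed):
    """Regression formulation: score real-valued predictions directly."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


if __name__ == "__main__":
    d = 5.2  # toy distance in angstroms
    print(as_contact(d), as_distance_class(d))
    print(mean_absolute_error([5.0, 9.0], [6.0, 7.0]))
```

The binary and binned labels discard information that the regression target keeps: two pairs at 8.5 Å and 15.9 Å are both "non-contacts", while a real-valued prediction can distinguish them.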


Copy Number Variation In The Porcine Genome Detected From Whole-Genome Sequence, Rebecca Anderson Mar 2018

Honors Theses

Copy number variations (CNVs) are large insertions, deletions, and duplications in the genome that vary between individuals in a species. These variations are known to impact a broad range of phenotypes from molecular-level traits to higher-order clinical phenotypes. CNVs have been linked to complex traits in humans such as autism, attention deficit hyperactivity disorder, nervous system disorders, and early-onset extreme obesity. In this study, whole-genome sequence was obtained from 72 founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC) in Clay Center, Nebraska. This included 24 boars (12 Duroc and 12 Landrace) and …


Mrub_1325, Mrub_1326, Mrub_1327, And Mrub_1328 Are Orthologs Of B_3454, B_3455, B_3457, B_3458, Respectively Found In Escherichia Coli Coding For A Branched Chain Amino Acid Atp Binding Cassette (Abc) Transporter System, Bennett Tomlin, Adam Buric, Dr. Lori Scott Jan 2018

Meiothermus ruber Genome Analysis Project

In this project we investigated the biological function of the genes Mrub_1325, Mrub_1326, Mrub_1327, and Mrub_1328 (KEGG map number 02010). We predict these genes encode components of a Branched Chain Amino Acid ATP Binding Cassette (ABC) transporter: 1) Mrub_1325 (DNA coordinates 1357399-1358130 on the reverse strand) encodes the ATP-binding domain; 2) Mrub_1326 (DNA coordinates 1358127-1359899 on the reverse strand) encodes the ATP-binding domain and permease domain; 3) Mrub_1327 (DNA coordinates 1359899-1360930 on the reverse strand) encodes a permease domain; and 4) Mrub_1328 (DNA coordinates 1711022-1712185 on the reverse strand) encodes the substrate binding domain. This system is not predicted to …
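The DNA coordinates quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the coordinates are 1-based and inclusive (an assumption, although each resulting length is a multiple of three, which is consistent with it); the helper names are hypothetical.

```python
# Coordinates as quoted in the abstract (all on the reverse strand).
GENES = {
    "Mrub_1325": (1357399, 1358130),
    "Mrub_1326": (1358127, 1359899),
    "Mrub_1327": (1359899, 1360930),
    "Mrub_1328": (1711022, 1712185),
}


def gene_length(start, end):
    """Length in base pairs, assuming 1-based inclusive coordinates."""
    return end - start + 1


def overlap(a, b):
    """Number of base pairs shared by two (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    for name, (start, end) in GENES.items():
        bp = gene_length(start, end)
        print(name, bp, "bp,", bp // 3, "codons including the stop codon")
    print(overlap(GENES["Mrub_1325"], GENES["Mrub_1326"]), "bp shared by Mrub_1325 and Mrub_1326")
```

Under this assumption, Mrub_1325 through Mrub_1327 are adjacent and overlap by a few base pairs, while Mrub_1328 lies roughly 350 kb away from them.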


Confirmation That Mrub_1751 Is Homologous To E. Coli Xylf, Mrub_1752 Is Homologous To E. Coli Xylh, And Mrub_1753 Is Homologous To E. Coli Xylg, Ben Price, Dr. Lori Scott Jan 2018

Meiothermus ruber Genome Analysis Project

In this project we investigated the biological function of the genes Mrub_1751, Mrub_1752 and Mrub_1753 (KEGG map number 02010). We predict these genes encode components of a D-xylose ATP Binding Cassette (ABC) transporter: 1) Mrub_1752 (DNA coordinates 1809004-1810224 on the forward strand) encodes the permease component (aka transmembrane domain), predicted to be an ortholog; 2) Mrub_1753 (DNA coordinates 1810227-1811000 on the forward strand) encodes the ATP-binding domain (aka nucleotide binding domain); and 3) Mrub_1751 (DNA coordinates 1807855-1808892 on the forward strand) encodes the solute binding protein. The ABC-transporter for M. ruber to transport D-xylose is homologous with the transporter …


A Pipeline For Creation Of Genome-Scale Metabolic Reconstructions, Shaun W. Norris Jan 2016

Theses and Dissertations

The decreasing costs of next generation sequencing technologies and the increasing speeds at which they work have led to an abundance of 'omic datasets. The need for tools and methods to analyze, annotate, and model these datasets to better understand biological systems is growing. Here we present a novel software pipeline to reconstruct the metabolic model of an organism in silico starting from its genome sequence and a novel compilation of biological databases to better serve the generation of metabolic models. We validate these methods using five Gardnerella vaginalis strains and compare the gene annotation results to NCBI and the …


Comparative Genomics Of Microbial Chemoreceptor Sequence, Structure, And Function, Aaron Daniel Fleetwood Dec 2014

Doctoral Dissertations

Microbial chemotaxis receptors (chemoreceptors) are complex proteins that sense the external environment and signal for flagella-mediated motility, serving as the GPS of the cell. In order to sense a myriad of physicochemical signals and adapt to diverse environmental niches, sensory regions of chemoreceptors are frenetically duplicated, mutated, or lost. Conversely, the chemoreceptor signaling region is a highly conserved protein domain. Extreme conservation of this domain is necessary because it determines very specific helical secondary, tertiary, and quaternary structures of the protein while simultaneously choreographing a network of interactions with the adaptor protein CheW and the histidine kinase CheA. This dichotomous …


Utilizing Nmr Spectroscopy And Molecular Docking As Tools For The Structural Determination And Functional Annotation Of Proteins, Jaime Stark Feb 2013

Department of Chemistry: Dissertations, Theses, and Student Research

With the completion of the Human Genome Project in 2001 and the subsequent explosion of organisms with sequenced genomes, we are now aware of nearly 28 million proteins. Determining the role of each of these proteins is essential to our understanding of biology and the development of medical advances. Unfortunately, the experimental approaches to determine protein function are too slow to investigate every protein. Bioinformatics approaches, such as sequence and structure homology, have helped to annotate the functions of many similar proteins. However, despite these computational approaches, approximately 40% of proteins still have no known function. Alleviating this deficit will …


Hivtoolbox, An Integrated Web Application For Investigating Hiv, David P. Sargeant, Sandeep Deverasetty, Yang Luo, Angel Villahoz Baleta, Stephanie Zobrist, Viraj Rathnayake, Jacqueline C. Russo, Jay Vyas, Mark A. Muesing, Martin Schiller May 2011

Life Sciences Faculty Research

Many bioinformatic databases and applications focus on a limited domain of knowledge federating links to information in other databases. This segregated data structure likely limits our ability to investigate and understand complex biological systems. To facilitate research, therefore, we have built HIVToolbox, which integrates much of the knowledge about HIV proteins and allows virologists and structural biologists to access sequence, structure, and functional relationships in an intuitive web application. HIV-1 integrase protein was used as a case study to show the utility of this application. We show how data integration facilitates identification of new questions and hypotheses much more rapid …