Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Articles 1 - 16 of 16

Full-Text Articles in Life Sciences

An Approach To Developing Benchmark Datasets For Protein Secondary Structure Segmentation From Cryo-EM Density Maps, Thu Nguyen, Yongcheng Mu, Jiangwen Sun, Jing He Jan 2023

Computer Science Faculty Publications

More and more deep learning approaches have been proposed to segment secondary structures from cryo-electron density maps in the medium resolution range (5–10 Å). Although the deep learning approaches show great potential, only a few small experimental data sets have been used to test them. There is limited understanding of the factors in the data that affect segmentation performance. We propose an approach to generate data sets with desired specifications for three potential factors: protein sequence identity, structural content, and data quality. The approach was implemented and has generated a test set and various training sets to study …


Three-Dimensional Graph Matching To Identify Secondary Structure Correspondence Of Medium-Resolution Cryo-EM Density Maps, Bahareh Behkamal, Mahmoud Naghibzadeh, Mohammad Reza Saberi, Zeinab Amiri Tehranizadeh, Andrea Pagnani, Kamal Al Nasr Nov 2021

Computer Science Faculty Research

Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. In some cases, however, cryo-EM reconstructions are limited to medium resolution (~4–10 Å). At this resolution range, a cryo-EM density map can hardly be used to determine protein structure directly at atomic resolution, or even to trace the amino acid backbone. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such …


Computational Analysis Of Large-Scale Trends And Dynamics In Eukaryotic Protein Family Evolution, Joseph Boehm Ahrens Mar 2019

FIU Electronic Theses and Dissertations

The myriad protein-coding genes found in present-day eukaryotes arose from a combination of speciation and gene duplication events, spanning more than one billion years of evolution. Notably, as these proteins evolved, the individual residues at each site in their amino acid sequences were replaced at markedly different rates. The relationship between protein structure, protein function, and site-specific rates of amino acid replacement is a topic of ongoing research. Additionally, there is much interest in the different evolutionary constraints imposed on sequences related by speciation (orthologs) versus sequences related by gene duplication (paralogs). A principal aim of this dissertation is to …


Constrained Sequence Alignment, Kyle Daling Dec 2017

WWU Honors College Senior Projects

Constrained Sequence Alignment: A new algorithm designed to help biologists produce better alignments of protein sequences.


Registration And Grouping Algorithms In Protein NMR Derived Peak Lists And Their Application In Protein NMR Reference Correction, Andrey Smelter, Xi Chen, Eric C. Rouchka, Hunter N. B. Moseley Oct 2017

Commonwealth Computational Summit

Nuclear magnetic resonance spectroscopy of proteins (protein NMR) is a powerful analytical technique for studying structure and dynamics of proteins. Almost all aspects of protein NMR have been accelerated by the development of software tools that enable the analysis of NMR spectral data and its utilization in studying protein structure and dynamics. This includes software for raw NMR processing, spectral visualization, protein resonance assignment, and structure determination. However, full automation of protein NMR data analysis is still a work in progress and data analysis still requires an expert NMR spectroscopist utilizing an array of software tools.

While manual resonance assignment …


Testing The Independence Hypothesis Of Accepted Mutations For Pairs Of Adjacent Amino Acids In Protein Sequences, Jyotsna Ramanan, Peter Revesz Jul 2017

School of Computing: Faculty Publications

Evolutionary studies usually assume that genetic mutations are independent of each other. However, that does not imply that the observed mutations are independent, because when a nucleotide mutates it may be biologically beneficial for an adjacent nucleotide to mutate too. With a number of decoded genes currently available in various genome libraries and online databases, it is now possible to conduct a large-scale computer-based study to test whether the independence assumption holds for pairs of adjacent amino acids. Hence the independence question also arises for pairs of adjacent amino acids within …


Network Exploration Of Correlated Multivariate Protein Data For Alzheimer's Disease Association, Matthew J. Lane Apr 2017

Theses

Alzheimer's disease (AD) is difficult to diagnose using genetic testing or other traditional methods. Unlike diseases with simple genetic risk components, no single marker determines whether someone will develop AD. Furthermore, AD is highly heterogeneous, and different subgroups of individuals develop the disease due to differing factors. Traditional diagnostic methods based on perceivable cognitive deficiencies are often too little, too late, because the brain has already suffered damage from decades of disease progression. In order to observe AD at early stages, prior to the observation of cognitive deficiencies, biomarkers with greater accuracy are required. By using …


An Effective Computational Method Incorporating Multiple Secondary Structure Predictions In Topology Determination For Cryo-EM Images, Abhishek Biswas, Desh Ranjan, Mohammad Zubair, Stephanie Zeil, Kamal Al Nasr, Jing He Jan 2017

Computer Science Faculty Publications

A key idea in de novo modeling of a medium-resolution density image obtained from cryo-electron microscopy is to compute the optimal mapping between the secondary structure traces observed in the density image and those predicted from the protein sequence. When secondary structures are not determined precisely, either from the image or from the amino acid sequence of the protein, the computational problem becomes more complex. We present an efficient method that addresses the secondary structure placement problem in the presence of multiple secondary structure predictions and computes the optimal mapping. We tested the method using 12 simulated images from alpha-proteins and …


RCD+: Fast Loop Modeling Server, José R. López-Blanco, Alejandro J. Canosa-Valis, Yaohang Li, Pablo Chacón Jan 2016

Computer Science Faculty Publications

Modeling loops is a critical and challenging step in protein modeling and prediction. We have developed a quick online service (http://rcd.chaconlab.org) for ab initio loop modeling combining a coarse-grained conformational search with a full-atom refinement. Our original Random Coordinate Descent (RCD) loop closure algorithm has been greatly improved to enrich the sampling distribution towards near-native conformations. These improvements include a new workflow optimization, MPI-parallelization and fast backbone angle sampling based on neighbor-dependent Ramachandran probability distributions. The server starts by efficiently searching the vast conformational space from only the loop sequence information and the environment atomic coordinates. The generated closed loop …


Mutations Of Adjacent Amino Acid Pairs Are Not Always Independent, Jyotsna Ramanan, Peter Revesz Oct 2015

CSE Conference and Workshop Papers

Evolutionary studies usually assume that the genetic mutations are independent of each other. This paper tests the independence hypothesis for genetic mutations with regard to protein coding regions. According to the new experimental results the independence assumption generally holds, but there are certain exceptions. In particular, the coding regions that represent two adjacent amino acids seem to change in ways that sometimes deviate significantly from the expected theoretical probability under the independence assumption.
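The comparison at the heart of this abstract, the observed rate of joint change against the rate expected if the two positions changed independently, can be sketched as follows. This is a minimal illustration and not the authors' procedure; the function name `independence_check` and the encoding of each aligned adjacent pair as two booleans are assumptions made for the example.

```python
def independence_check(events):
    """Compare observed joint change with the product expected under independence.

    `events` is a list of (first_changed, second_changed) booleans, one per
    aligned pair of adjacent amino acids (a hypothetical encoding).
    Returns (observed_joint, expected_joint).
    """
    events = list(events)
    n = len(events)
    if n == 0:
        raise ValueError("at least one aligned pair is required")

    # Marginal probabilities that each position changed.
    p_first = sum(1 for first, _ in events if first) / n
    p_second = sum(1 for _, second in events if second) / n

    # Observed probability that both positions changed together.
    p_both = sum(1 for first, second in events if first and second) / n

    # Under independence the joint probability is the product of the marginals.
    return p_both, p_first * p_second


# Made-up counts, for illustration only (not data from the paper).
sample = ([(True, True)] * 3 + [(True, False)] + [(False, True)]
          + [(False, False)] * 5)
observed, expected = independence_check(sample)
```

A large gap between `observed` and `expected` would suggest a deviation from independence; deciding whether the gap is statistically significant needs a proper test, which this sketch leaves out.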


An Incremental Phylogenetic Tree Algorithm Based On Repeated Insertions Of Species, Peter Revesz, Zhiqiang Li Oct 2015

CSE Conference and Workshop Papers

In this paper, we introduce a new phylogenetic tree algorithm that generates phylogenetic trees by repeatedly inserting species one-by-one. The incremental phylogenetic tree algorithm can work on proteins or DNA sequences. Computer experiments show that the new algorithm is better than the commonly used UPGMA and Neighbor Joining algorithms.


Minimotif Miner 3.0: Database Expansion And Significantly Improved Reduction Of False-Positive Predictions From Consensus Sequences., Tian Mi, Jerlin Camilus Merlin, Sandeep Deverasetty, Michael R. Gryk, Travis J. Bill, Andrew W. Brooks, Logan Lee, Viraj Rathnayake, Christian A. Ross, David P. Sargeant, Christy L. Strong, Paula Watts, Sanguthevar Rajasekaran, Martin Schiller Jan 2012

Life Sciences Faculty Research

Minimotif Miner (MnM, available at http://minimotifminer.org or http://mnm.engr.uconn.edu) is an online database for identifying new minimotifs in protein queries. Minimotifs are short contiguous peptide sequences that have a known function in at least one protein. Here we report the third release of the MnM database, which has now grown 60-fold to approximately 300,000 minimotifs. Since short minimotifs are by their nature not very complex, we also summarize a new set of false-positive filters and linear regression scoring that vastly enhance minimotif prediction accuracy on a test data set. This online database can be used to predict new functions in proteins …


Partitioning Of Minimotifs Based On Function With Improved Prediction Accuracy, Sanguthevar Rajasekaran, Tian Mi, Jerlin Camilus Merlin, Aaron Oommen, Patrick R. Gradie, Martin R. Schiller Apr 2010

Life Sciences Faculty Research

Background

Minimotifs are short contiguous peptide sequences in proteins that are known to have a function in at least one other protein. One of the principal limitations in minimotif prediction is that false positives limit the usefulness of this approach. As a step toward resolving this problem, we have built, implemented, and tested a new data-driven algorithm that reduces false-positive predictions.

Methodology/Principal Findings

Certain domains and minimotifs are known to be strongly associated with a known cellular process or molecular function. Therefore, we hypothesized that by restricting minimotif predictions to those where the minimotif-containing protein and target protein have …


A Proposed Syntax For Minimotif Semantics, Version 1., Jay Vyas, Ronald J. Nowling, Mark W. Maciejewski, Sanguthevar Rajasekaran, Michael R. Gryk, Martin R. Schiller Aug 2009

Life Sciences Faculty Research

BACKGROUND:

One of the most important developments in bioinformatics over the past few decades has been the observation that short linear peptide sequences (minimotifs) mediate many classes of cellular functions such as protein-protein interactions, molecular trafficking and post-translational modifications. As both the creators and curators of a database which catalogues minimotifs, Minimotif Miner, the authors have a unique perspective on the commonalities of the many functional roles of minimotifs. There is an obvious usefulness in standardizing functional annotations both in allowing for the facile exchange of data between various bioinformatics resources, as well as the internal clustering of sets of …


Minimotif Miner 2nd Release: A Database And Web System For Motif Search, Sanguthevar Rajasekaran, Sudha Balla, Patrick R. Gradie, Michael R. Gryk, Krishna Kadaveru, Vamsi Kundeti, Mark W. Maciejewski, Tian Mi, Nicholas Rubino, Jay Vyas, Martin R. Schiller Jan 2009

Life Sciences Faculty Research

Minimotif Miner (MnM) consists of a minimotif database and a web-based application that enables prediction of motif-based functions in user-supplied protein queries. We have revised MnM by expanding the database more than 10-fold to approximately 5000 motifs and standardizing the motif function definitions. The web-application user interface has been redeveloped with new features including improved navigation, screencast-driven help, support for alias names, and expanded SNP analysis. A sample analysis of prion shows how MnM 2 can be used.
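The basic idea of scanning a user-supplied protein query for short consensus motifs can be sketched with regular expressions. This is only an illustration of motif search in general and does not reproduce how MnM works; the function name `find_motif_hits`, the pattern dictionary, and the query string are all assumptions, and the single pattern shown is an illustrative consensus rather than an entry taken from the MnM database.

```python
import re


def find_motif_hits(protein_sequence, motif_patterns):
    """Return (motif_name, start_index, matched_text) for every occurrence.

    `motif_patterns` maps a motif name to a regular expression written over
    one-letter amino acid codes (hypothetical input format).
    """
    hits = []
    for name, pattern in motif_patterns.items():
        # A lookahead with a capture group reports overlapping occurrences,
        # which a plain search would skip.
        for match in re.finditer(f"(?=({pattern}))", protein_sequence):
            hits.append((name, match.start(1), match.group(1)))
    # Sort by position, then by motif name, for stable output.
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Illustrative consensus: proline, any two residues, proline.
patterns = {"PxxP": "P..P"}
hits = find_motif_hits("APPLPPKP", patterns)
```

Short patterns like this one match frequently by chance, which is the false-positive problem the later MnM releases describe addressing with additional filters.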


Determining Domain Similarity And Domain-Protein Similarity Using Functional Similarity Measurements Of Gene Ontology Terms, Lisa Michelle Guntly, Jennifer Leopold, Anne M. Maglia Oct 2007

Computer Science Faculty Research & Creative Works

Protein domains typically correspond to major functional sites of a protein. Therefore, determining similarity between domains can aid in the comparison of protein functions, and can provide a basis for grouping domains based on function. One strategy for comparing domain similarity and domain-protein similarity is to use similarity measurements of annotation terms from the Gene Ontology (GO). In this paper five methods are analyzed in terms of their usefulness for comparing domains, and comparing domains to proteins based on GO terms.
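As a rough illustration of comparing a domain to a protein through their GO annotations, the sketch below computes a Jaccard index over two sets of GO term identifiers. The abstract does not name the five methods it analyzes, so this measure is a stand-in chosen for simplicity and should not be read as one of them; the function name and the annotation sets are assumptions made for the example.

```python
def jaccard_similarity(terms_a, terms_b):
    """Jaccard index of two collections of GO term identifiers.

    Returns 0.0 when both collections are empty, a convention chosen here
    to avoid dividing by zero.
    """
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    # Shared terms divided by all distinct terms across both annotations.
    return len(a & b) / len(a | b)


# Hypothetical annotation sets, for illustration only.
domain_terms = {"GO:0005524", "GO:0004672"}
protein_terms = {"GO:0005524", "GO:0004672", "GO:0006468"}
score = jaccard_similarity(domain_terms, protein_terms)
```

A set-overlap measure ignores the structure of the ontology, so two closely related but non-identical terms count as a complete mismatch; semantic similarity measures that use the GO graph are designed to address that.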