Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Bioinformatics

University of Nebraska - Lincoln

Articles 1 - 30 of 34

Full-Text Articles in Physical Sciences and Mathematics

Motif-Cluster: A Spatial Clustering Package For Repetitive Motif Binding Patterns, Mengyuan Zhou Nov 2023

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Previous efforts in using genome-wide analysis of transcription factor binding sites (TFBSs) have overlooked the importance of ranking potential significant regulatory regions, especially those with repetitive binding within a local region. Identifying these homogenous binding sites is critical because they have the potential to amplify the binding affinity and regulation activity of transcription factors, impacting gene expression and cellular functions. To address this issue, we developed an open-source tool Motif-Cluster that prioritizes and visualizes transcription factor regulatory regions by incorporating the idea of local motif clusters. Motif-Cluster can rank the significant transcription factor regulatory regions without the need for experimental …


Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone Nov 2023

Complex Biosystems PhD Program: Dissertations

The task of gene prediction has been largely stagnant in algorithmic improvements compared to when algorithms were first developed for predicting genes thirty years ago. Rather than iteratively improving the underlying algorithms in gene prediction tools by utilizing better performing models, most current approaches update existing tools through incorporating increasing amounts of extrinsic data to improve gene prediction performance. The traditional method of predicting genes is done using Hidden Markov Models (HMMs). These HMMs are constrained by having strict assumptions made about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) …


Subjective Information And Survival In A Simulated Biological System, Tyler S. Barker, Massimiliano Pierobon, Peter J. Thomas Apr 2022

School of Computing: Faculty Publications

Information transmission and storage have gained traction as unifying concepts to characterize biological systems and their chances of survival and evolution at multiple scales. Despite the potential for an information-based mathematical framework to offer new insights into life processes and ways to interact with and control them, the main legacy is that of Shannon, where a purely syntactic characterization of information scores systems on the basis of their maximum information efficiency. These metrics seem not entirely suitable for biological systems, where transmission and storage of different pieces of information (carrying different semantics) can result in different chances of survival. …


Introduction To The R-Package: usdampr, Elliott James Dennis, Bowen Chen Jun 2020

Extension Farm and Ranch Management News

Why the Need for the Package? In the 1990s, concern over growing packer concentration and a hog industry market shock resulted in discontent among producers and packers. As a result, the United States Congress passed the Livestock Mandatory Reporting Act of 1999 (1999 Act) [Pub. L. 106-78, Title IX], which must be reauthorized every five years.

Market reports were publicly issued in the form of .txt files with varying frequency from April 2000 to April 2020. Current and historical data were also housed in a USDA-AMS …


Repositories For Taxonomic Data: Where We Are And What Is Missing, Aurélian Miralles, Teddy Bruy, Katherine Wolcott, Mark D. Scherz, Dominik Begerow, Bank Beszteri, Michael Bonkowski, Janine Felden, Birgit Gemeinholzer, Frank Glaw, Frank Oliver Glöckner, Oliver Hawlitschek, Ivaylo Kostadinov, Tim W. Nattkemper, Christian Printzen, Jasmin Renz, Nataliya Rybalka, Marc Stadler, Tanja Weibulat, Thomas Wilke, Susanne S. Renner, Miguel Vences Jan 2020

Harold W. Manter Laboratory: Library Materials

Natural history collections are leading successful large-scale projects of specimen digitization (images, metadata, DNA barcodes), thereby transforming taxonomy into a big data science. Yet, little effort has been directed towards safeguarding and subsequently mobilizing the considerable amount of original data generated during the process of naming 15,000–20,000 species every year. From the perspective of alpha-taxonomists, we provide a review of the properties and diversity of taxonomic data, assess their volume and use, and establish criteria for optimizing data repositories. We surveyed 4,113 alpha-taxonomic studies in representative journals for 2002, 2010, and 2018, and found an increasing yet comparatively limited use …


Prospects And Challenges Of Population Health With Online And Other Big Data In Africa; Understanding The Link To Improving Healthcare Service Delivery, Rowland Edet, Bolarinwa Afolabi Jan 2020

Department of Sociology: Faculty Publications

Big data analytics promises solutions to many health care service challenges and can provide answers to many population health issues. Big data is having a positive impact in almost every sphere of life in the more advanced world, while developing countries are striving to catch up. Even though healthcare systems in the developed world are recording some breakthroughs due to the application of big data, it is important to research the impact of big data in developing regions of the world, such as Africa, and identify their particular needs. The purpose of this review was to summarize the challenges faced by …


Connectivity Differences Between Gulf War Illness (GWI) Phenotypes During A Test Of Attention, Tomas Clarke, Jessie Jamieson, Patrick Malone, Rakib U. Rayhan, Stuart Washington, John W. Vanmeter, James N. Baraniuk Dec 2019

Department of Mathematics: Faculty Publications

One quarter of veterans returning from the 1990–1991 Persian Gulf War have developed Gulf War Illness (GWI) with chronic pain, fatigue, cognitive and gastrointestinal dysfunction. Exertion leads to characteristic, delayed onset exacerbations that are not relieved by sleep. We have modeled exertional exhaustion by comparing magnetic resonance images from before and after submaximal exercise. One third of the 27 GWI participants had brain stem atrophy and developed postural tachycardia after exercise (START: Stress Test Activated Reversible Tachycardia). The remainder activated basal ganglia and anterior insulae during a cognitive task (STOPP: Stress Test Originated Phantom Perception). Here, the role of attention …


Linking Taxonomic Diversity And Trophic Function: A Graph-Based Theoretical Approach, Marcella M. Jurotich, Kaitlyn Dougherty, Barbara Hayford, Sally Clark Nov 2017

Transactions of the Nebraska Academy of Sciences and Affiliated Societies

The purpose of this study is to develop a novel, visual method for analyzing complex functional trait data in freshwater ecology. We focus on macroinvertebrates in stream ecosystems under a gradient of habitat degradation and employ a combination of taxonomic and functional trait diversity analyses. Then we use graph theory to link changes in functional trait diversity to taxonomic richness and habitat degradation. We test the hypotheses that: 1) taxonomic diversity and trophic functional trait diversity both decrease with increased habitat degradation; 2) loss of taxa leads to a decrease in trophic function as visualized using a bipartite graph; and …


Testing The Independence Hypothesis Of Accepted Mutations For Pairs Of Adjacent Amino Acids In Protein Sequences, Jyotsna Ramanan, Peter Revesz Jul 2017

School of Computing: Faculty Publications

Evolutionary studies usually assume that the genetic mutations are independent of each other. However, that does not imply that the observed mutations are independent of each other because it is possible that when a nucleotide is mutated, then it may be biologically beneficial if an adjacent nucleotide mutates too. With a number of decoded genes currently available in various genome libraries and online databases, it is now possible to have a large-scale computer-based study to test whether the independence assumption holds for pairs of adjacent amino acids. Hence the independence question also arises for pairs of adjacent amino acids within …


Homestead National Monument Of America, Bat Acoustic Monitoring, September 2016, Daniel S. Licht Mar 2017

United States National Park Service: Publications

Abstract

Homestead National Monument of America is a 211-acre park located in an agrarian landscape in southeastern Nebraska. From September 16 to October 1, 2016, park staff deployed acoustic monitors at three sites in the park for purposes of monitoring night-time bat activity. The three sites averaged 179, 48, and 33 bat detections per night. Night-time bat activity was generally highest in the 1-2 hours following sunset.

Based on the acoustic surveys the big brown (Eptesicus fuscus), eastern red (Lasiurus borealis), northern long-eared (Myotis septentrionalis) and evening bats (Nycticeius humeralis) were present at the …


BioSIMP: Using Software Testing Techniques For Sampling And Inference In Biological Organisms, Mikaela Cashman, Jennie L. Catlett, Myra B. Cohen, Nicole R. Buan, Zahmeeth Sakkaff, Massimiliano Pierobon, Christine A. Kelley Jan 2017

CSE Conference and Workshop Papers

Years of research in software engineering have given us novel ways to reason about, test, and predict the behavior of complex software systems that contain hundreds of thousands of lines of code. Many of these techniques have been inspired by nature such as genetic algorithms, swarm intelligence, and ant colony optimization. In this paper we reverse the direction and present BioSIMP, a process that models and predicts the behavior of biological organisms to aid in the emerging field of systems biology. It utilizes techniques from testing and modeling of highly-configurable software systems. Using both experimental and simulation data we show …


Incremental Phylogenetics By Repeated Insertions: An Evolutionary Tree Algorithm, Peter Revesz, Zhiqiang Li Aug 2016

School of Computing: Faculty Publications

We introduce the idea of constructing hypothetical evolutionary trees using an incremental algorithm that inserts species one-by-one into the current evolutionary tree. The method of incremental phylogenetics by repeated insertions leads to an algorithm that can be used on DNA, RNA and amino acid sequences. According to experimental results on both synthetic and biological data, the new algorithm generates more accurate evolutionary trees than the UPGMA and the Neighbor Joining algorithms.


Use Of Clustering Techniques For Protein Domain Analysis, Eric Rodene Jul 2016

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Next-generation sequencing has allowed many new protein sequences to be identified. However, this expansion of sequence data limits the ability to determine the structure and function of most of these newly-identified proteins. Inferring the function and relationships between proteins is possible with traditional alignment-based phylogeny. However, this requires at least one shared subsequence. Without such a subsequence, no meaningful alignments between the protein sequences are possible. The entire protein set (or proteome) of an organism contains many unrelated proteins. At this level, the necessary similarity does not occur. Therefore, an alternative method of understanding relationships within diverse sets of proteins …


ChIPathlon: A Competitive Assessment For Gene Regulation Tools, Avi Knecht, Adam Caprez, Istvan Ladunga Apr 2016

UCARE Research Products

When gene regulation of the cell cycle malfunctions, it frequently causes cancer.

Adult, differentiated cells can be reprogrammed to induced pluripotent stem cells, which can then be reprogrammed to heart muscle, skin, etc., to repair damaged tissue (to a limited extent in clinical practice).

ChIPathlon: Evaluate the performance of all transcription factor mapping (peak calling) methods. To this end, we will develop a scalable and easy-to-use supercomputing pipeline to stage data, compare many different peak calling and differential binding site tools, and store all results in a single database.


A Mitochondrial DNA-Based Computational Model Of The Spread Of Human Populations, Peter Revesz Mar 2016

School of Computing: Faculty Publications

This paper presents a mitochondrial DNA-based computational model of the spread of human populations. The computational model is based on a new measure of the relatedness of two populations, both of which may be heterogeneous in terms of their set of mtDNA haplogroups. The measure gives the similarity of two haplogroups a weight that increases exponentially with the number of levels they share in the mtDNA classification tree. In an experiment, the computational model is applied to the study of the relatedness of seven human populations ranging from the Neolithic through the Bronze Age to the present. The human populations included in …
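
As a rough illustration of a weight that grows exponentially with the number of shared classification levels, the following Python sketch compares two haplogroup labels. It is not the paper's measure: the function names, the base of 2, the zero weight for unrelated labels, and the rule of splitting a label such as `H1a1` into alternating letter and digit levels are all assumptions made for this example.

```python
import re


def levels(haplogroup):
    """Split a label such as 'H1a1' into levels ['H', '1', 'a', '1'].

    Assumption: each run of uppercase letters, lowercase letters, or
    digits is one level of the classification tree.
    """
    return re.findall(r"[A-Z]+|[a-z]+|\d+", haplogroup)


def shared_levels(a, b):
    """Count the leading levels that two haplogroup labels have in common."""
    count = 0
    for x, y in zip(levels(a), levels(b)):
        if x != y:
            break
        count += 1
    return count


def similarity_weight(a, b, base=2.0):
    """Hypothetical weight: base ** shared levels, or 0.0 if none are shared."""
    shared = shared_levels(a, b)
    return base ** shared if shared > 0 else 0.0
```

Under these assumptions, `H1a1` and `H1b` share the levels `H` and `1`, so their weight is `2.0 ** 2`, while `H1a` and `U5b` share nothing and receive a weight of zero.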


Mutations Of Adjacent Amino Acid Pairs Are Not Always Independent, Jyotsna Ramanan, Peter Revesz Oct 2015

CSE Conference and Workshop Papers

Evolutionary studies usually assume that the genetic mutations are independent of each other. This paper tests the independence hypothesis for genetic mutations with regard to protein coding regions. According to the new experimental results the independence assumption generally holds, but there are certain exceptions. In particular, the coding regions that represent two adjacent amino acids seem to change in ways that sometimes deviate significantly from the expected theoretical probability under the independence assumption.
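
The comparison described in this abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the authors' procedure: the function name and the input format (one pair of booleans per observed adjacent amino acid pair, recording whether the first and the second position changed) are hypothetical. Under independence, the probability that both positions change should be close to the product of the two individual change probabilities.

```python
def joint_vs_expected(observations):
    """Compare observed and expected rates of simultaneous change.

    observations: list of (first_changed, second_changed) booleans,
    one tuple per observed adjacent amino acid pair (assumed format).
    Returns (observed joint frequency, expected frequency under independence).
    """
    n = len(observations)
    if n == 0:
        raise ValueError("at least one observation is required")
    p_first = sum(a for a, _ in observations) / n
    p_second = sum(b for _, b in observations) / n
    p_both = sum(a and b for a, b in observations) / n
    # Under independence, P(both change) = P(first) * P(second).
    return p_both, p_first * p_second


# Made-up example data, for illustration only.
example = (
    [(True, True)] * 3
    + [(True, False)]
    + [(False, True)]
    + [(False, False)] * 5
)
observed, expected = joint_vs_expected(example)
```

A large gap between the two returned values would suggest a deviation from independence; deciding whether the gap is significant requires a statistical test, which this sketch leaves out.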


An Incremental Phylogenetic Tree Algorithm Based On Repeated Insertions Of Species, Peter Revesz, Zhiqiang Li Oct 2015

CSE Conference and Workshop Papers

In this paper, we introduce a new phylogenetic tree algorithm that generates phylogenetic trees by repeatedly inserting species one-by-one. The incremental phylogenetic tree algorithm can work on proteins or DNA sequences. Computer experiments show that the new algorithm is better than the commonly used UPGMA and Neighbor Joining algorithms.


Bioinformatic Game Theory And Its Application To Cluster Multi-Domain Proteins, Brittney Keel May 2015

Department of Mathematics: Dissertations, Theses, and Student Research

The exact evolutionary history of any set of biological sequences is unknown, and all phylogenetic reconstructions are approximations. The problem becomes harder when one must consider a mix of vertical and lateral phylogenetic signals. In this dissertation we propose a game-theoretic approach to clustering biological sequences and analyzing their evolutionary histories. In this context we use the term evolution as a broad descriptor for the entire set of mechanisms driving the inherited characteristics of a population. The key assumption in our development is that evolution tries to accommodate the competing forces of selection, of which the conservation force seeks to …


New Statistical Methods For Analysis Of Historical Data From Wildlife Populations, Trevor Hefley Mar 2014

Department of Statistics: Dissertations, Theses, and Student Work

Wildlife biologists, often with the help of ordinary citizens, have developed and maintained long-term datasets for monitoring the status of wildlife populations. These datasets can range from a collection of citizen-reported sightings of a rare species to datasets collected by biologists using standardized methods. The commonality is that these datasets span a temporal and spatial scale that is beyond the scope of most scientific studies. Ensuring the continued persistence of wildlife populations requires predictions of the impact of human actions. Regardless of whether the predictions are quantitative or qualitative, the best we can do is use the past data to …


Information In Biological Systems And The Fluctuation Theorem, Yaşar Demirel Jan 2014

Department of Chemical and Biomolecular Engineering: Faculty Publications

Some critical trends in information theory, its role in living systems and utilization in fluctuation theory are discussed. The mutual information of thermodynamic coupling is incorporated into the generalized fluctuation theorem by using information theory and nonequilibrium thermodynamics. Thermodynamically coupled dissipative structures in living systems are capable of degrading more energy, and processing complex information through developmental and environmental constraints. The generalized fluctuation theorem can quantify the hysteresis observed in the amount of the irreversible work in nonequilibrium regimes in the presence of information and thermodynamic coupling.


Finding Them Before They Find Us: Informatics, Parasites, And Environments In Accelerating Climate Change, Daniel R. Brooks, Eric P. Hoberg, Walter A. Boeger, Scott Lyell Gardner, Kurt E. Galbreath, David Herczeg, Hugo H. Mejía-Madrid, S. Elizabeth Rácz, Altangerel Tsogtsaikhan Dursahinhan Jan 2014

Harold W. Manter Laboratory of Parasitology: Faculty and Staff Publications

Parasites are agents of disease in humans, livestock, crops, and wildlife and are powerful representations of the ecological and historical context of the diseases they cause. Recognizing a nexus of professional opportunities and global public need, we gathered at the Cedar Point Biological Station of the University of Nebraska in September 2012 to formulate a cooperative and broad platform for providing essential information about the evolution, ecology, and epidemiology of parasites across host groups, parasite groups, geographical regions, and ecosystem types. A general protocol, documentation–assessment–monitoring–action (DAMA), suggests an integrated proposal to build a proactive capacity to understand, anticipate, and respond …


Anthropogenics: Human Influence On Global And Genetic Homogenization Of Parasite Populations, Dante S. Zarlenga, Eric P. Hoberg, Benjamin Rosenthal, Simonetti Mattiucci, Giuseppe Nascetti Jan 2014

Harold W. Manter Laboratory of Parasitology: Faculty and Staff Publications

The distribution, abundance, and diversity of life on Earth have been greatly shaped by human activities. This includes the geographic expansion of parasites; however, measuring the extent to which humans have influenced the dissemination and population structure of parasites has been challenging. In-depth comparisons among parasite populations extending to landscape-level processes affecting disease emergence have remained elusive. New research methods have enhanced our capacity to discern human impact, where the tools of population genetics and molecular epidemiology have begun to shed light on our historical and ongoing influence. Only since the 1990s have parasitologists coupled morphological diagnosis, long considered the …


Utilizing NMR Spectroscopy And Molecular Docking As Tools For The Structural Determination And Functional Annotation Of Proteins, Jaime Stark Feb 2013

Department of Chemistry: Dissertations, Theses, and Student Research

With the completion of the Human Genome Project in 2001 and the subsequent explosion of organisms with sequenced genomes, we are now aware of nearly 28 million proteins. Determining the role of each of these proteins is essential to our understanding of biology and the development of medical advances. Unfortunately, the experimental approaches to determine protein function are too slow to investigate every protein. Bioinformatics approaches, such as sequence and structure homology, have helped to annotate the functions of many similar proteins. However, despite these computational approaches, approximately 40% of proteins still have no known function. Alleviating this deficit will …


Validation Of PCR-Based Assays And Laboratory Accreditation For Environmental Detection Of Aquatic Invasive Species, Invasive Species Advisory Committee May 2012

National Invasive Species Council

This white paper provides:

a) Background information on the use, accuracy and reliability of PCR-based assays such as environmentally sampled DNA (eDNA) for early detection of aquatic invasive species (AIS) and;

b) Recommendations for establishing a system for validating assays and accrediting laboratories that report on the presence or absence of AIS.

This white paper was developed by the members of ISAC and discusses the need for developing validation requirements for Polymerase Chain Reaction (PCR) and other DNA-based molecular assays that are increasingly being used to detect AIS. It does not provide a simplified checklist for evaluation of their ability …


A Study Of Correlations Between The Definition And Application Of The Gene Ontology, Yuji Mo Dec 2011

Department of Computer Electronics and Engineering: Dissertations, Theses, and Student Research

When using the Gene Ontology (GO), nucleotide and amino acid sequences are annotated by terms in a structured and controlled vocabulary organized into relational graphs. The usage of the vocabulary (GO terms) in the annotation of these sequences may diverge from the relations defined in the ontology. We measure the consistency of the use of GO terms by comparing GO's defined structure to the terms' application. To do this, we first use synthetic data with different characteristics to understand how these characteristics influence the correlation values determined by various similarity measures. Using these results as a baseline, we found that …


Protein Structure – Based Method For Identification Of Horizontal Gene Transfer In Bacteria, Swetha Billa May 2011

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Horizontal Gene Transfer (HGT) is defined as the movement of genetic material from one strain of species to another. Bacteria, being asexual organisms, were long believed to transfer genes only vertically. But recent studies provide evidence that bacteria can also transfer genes horizontally.

HGT plays a major role in evolution and medicine. It is the major contributor to bacterial evolution, enabling species to acquire genes to adapt to new environments. Bacteria are also believed to develop resistance to antibiotics through HGT. Therefore, further study of HGT and its implications is necessary to understand the effects …


Computational Complexity Of Approximate And Precise Data With Constraint Automaton, Dipty Singh Apr 2011

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The DNA molecules packaged in structures called chromosomes within the cells of living organisms encode hereditary information that is passed on to their offspring. Using transcription and translation, the genes within these DNA molecules help in protein synthesis. Thus chromosomal DNA serves as a blueprint for the chemical processes of life.

In order to analyze a DNA sequence by currently available technology, we have to cut it into small fragments, e.g. by using restriction enzymes. The application of different restriction enzymes to the multiple copies of the same DNA sequence generates many overlapping fragments. In order to construct the original …


GenBank, Dennis A. Benson, Ilene Karasch-Mizrachi, David J. Lipman, James Ostell, Eric W. Sayers Jan 2010

Harold W. Manter Laboratory: Library Materials

GenBank(R) is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data …


Biological Sequence Simulation For Testing Complex Evolutionary Hypotheses: Indel-Seq-Gen Version 2.0, Cory L. Strope Dec 2009

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Reconstructing the evolutionary history of biological sequences will provide a better understanding of mechanisms of sequence divergence and functional evolution. Long-term sequence evolution includes not only substitutions of residues but also more dynamic changes such as insertion, deletion, and long-range rearrangements. Such dynamic changes make reconstructing sequence evolution history difficult and affect the accuracy of molecular evolutionary methods, such as multiple sequence alignments (MSAs) and phylogenetic methods. In order to test the accuracy of these methods, benchmark datasets are required. However, currently available benchmark datasets have limitations in their sizes and evolutionary histories of the included sequences are unknown. These …


Classification, Clustering And Data-Mining Of Biological Data, Thomas Triplet Nov 2009

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The proliferation of biological databases and the easy access enabled by the Internet are having a beneficial impact on biological sciences and transforming the way research is conducted. There are currently over 1100 molecular biology databases dispersed throughout the Internet. However, very few of them integrate data from multiple sources. To assist in the functional and evolutionary analysis of the abundant number of novel proteins, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database that integrates data from various biological sources. PROFESS is freely available at http://cse.unl.edu/~profess/. Our database is designed to be versatile and expandable and will not …