Life Sciences | Open Access Articles | Digital Commons Network™

Motif-Cluster: A Spatial Clustering Package For Repetitive Motif Binding Patterns, Mengyuan Zhou Nov 2023

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Previous efforts in using genome-wide analysis of transcription factor binding sites (TFBSs) have overlooked the importance of ranking potentially significant regulatory regions, especially those with repetitive binding within a local region. Identifying these homogeneous binding sites is critical because they have the potential to amplify the binding affinity and regulatory activity of transcription factors, impacting gene expression and cellular functions. To address this issue, we developed an open-source tool, Motif-Cluster, that prioritizes and visualizes transcription factor regulatory regions by incorporating the idea of local motif clusters. Motif-Cluster can rank the significant transcription factor regulatory regions without the need for experimental …

Sequence-Based Bioinformatics Approaches To Predict Virus-Host Relationships In Archaea And Eukaryotes, Yingshan Li Dec 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Viral metagenomics is independent of lab culturing and capable of investigating the viromes of virtually any environmental niche. While numerous viral genome sequences have been assembled from metagenomic studies over the past years, the natural hosts for the majority of these viral contigs have not been determined. Different computational approaches have been developed to predict the hosts of bacteriophages. Nevertheless, little progress has been made in virus-host prediction, especially for viruses that infect eukaryotes and archaea. In this study, by analyzing all documented viruses with known eukaryotic and archaeal hosts, we assessed the predictive power of four computational …

A Pipeline To Generate Deep Learning Surrogates Of Genome-Scale Metabolic Models, Achilles Rasquinha Nov 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Genome-Scale Metabolic Models (GEMMs) are powerful reconstructions of biological systems that help metabolic engineers understand and predict growth conditions subjected to various environmental factors around the cellular metabolism of an organism in observation, purely in silico. Applications of metabolic engineering range from perturbation analysis and drug-target discovery to predicting growth rates of biotechnologically important metabolites and reaction objectives within different single-cell and multi-cellular organism types. GEMMs use mathematical frameworks for quantitative estimations of flux distributions within metabolic networks. The reasons behind why an organism activates, stuns, or fluctuates between alternative pathways for growth and survival, however, remain relatively unknown. GEMMs …

Comparative Analyses Of De Novo Transcriptome Assembly Pipelines For Diploid Wheat, Natasha Pavlovikj May 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Gene expression and transcriptome analysis are currently among the main research focuses of a great number of scientists. However, the assembly of raw sequence data to obtain a draft transcriptome of an organism is a complex multi-stage process usually composed of pre-processing, assembling, and post-processing. Each of these stages includes multiple steps such as data cleaning, error correction, and assembly validation. Different combinations of steps, as well as different computational methods for the same step, generate transcriptome assemblies with different accuracy. Thus, using a combination that generates more accurate assemblies is crucial for any novel biological discoveries. Implementing …

A Data-Driven Approach For Detecting Stress In Plants Using Hyperspectral Imagery, Suraj Gampa May 2019

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

A phenotype is an observable characteristic of an individual and is a function of its genotype and its growth environment. Individuals with different genotypes are impacted differently by exposure to the same environment. Therefore, phenotypes are often used to understand morphological and physiological changes in plants as a function of genotype and biotic and abiotic stress conditions. Phenotypes that measure the level of stress can help mitigate the adverse impacts on the growth cycle of the plant. Image-based plant phenotyping has the potential for early stress detection by means of computing responsive phenotypes in a non-intrusive manner. A large number …

Use Of Clustering Techniques For Protein Domain Analysis, Eric Rodene Jul 2016

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Next-generation sequencing has allowed many new protein sequences to be identified. However, this expansion of sequence data limits the ability to determine the structure and function of most of these newly-identified proteins. Inferring the function and relationships between proteins is possible with traditional alignment-based phylogeny. However, this requires at least one shared subsequence. Without such a subsequence, no meaningful alignments between the protein sequences are possible. The entire protein set (or proteome) of an organism contains many unrelated proteins. At this level, the necessary similarity does not occur. Therefore, an alternative method of understanding relationships within diverse sets of proteins …

Clustering And Classification Of Multi-Domain Proteins, Neethu Shah Dec 2013

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Rapid development of next-generation sequencing technology has led to unprecedented growth in protein sequence data repositories over the last decade. The majority of these proteins lack structural and functional characterization. This necessitates the design and development of fast, efficient, and sensitive computational tools and algorithms that can classify these proteins into functionally coherent groups.

Domains are fundamental units of protein structure and function. Multi-domain proteins are extremely complex compared with proteins that have a single domain or none. They exhibit complex, network-like evolutionary events such as domain shuffling, domain loss, and domain gain. These events, therefore, cannot be represented in the …

Protein Structure – Based Method For Identification Of Horizontal Gene Transfer In Bacteria, Swetha Billa May 2011

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Horizontal Gene Transfer (HGT) is defined as the movement of genetic material from one strain or species to another. Bacteria, being asexual organisms, were long believed to transfer genes only vertically. However, recent studies provide evidence that bacteria can also transfer genes horizontally.

HGT plays a major role in evolution and medicine. It is the major contributor to bacterial evolution, enabling species to acquire genes that help them adapt to new environments. Bacteria are also believed to develop resistance to antibiotics through HGT. Therefore, further study of HGT and its implications is necessary to understand the effects …

Computational Complexity Of Approximate And Precise Data With Constraint Automaton, Dipty Singh Apr 2011

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The DNA molecules packaged in structures called chromosomes within the cells of living organisms encode hereditary information that is passed on to their offspring. Using transcription and translation, the genes within these DNA molecules help in protein synthesis. Thus chromosomal DNA serves as a blueprint for the chemical processes of life.

In order to analyze a DNA sequence by currently available technology, we have to cut it into small fragments, e.g. by using restriction enzymes. The application of different restriction enzymes to the multiple copies of the same DNA sequence generates many overlapping fragments. In order to construct the original …

Biological Sequence Simulation For Testing Complex Evolutionary Hypotheses: Indel-Seq-Gen Version 2.0, Cory L. Strope Dec 2009

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Reconstructing the evolutionary history of biological sequences will provide a better understanding of the mechanisms of sequence divergence and functional evolution. Long-term sequence evolution includes not only substitutions of residues but also more dynamic changes such as insertions, deletions, and long-range rearrangements. Such dynamic changes make reconstructing the history of sequence evolution difficult and affect the accuracy of molecular evolutionary methods, such as multiple sequence alignments (MSAs) and phylogenetic methods. In order to test the accuracy of these methods, benchmark datasets are required. However, currently available benchmark datasets are limited in size, and the evolutionary histories of the included sequences are unknown. These …

Classification, Clustering And Data-Mining Of Biological Data, Thomas Triplet Nov 2009

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The proliferation of biological databases and the easy access enabled by the Internet are having a beneficial impact on the biological sciences and transforming the way research is conducted. There are currently over 1100 molecular biology databases dispersed throughout the Internet. However, very few of them integrate data from multiple sources. To assist in the functional and evolutionary analysis of the abundant number of novel proteins, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database that integrates data from various biological sources. PROFESS is freely available at http://cse.unl.edu/~profess/. Our database is designed to be versatile and expandable and will not …
