Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

782 Full-Text Articles, 1,845 Authors, 200,538 Downloads, 108 Institutions

All Articles in Computational Biology

782 full-text articles. Page 22 of 33.

Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang 2016 Fox Chase Cancer Center

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high dimensional data sets and complex data structures, its computational cost is high and sometimes could be critical for …


Structural Basis For Mutation-Induced Destabilization Of Profilin 1 In ALS, Sivakumar Boopathy, Tania Silvas, Maeve Tischbein, Silvia Jansen, Shivender Shandilya, Jill Zitzewitz, John Landers, Bruce Goode, Celia Schiffer, Daryl Bosco 2016 University of Massachusetts Medical School

Celia A. Schiffer

Mutations in profilin 1 (PFN1) are associated with amyotrophic lateral sclerosis (ALS); however, the pathological mechanism of PFN1 in this fatal disease is unknown. We demonstrate that ALS-linked mutations severely destabilize the native conformation of PFN1 in vitro and cause accelerated turnover of the PFN1 protein in cells. This mutation-induced destabilization can account for the high propensity of ALS-linked variants to aggregate and also provides rationale for their reported loss-of-function phenotypes in cell-based assays. The source of this destabilization is illuminated by the X-ray crystal structures of several PFN1 proteins, revealing an expanded cavity near the protein core of the …


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret 2016 University of Washington - Seattle Campus

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Functional CAR Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris 2016 The University of Texas M.D. Anderson Cancer Center

Jeffrey S. Morris

We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on …


System Genetic Analysis Of Mechanisms Underlying Excessive Alcohol Consumption, Maren L. Smith 2016 Virginia Commonwealth University

Theses and Dissertations

Increased alcohol consumption over time is one of the characteristic symptoms of Alcohol Use Disorder (AUD). The molecular mechanisms underlying this escalation in intake are still the subject of study. However, the mesocortical and mesolimbic dopamine pathways, and the extended amygdala, because of their involvement in reward and reinforcement, are believed to play key roles in these behavioral changes. Multiple gene expression studies have shown that alcohol affects the expression of thousands of genes in the brain. The studies discussed in this document use the systems biology technique of co-expression network analysis to attempt to find patterns within genome-wide expression …


Ten Simple Rules For Digital Data Storage, E. M. Hart, P. Barmby, D. LeBauer, F. Michonneau, S. Mount, P. Mulrooney, T. Poisot, K. H. Woo, Naupaka B. Zimmerman, J. W. Hollister 2016 University of San Francisco

Biology Faculty Publications

No abstract provided.


Genomic Prediction Of Gene Bank Wheat Landraces, José Crossa, Diego Jarquin, Jorge Franco, Paulino Pérez-Rodríguez, Juan Burgueño, Carolina Saint-Pierre, Prashant Vikram, Carolina Sansaloni, Cesar Petroli, Denis Akdemir, Clay Sneller, Matthew Reynolds, Maria Tattaris, Thomas Payne, Carlos Guzman, Roberto J. Peña, Peter Wenzl, Sukhwinder Singh 2016 International Maize and Wheat Improvement Center (CIMMYT)

Department of Agronomy and Horticulture: Faculty Publications

This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH), and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% …


A Pipeline For Creation Of Genome-Scale Metabolic Reconstructions, Shaun W. Norris 2016 Virginia Commonwealth University

Theses and Dissertations

The decreasing costs of next generation sequencing technologies and the increasing speeds at which they work have led to an abundance of 'omic datasets. The need for tools and methods to analyze, annotate, and model these datasets to better understand biological systems is growing. Here we present a novel software pipeline to reconstruct the metabolic model of an organism in silico starting from its genome sequence and a novel compilation of biological databases to better serve the generation of metabolic models. We validate these methods using five Gardnerella vaginalis strains and compare the gene annotation results to NCBI and the …


Characterization Of Somatically-Eliminated Genes During Development Of The Sea Lamprey (Petromyzon Marinus), Stephanie A. Bryant 2016 University of Kentucky

Theses and Dissertations--Biology

The sea lamprey (Petromyzon marinus) undergoes programmed genome rearrangements (PGRs) during early development that facilitate the elimination of ~20% of the genome from the somatic cell lineage, resulting in distinct somatic and germline genomes. To improve our understanding of the evolutionary/developmental logic of PGR, we generated computational predictions to identify candidate germline-specific genes within a transcriptomic dataset derived from adult germline and the embryonic stages encompassing PGR. Validation studies identified 44 germline-specific genes and characterized patterns of transcription and DNA loss during early embryogenesis. Expression analyses reveal that several of these genes are differentially expressed during early embryogenesis …


Resolving Gnetum Evolutionary History, Angela McFadden 2016 Central Washington University

All Master's Theses

Gnetum are non-flowering seed plants of the tropics, indigenous to South America, Africa, and Asia. This group of about 40 species is fascinating to botanists because it shares distinctive morphological characteristics with flowering plants, such as broad leaves, woody stems, and flower-like strobili. There are still questions surrounding the relationships within the genus of Gnetum. With that in mind, I focused my work on generating phylogenetic hypotheses, using two molecular data sets: a concatenation of over 60 different chloroplast genes (66,815 base pairs), and the whole chloroplast genome (128,772 base pairs). This allowed me to compare the two phylogenies …


Deep Models For Brain EM Image Segmentation: Novel Insights And Improved Performance, Ahmed Fakhry, Hanchuan Peng, Shuiwang Ji 2016 Old Dominion University

Computer Science Faculty Publications

Motivation: Accurate segmentation of brain electron microscopy (EM) images is a critical step in dense circuit reconstruction. Although deep neural networks (DNNs) have been widely used in a number of applications in computer vision, most of the models that have proved effective on image classification tasks cannot be applied directly to EM image segmentation, due to the different objectives of these tasks. As a result, it is desirable to develop an optimized architecture that uses the full power of DNNs and is tailored specifically for EM image segmentation.

Results: In this work, we proposed a novel design of DNNs for …


Finding Function In The Unknown, Kelly Boyd, Emma Highland, Amanda Misch, Amber Hu, Sushma Reddy, Catherine Putonti 2015 Loyola University Chicago

Bioinformatics Faculty Publications

Through high-throughput RNA sequencing (RNAseq), transcriptomes for a single cell, tissue, or organism(s) can be ascertained at a high resolution. While a number of bioinformatic tools have been developed for transcriptome analyses, significant challenges exist for studies of non-model organisms. Without a reference sequence available, raw reads must first be assembled de novo followed by the tedious task of BLAST searches and data mining for functional information. We have created a pipeline, PyRanger, to automate this process. The pipeline includes functionality to assess a single transcriptome and also facilitate comparative transcriptomic studies.


Leveraging Global Gene Expression Patterns To Predict Expression Of Unmeasured Genes, James Rudd, René A. Zelaya, Eugene Demidenko, Ellen L. Goode, Casey S. Greene, Jennifer A. Doherty 2015 Dartmouth College

Dartmouth Scholarship

Background: Large collections of paraffin-embedded tissue represent a rich resource to test hypotheses based on gene expression patterns; however, measurement of genome-wide expression is cost-prohibitive on a large scale. Using the known expression correlation structure within a given disease type (in this case, high grade serous ovarian cancer; HGSC), we sought to identify reduced sets of directly measured (DM) genes which could accurately predict the expression of a maximized number of unmeasured genes.


Identifying Gene-Gene Interactions That Are Highly Associated With Body Mass Index Using Quantitative Multifactor Dimensionality Reduction (QMDR), Rishika De, Shefali S. Verma, Fotios Drenos, Emily R. Holzinger 2015 Dartmouth College

Dartmouth Scholarship

Despite heritability estimates of 40–70% for obesity, less than 2% of its variation is explained by Body Mass Index (BMI) associated loci that have been identified so far. Epistasis, or gene-gene interaction, is a plausible source to explain portions of the missing heritability of BMI. Using genotypic data from 18,686 individuals across five study cohorts – ARIC, CARDIA, FHS, CHS, MESA – we filtered SNPs (Single Nucleotide Polymorphisms) using two parallel approaches. SNPs were filtered either on the strength of their main effects of association with BMI, or on the number of knowledge sources supporting a specific SNP-SNP interaction in …


The Importance Of Physicochemical Characteristics And Nonlinear Classifiers In Determining HIV-1 Protease Specificity, Timmy Manning, Paul Walsh 2015 Department of Computer Science, Cork Institute of Technology, Cork, Ireland

Department of Biological Sciences Publications

This paper reviews recent research relating to the application of bioinformatics approaches to determining HIV-1 protease specificity, outlines outstanding issues, and presents a new approach to addressing these issues. Leading machine learning theory for the problem currently suggests that the direct encoding of the physicochemical properties of the amino acid substrates is not required for optimal performance. A number of amino acid encoding approaches which incorporate potentially relevant physicochemical properties of the substrate are identified, and are evaluated using a nonlinear task decomposition based neuroevolution algorithm. The results are evaluated, and compared against a recent benchmark set on a nonlinear …


Multipartite Graph Algorithms For The Analysis Of Heterogeneous Data, Charles Alexander Phillips 2015 University of Tennessee - Knoxville

Doctoral Dissertations

The explosive growth in the rate of data generation in recent years threatens to outpace the growth in computer power, motivating the need for new, scalable algorithms and big data analytic techniques. No field may be more emblematic of this data deluge than the life sciences, where technologies such as high-throughput mRNA arrays and next generation genome sequencing are routinely used to generate datasets of extreme scale. Data from experiments in genomics, transcriptomics, metabolomics and proteomics are continuously being added to existing repositories. A goal of exploratory analysis of such omics data is to illuminate the functions and relationships of …


Application Of Hidden Markov Model Based Methods For Gaining Insights Into Protein Domain Evolution And Function, Amit Anil Upadhyay 2015 University of Tennessee - Knoxville

Doctoral Dissertations

With the explosion in the amount of available sequence data, computational methods have become indispensable for studying proteins. Domains are the fundamental structural, functional and evolutionary units that make up proteins. Studying protein domains is an important part of understanding protein function and evolution. Hidden Markov Models (HMM) are one of the most successful methods that have been applied for protein sequence and structure analysis. In this study, HMM based methods were applied to study the evolution of sensory domains in microbial signal transduction systems as well as functional characterization and identification of cellulases in metagenomics datasets. Use of HMM …


Applications Of Evolutionary Bioinformatics In Basic And Biomedical Research, Ogun Adebali 2015 University of Tennessee - Knoxville

Doctoral Dissertations

With the revolutionary progress in sequencing technologies, computational biology has emerged as a game-changing field that is applied to understanding the molecular events of life, for exploratory as well as complementary purposes. Bioinformatics resources and tools significantly help in data generation, organization and analysis. However, there is still a need to develop new approaches built from a biologist’s point of view. In protein bioinformatics, there are several fundamental problems such as (i) determining protein function; (ii) identifying protein-protein interactions; (iii) predicting the effect of amino acid variants. Here, I present three chapters addressing these problems from an evolutionary perspective. Firstly, …


A Survey Of The Common Loon (Gavia Immer) Genome Reveals Patterns Of Natural Selection, Zach G. Gayk 2015 Northern Michigan University

All NMU Master's Theses

With rapid advances in Next-Generation Sequencing technology, comparative genomics has become a viable method for studying the adaptation of species to their environment at the genome level. I investigated this in common loons (Gavia immer)—for which molecular adaptation has not been characterized—by finding signatures of positive selection as evidence for genomic adaptation.

I used Illumina short read sequencing data from a single female common loon to produce a fragmented assembly of the common loon (Gavia immer) genome. The resulting assembly had a contig N50 of 814 bp, a total length of 767,326,331 bp, and 45.7% …


Utilizing In Silico And/Or Native ESI Approaches To Provide New Insights On Haptoglobin/Globin And Haptoglobin/Receptor Interactions, Ololade Fatunmbi 2015 University of Massachusetts Amherst

Doctoral Dissertations

Haptoglobin (Hp), an acute phase protein, binds free hemoglobin (Hb) dimers in one of the strongest non-covalent interactions known in biology. This interaction prevents Hb from causing potentially severe oxidative damage and from limiting nitric oxide bioavailability. Once Hb/Hp complexes are formed, they proceed to bind CD163, a cell surface receptor on macrophages, leading to complex internalization and catabolism. Myoglobin (Mb), a monomeric protein that is normally found in the muscle but can be released into the blood in high concentrations during myocardial injury, is homologous to Hb and shares many conserved Hb/Hp interface residues. Both monomeric Hb and Mb species …

