Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 30 of 36
Full-Text Articles in Life Sciences
Saccharomyces Genome Database & UniProt Bioinformatics Analysis, Ray A. Enke
Ray Enke Ph.D.
Fast And Space-Efficient Location Of Heavy Or Dense Segments In Run-Length Encoded Sequences, Ronald I. Greenberg
Ronald Greenberg
This paper considers several variations of an optimization problem with potential applications in such areas as biomolecular sequence analysis and image processing. Given a sequence of items, each with a weight and a length, the goal is to find a subsequence of consecutive items of optimal value, where value is either total weight or total weight divided by total length. There may also be a specified lower and/or upper bound on the acceptable length of subsequences. This paper shows that all the variations of the problem are solvable in linear time and space even with non-uniform item lengths and divisible …
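The simplest variation described above (maximize total weight, no length bounds) can be illustrated with the classic Kadane scan. This is only a minimal sketch and not the paper's algorithm: it does not handle length bounds, the weight-per-length (density) objective, divisible items, or run-length-encoded input. The function name `heaviest_segment` and the `(weight, length)` tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple

def heaviest_segment(items: List[Tuple[float, float]]) -> Tuple[float, float, int, int]:
    """Find the non-empty run of consecutive items with maximum total weight.

    Each item is a (weight, length) pair (hypothetical layout).
    Returns (total_weight, total_length, start, end) with `end` exclusive.
    Runs in linear time and constant extra space (Kadane's scan).
    """
    if not items:
        raise ValueError("items must be non-empty")

    best_sum = cur_sum = items[0][0]
    best_start, best_end = 0, 1
    cur_start = 0

    for i in range(1, len(items)):
        weight = items[i][0]
        if cur_sum < 0:
            # A negative running total can never help; restart at item i.
            cur_sum = weight
            cur_start = i
        else:
            cur_sum += weight
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i + 1

    total_length = sum(length for _, length in items[best_start:best_end])
    return best_sum, total_length, best_start, best_end


if __name__ == "__main__":
    demo = [(2, 1), (-3, 2), (4, 1), (-1, 3), (5, 2), (-6, 1)]
    print(heaviest_segment(demo))
```

The returned total length is what a density variant would divide by; handling that objective and the length bounds efficiently is where the paper's contribution lies.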
A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of HIV-1, Steven Reisman, Thomas Hatzopoulous, Konstantin Läufer, George K. Thiruvathukal, Catherine Putonti
Konstantin Läufer
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …
A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer
Konstantin Läufer
RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …
A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of HIV-1, Steven Reisman, Thomas Hatzopoulous, Konstantin Läufer, George K. Thiruvathukal, Catherine Putonti
Catherine Putonti
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …
A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer
Catherine Putonti
RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …
Using Phylogenetically-Informed Annotation (PIA) To Search For Light-Interacting Genes In Transcriptomes From Non-Model Organisms, Daniel I. Speiser, M. Sabrina Pankey, Alexander K. Zaharoff, Barbara A. Battelle, Heather D. Bracken-Grissom, Jesse W. Breinholt, Seth M. Bybee, Thomas W. Cronin, Anders Garm, Annie R. Lindgren, Nipam H. Patel, Megan L. Porter, Meredith E. Protas, Anja S. Rivera, Jeanne M. Serb, Kirk S. Zigler, Keith A. Crandall, Todd H. Oakley
Meredith Protas
Background: Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to …
A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of HIV-1, Steven Reisman, Thomas Hatzopoulous, Konstantin Läufer, George K. Thiruvathukal, Catherine Putonti
George K. Thiruvathukal
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …
Statistical Contributions To Bioinformatics: Design, Modeling, Structure Learning, And Integration, Jeffrey S. Morris, Veera Baladandayuthapani
Jeffrey S. Morris
Using RStudio For Manipulating And Visualizing Data (Updated 11/17), Ray A. Enke, Bejan A. Rasoul
Ray Enke Ph.D.
Genomics RNA-Seq Analysis Part 2_ Kallisto Indexing And Quantification (Updated 11/17), Ray A. Enke, Melika Rahmani-Mofrad
Ray Enke Ph.D.
Genomics RNA-Seq Analysis Part 3-Sleuth Data Visualization (Updated 11/17), Ray A. Enke, Scott Schumacker
Ray Enke Ph.D.
Supporting Biomedical Research In The Era Of Omics And Precision Medicine, Rolando Garcia-Milian, Denise Hersey, Nathan Rupp
Rolando Garcia-Milian
Bringing Toxicology Into The 21st Century: A Global Call To Action, Troy Seidle, Martin Stephens
Martin Stephens, PhD
Conventional toxicological testing methods are often decades old, costly and low-throughput, with questionable relevance to the human condition. Several of these factors have contributed to a backlog of chemicals that have been inadequately assessed for toxicity. Some authorities have responded to this challenge by implementing large-scale testing programmes. Others have concluded that a paradigm shift in toxicology is warranted. One such call came in 2007 from the United States National Research Council (NRC), which articulated a vision of “21st century toxicology” based predominantly on non-animal techniques. Potential advantages of such an approach include the capacity to examine a far greater …
Analysis Of RNA-Seq Alignments Using DNA Subway Green Line (Computational), Raymond A. Enke
Ray Enke Ph.D.
- Review basic steps of RNA-Seq bioinformatics analysis in DNA Subway Green Line
- View and run basic analytics of RNA-Seq data set in DNA Subway Green Line
BayesMotif: De Novo Protein Sorting Motif Discovery From Impure Datasets, Jianjun Hu, F. Zhang
Jianjun Hu
Background
Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals consists of amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms.
Methods
We formulated the protein sorting motif discovery problem as a classification problem …
HemeBIND: A Novel Method For Heme Binding Residue Prediction By Combining Structural And Sequence Information, R. Liu, Jianjun Hu
Jianjun Hu
Background: Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, an urgent need is to develop a computational method for recognizing these important residues. Results: Here …
Integrative Disease Classification Based On Cross-Platform Microarray Data, C.-C. Liu, Jianjun Hu, M. Kalakrishnan, H. Huang, X. Zhou
Jianjun Hu
Background: Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification. Results: In this study, we tested the feasibility of disease classification by integrating the large amount of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank …
Integrative Missing Value Estimation For Microarray Data, Jianjun Hu, H. Li, M. Waterman, X. Zhou
Jianjun Hu
Background: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in Stanford Microarray Database contain fewer than eight samples. Results: We present the integrative Missing Value Estimation method (iMISS) by incorporating information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference data sets …
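To make the idea of neighbor-gene imputation concrete, the sketch below fills each missing entry with the mean of that sample's values in the k most similar genes, using a single expression matrix. It is a generic nearest-neighbor baseline written for illustration, not the iMISS method: it uses no reference datasets, and the names `knn_impute` and `_distance`, the `None` encoding for missing values, and the root-mean-square distance are all assumptions.

```python
from math import sqrt
from typing import List, Optional

Row = List[Optional[float]]  # one gene's expression across samples; None = missing

def _distance(a: Row, b: Row) -> Optional[float]:
    """Root-mean-square difference over samples observed in both genes."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return None  # no shared samples, so similarity is undefined
    return sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))

def knn_impute(matrix: List[Row], k: int = 2) -> List[Row]:
    """Return a copy of `matrix` with missing entries filled where possible."""
    filled = [row[:] for row in matrix]
    for g, row in enumerate(matrix):
        for s, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other genes that have a value in sample s.
            candidates = []
            for h, other in enumerate(matrix):
                if h == g or other[s] is None:
                    continue
                d = _distance(row, other)
                if d is not None:
                    candidates.append((d, other[s]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[g][s] = sum(v for _, v in nearest) / len(nearest)
    return filled


if __name__ == "__main__":
    data = [
        [1.0, 2.0, None],
        [1.0, 2.0, 3.0],
        [1.1, 2.1, 3.2],
        [9.0, 9.0, 9.0],
    ]
    print(knn_impute(data, k=2)[0])
```

With very few samples per gene, the distances in a sketch like this rest on only a handful of shared values, which is the weakness the abstract attributes to existing methods.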
Library Support For Biomedical Research In The Omics Era: 2014-2015 Report, Rolando Garcia-Milian
Rolando Garcia-Milian
The decreased cost of high-throughput technologies has enabled their use as the main research methods to study biological processes and disorders. In order to understand the relevance of the data generated by these methods, the researcher needs to mine and integrate the enormous amount of biomedical information and knowledge contained in the text of the scientific literature and biomedical databases. Accordingly, the ability to access and examine molecular data should not be restricted to bioinformaticians or those with exceptional computer skills. In May 2014, the Cushing/Whitney Medical Library began to provide end-user bioinformatics support to the biomedical researchers of the Yale …
Introduction To Gene Enrichment Analysis Tools, Rolando Garcia-Milian
Rolando Garcia-Milian
Bioinformatics enrichment tools play an important role in identifying, annotating, and functionally analyzing large lists of genes generated by high-throughput technologies (e.g. microarray, RNA-seq, ChIP-chip). This workshop will provide an overview of the principle, type of enrichments, and the infrastructure of enrichment tools. By using concrete examples, it will also introduce some of the most popular tools for gene enrichment analysis such as DAVID, GSEA, and WebGestalt.
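One common principle behind over-representation analysis is the hypergeometric test: given a background of genes, some of which carry an annotation, how surprising is the number of annotated genes in a list of interest? The sketch below computes that upper-tail probability with the standard library only. It is an illustrative assumption about what such a test can look like; the tools named above each use their own statistics, and this sketch does not reproduce any of them. The function and parameter names are hypothetical.

```python
from math import comb

def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Upper-tail hypergeometric probability P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the annotation of interest
    sample:     number of genes in the list being tested
    hits:       genes in the list that carry the annotation
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    if not (0 <= hits <= min(annotated, sample)):
        raise ValueError("hits must lie between 0 and min(annotated, sample)")

    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, min(annotated, sample) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 10 background genes, 4 annotated; a 3-gene list contains 2 annotated genes.
    print(enrichment_p_value(10, 4, 3, 2))
```

In practice such a p-value is computed for many annotation terms at once, so a multiple-testing correction would normally follow.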
Deciphering The Associations Between Gene Expression And Copy Number Alteration Using A Sparse Double Laplacian Shrinkage Approach, Shuangge Ma
Shuangge Ma
Both gene expression levels (GEs) and copy number alterations (CNAs) have important implications in the development of complex diseases. GEs are partly regulated by CNAs, and much effort has been devoted to understanding their relations. The expression of a gene can be regulated by multiple CNAs, and one CNA can regulate the expression of multiple genes. In addition, multiple GEs (CNAs) can be correlated with each other. The existing methods for associating GEs with CNAs have limitations in deciphering the complex data structures. In this study, we develop a sparse double Laplacian shrinkage approach. It jointly models the effects of …
A Penalized Robust Semiparametric Approach For Gene-Environment Interactions, Shuangge Ma
Shuangge Ma
In genetic and genomic studies, gene-environment (G×E) interactions have important implications. Some of the existing G×E interaction methods are limited by analyzing a small number of G factors at a time, by assuming linear effects of E factors, by assuming no data contamination, and by adopting ineffective selection techniques. In this study, we propose a new approach for identifying important G×E interactions. It jointly models the effects of all E and G factors and their interactions. A partially linear varying coefficient model (PLVCM) is adopted to accommodate possible nonlinear effects of E factors. A rank-based loss function is used to …
Bringing Toxicology Into The 21st Century: A Global Call To Action, Troy Seidle, Martin Stephens
Troy Seidle, PhD
Conventional toxicological testing methods are often decades old, costly and low-throughput, with questionable relevance to the human condition. Several of these factors have contributed to a backlog of chemicals that have been inadequately assessed for toxicity. Some authorities have responded to this challenge by implementing large-scale testing programmes. Others have concluded that a paradigm shift in toxicology is warranted. One such call came in 2007 from the United States National Research Council (NRC), which articulated a vision of “21st century toxicology” based predominantly on non-animal techniques. Potential advantages of such an approach include the capacity to examine a far greater …
SNP-E: A New Method For Multiple Sequence Alignments Analysis And Accurate Single Nucleotide Polymorphism Evaluation, David A. Lightfoot
David A. Lightfoot
Identification of single nucleotide polymorphisms (SNPs) and insertion-deletion mutations is important for discovering the connection between genetic mutations and complex diseases. The objective of this study was to develop a sensitive and accurate computational method for SNP detection among Multiple Sequence Alignments (MSAs) to be run on Microsoft Office Suite™ and Windows™. The SNP-Evaluator was designed to simulate the process of human eye visual change-identification. Analysis of three 82-Kbp genomic loci derived from Sanger sequencing and the corresponding SNPs from 31 genomes from Illumina™ sequencing of soybean (Glycine max L. Merr.) demonstrated that the SNP-E was an effective method …
Penalized Integrative Analysis Of High-Dimensional Omics Data, Shuangge Ma
Shuangge Ma
No abstract provided.
A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer
George K. Thiruvathukal
RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …
Functionally Compensating Coevolving Positions Are Neither Homoplasic Nor Conserved In Clades, Gregory Gloor, Gaurav Tyagi, Dana Abrassart, Andrew Kingston, Andrew Fernandes, Stanley Dunn, Christopher Brandl
Stanley D Dunn
We demonstrated that a pair of positions in phosphoglycerate kinase that score highly by three nonparametric covariation measures are important for function even though the positions can be occupied by aliphatic, aromatic, or charged residues. Examination of these pairs suggested that the majority of the covariation scores could be explained by within-clade conservation. However, an analysis of diversity showed that the conservation within clades of covarying pairs was indistinguishable from pairs of positions that do not covary, thus ruling out both clade conservation and extensive homoplasy as means to identify covarying positions. Mutagenesis showed that the residues in the covarying …
A Cluster-Based Approach For Biological Hypothesis Testing And Its Application, Ahmed Mustafa
Ahmed Mustafa Dr.
No abstract provided.
Using Comparative Genomics For Inquiry-Based Learning To Dissect Virulence Of Escherichia Coli O157:H7 And Yersinia Pestis, David J. Baumler, Lois M. Banta, Kai F. Hung, Jodi A. Schwarz, Eric L. Cabot, Jeremy D. Glasner, Nicole T. Perna
Kai F. Hung
Genomics and bioinformatics are topics of increasing interest in undergraduate biological science curricula. Many existing exercises focus on gene annotation and analysis of a single genome. In this paper, we present two educational modules designed to enable students to learn and apply fundamental concepts in comparative genomics using examples related to bacterial pathogenesis. Students first examine alignments of genomes of Escherichia coli O157:H7 strains isolated from three food-poisoning outbreaks using the multiple-genome alignment tool Mauve. Students investigate conservation of virulence factors using the Mauve viewer and by browsing annotations available at the A Systematic Annotation Package for Community Analysis of …