Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 30 of 30
Full-Text Articles in Life Sciences
Saccharomyces Genome Database & UniProt Bioinformatics Analysis, Ray A. Enke
Ray Enke Ph.D.
Fast And Space-Efficient Location Of Heavy Or Dense Segments In Run-Length Encoded Sequences, Ronald I. Greenberg
Ronald Greenberg
This paper considers several variations of an optimization problem with potential applications in such areas as biomolecular sequence analysis and image processing. Given a sequence of items, each with a weight and a length, the goal is to find a subsequence of consecutive items of optimal value, where value is either total weight or total weight divided by total length. There may also be a specified lower and/or upper bound on the acceptable length of subsequences. This paper shows that all the variations of the problem are solvable in linear time and space even with non-uniform item lengths and divisible …
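As a rough illustration of the simplest variant described above (value = total weight, no length bounds), the classic linear-time maximum-sum scan can be sketched as follows. This is a generic textbook technique, not the paper's algorithm: the function name `heaviest_segment` is hypothetical, and item lengths, length bounds, the density objective, and run-length encoding are all left out.

```python
from typing import List, Tuple


def heaviest_segment(weights: List[float]) -> Tuple[float, int, int]:
    """Return (total_weight, start, end) for the heaviest non-empty run
    of consecutive items; `end` is inclusive.

    Assumption: every item counts equally toward length, so only the
    weights matter. Runs in O(n) time and O(1) extra space.
    """
    if not weights:
        raise ValueError("need at least one item")

    best_sum, best_start, best_end = weights[0], 0, 0
    cur_sum, cur_start = weights[0], 0

    for i in range(1, len(weights)):
        if cur_sum < 0:
            # A negative running total can only hurt; restart at item i.
            cur_sum, cur_start = weights[i], i
        else:
            cur_sum += weights[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i

    return best_sum, best_start, best_end


if __name__ == "__main__":
    print(heaviest_segment([2, -3, 4, -1, 2, -5, 3]))
```

The density objective (total weight divided by total length) and the length-bounded variants need more machinery than this scan provides.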
A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of HIV-1, Steven Reisman, Thomas Hatzopoulous, Konstantin Läufer, George K. Thiruvathukal, Catherine Putonti
Konstantin Läufer
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …
A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer
Konstantin Läufer
RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …
A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of HIV-1, Steven Reisman, Thomas Hatzopoulous, Konstantin Läufer, George K. Thiruvathukal, Catherine Putonti
Catherine Putonti
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …
A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer
Catherine Putonti
RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …
Using Phylogenetically-Informed Annotation (PIA) To Search For Light-Interacting Genes In Transcriptomes From Non-Model Organisms, Daniel I. Speiser, M. Sabrina Pankey, Alexander K. Zaharoff, Barbara A. Battelle, Heather D. Bracken-Grissom, Jesse W. Breinholt, Seth M. Bybee, Thomas W. Cronin, Anders Garm, Annie R. Lindgren, Nipam H. Patel, Megan L. Porter, Meredith E. Protas, Anja S. Rivera, Jeanne M. Serb, Kirk S. Zigler, Keith A. Crandall, Todd H. Oakley
Meredith Protas
Background: Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to …
A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of HIV-1, Steven Reisman, Thomas Hatzopoulous, Konstantin Läufer, George K. Thiruvathukal, Catherine Putonti
George K. Thiruvathukal
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …
Statistical Contributions To Bioinformatics: Design, Modeling, Structure Learning, And Integration, Jeffrey S. Morris, Veera Baladandayuthapani
Jeffrey S. Morris
Using RStudio For Manipulating And Visualizing Data (Updated 11/17), Ray A. Enke, Bejan A. Rasoul
Ray Enke Ph.D.
Genomics RNA-Seq Analysis Part 2: Kallisto Indexing And Quantification (Updated 11/17), Ray A. Enke, Melika Rahmani-Mofrad
Ray Enke Ph.D.
Genomics RNA-Seq Analysis Part 3-Sleuth Data Visualization (Updated 11/17), Ray A. Enke, Scott Schumacker
Ray Enke Ph.D.
Supporting Biomedical Research In The Era Of Omics And Precision Medicine, Rolando Garcia-Milian, Denise Hersey, Nathan Rupp
Rolando Garcia-Milian
Bringing Toxicology Into The 21st Century: A Global Call To Action, Troy Seidle, Martin Stephens
Martin Stephens, PhD
Conventional toxicological testing methods are often decades old, costly and low-throughput, with questionable relevance to the human condition. Several of these factors have contributed to a backlog of chemicals that have been inadequately assessed for toxicity. Some authorities have responded to this challenge by implementing large-scale testing programmes. Others have concluded that a paradigm shift in toxicology is warranted. One such call came in 2007 from the United States National Research Council (NRC), which articulated a vision of “21st century toxicology” based predominantly on non-animal techniques. Potential advantages of such an approach include the capacity to examine a far greater …
Analysis Of RNA-Seq Alignments Using DNA Subway Green Line (Computational), Raymond A. Enke
Ray Enke Ph.D.
- Review the basic steps of RNA-Seq bioinformatics analysis in DNA Subway Green Line
- View and run basic analytics on an RNA-Seq data set in DNA Subway Green Line
BayesMotif: De Novo Protein Sorting Motif Discovery From Impure Datasets, Jianjun Hu, F. Zhang
Jianjun Hu
Background
Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals is amino acid sub-sequences, usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms.
Methods
We formulated the protein sorting motif discovery problem as a classification problem …
HemeBIND: A Novel Method For Heme Binding Residue Prediction By Combining Structural And Sequence Information, R. Liu, Jianjun Hu
Jianjun Hu
Background: Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, an urgent need is to develop a computational method for recognizing these important residues. Results: Here …
Integrative Disease Classification Based On Cross-Platform Microarray Data, C.-C. Liu, Jianjun Hu, M. Kalakrishnan, H. Huang, X. Zhou
Jianjun Hu
Background: Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification. Results: In this study, we tested the feasibility of disease classification by integrating the large number of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank …
Integrative Missing Value Estimation For Microarray Data, Jianjun Hu, H. Li, M. Waterman, X. Zhou
Jianjun Hu
Background: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples. Results: We present the integrative Missing Value Estimation method (iMISS) by incorporating information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference data sets …
Library Support For Biomedical Research In The Omics Era: 2014-2015 Report, Rolando Garcia-Milian
Rolando Garcia-Milian
The decreased cost of high-throughput technologies has enabled their use as the main research methods for studying biological processes and disorders. To understand the relevance of the data generated by these methods, the researcher needs to mine and integrate the enormous amount of biomedical information and knowledge contained in the text of the scientific literature and biomedical databases. Accordingly, the ability to access and examine molecular data should not be restricted to bioinformaticians or those with exceptional computer skills. In May 2014, the Cushing/Whitney Medical Library began to provide end-user bioinformatics support to the biomedical researchers of the Yale …
Deciphering The Associations Between Gene Expression And Copy Number Alteration Using A Sparse Double Laplacian Shrinkage Approach, Shuangge Ma
Shuangge Ma
Both gene expression levels (GEs) and copy number alterations (CNAs) have important implications in the development of complex diseases. GEs are partly regulated by CNAs, and much effort has been devoted to understanding their relations. The expression of a gene can be regulated by multiple CNAs, and one CNA can regulate the expression of multiple genes. In addition, multiple GEs (CNAs) can be correlated with each other. The existing methods for associating GEs with CNAs have limitations in deciphering the complex data structures. In this study, we develop a sparse double Laplacian shrinkage approach. It jointly models the effects of …
A Penalized Robust Semiparametric Approach For Gene-Environment Interactions, Shuangge Ma
Shuangge Ma
In genetic and genomic studies, gene-environment (G×E) interactions have important implications. Some of the existing G×E interaction methods are limited by analyzing a small number of G factors at a time, by assuming linear effects of E factors, by assuming no data contamination, and by adopting ineffective selection techniques. In this study, we propose a new approach for identifying important G×E interactions. It jointly models the effects of all E and G factors and their interactions. A partially linear varying coefficient model (PLVCM) is adopted to accommodate possible nonlinear effects of E factors. A rank-based loss function is used to …
Bringing Toxicology Into The 21st Century: A Global Call To Action, Troy Seidle, Martin Stephens
Troy Seidle, PhD
Conventional toxicological testing methods are often decades old, costly and low-throughput, with questionable relevance to the human condition. Several of these factors have contributed to a backlog of chemicals that have been inadequately assessed for toxicity. Some authorities have responded to this challenge by implementing large-scale testing programmes. Others have concluded that a paradigm shift in toxicology is warranted. One such call came in 2007 from the United States National Research Council (NRC), which articulated a vision of “21st century toxicology” based predominantly on non-animal techniques. Potential advantages of such an approach include the capacity to examine a far greater …
Penalized Integrative Analysis Of High-Dimensional Omics Data, Shuangge Ma
Shuangge Ma
No abstract provided.
A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of HIV-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer
George K. Thiruvathukal
RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …
The Evolution Of Cooperation: How Patience Matters, Atin Basu Choudhary, Raja Mazumder, Vahan Simoyan
Atin Basu Choudhary
The evolution of cooperation has been the focus of much attention from evolutionary game theorists. Conventional game theorists often cite the Folk Theorem to suggest that cooperation is very likely as long as people are patient. However, experimental and real-world evidence for the Folk Theorem has been sparse. We investigate whether cooperation can evolve endogenously in a population where people have different patience levels. We motivate our model by asking the following question: why don’t biologists cooperate with each other by contributing to biological databases? We apply the Folk Theorem from conventional game theory and assume that …
Word Sense Disambiguation In Biomedical Ontologies With Term Co-Occurrence Analysis And Document Clustering, Bill Andreopoulos, Dimitra Alexopoulou, Michael Schroeder
William B. Andreopoulos
Mutual Information Without The Influence Of Phylogeny Or Entropy Dramatically Improves Residue Contact Prediction, Stanley Dunn, Lindi Wahl, Gregory Gloor
Stanley D Dunn
Motivation: Compensating alterations during the evolution of protein families give rise to coevolving positions that contain important structural and functional information. However, a high background composed of random noise and phylogenetic components interferes with the identification of coevolving positions.
Results: We have developed a rapid, simple and general method based on information theory that accurately estimates the level of background mutual information for each pair of positions in a given protein family. Removal of this background results in a metric, MIp, that correctly identifies substantially more coevolving positions in protein families than any existing method. A significant fraction of these …
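The idea of subtracting a per-pair background from mutual information can be sketched as below. The excerpt does not state the formula, so this sketch assumes the average product correction (APC) usually associated with MIp: the background for positions a and b is the product of each column's mean MI divided by the overall mean MI. The function names and the toy alignment are hypothetical, and gaps and low-count corrections are ignored.

```python
import math
from collections import Counter
from typing import List


def mutual_information(col_a: str, col_b: str) -> float:
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (x, y), c in count_ab.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((count_a[x] / n) * (count_b[y] / n)))
    return mi


def mip_matrix(alignment: List[str]) -> List[List[float]]:
    """MI minus an assumed APC background for every pair of columns."""
    width = len(alignment[0])
    if width < 2:
        raise ValueError("need at least two columns")
    cols = ["".join(seq[i] for seq in alignment) for i in range(width)]

    # Raw pairwise mutual information (symmetric, zero diagonal).
    mi = [[0.0] * width for _ in range(width)]
    for a in range(width):
        for b in range(a + 1, width):
            mi[a][b] = mi[b][a] = mutual_information(cols[a], cols[b])

    # Mean MI of each column with all others, and the overall mean.
    col_mean = [sum(mi[a][b] for b in range(width) if b != a) / (width - 1)
                for a in range(width)]
    overall = sum(col_mean) / width

    mip = [[0.0] * width for _ in range(width)]
    for a in range(width):
        for b in range(width):
            if a != b:
                apc = col_mean[a] * col_mean[b] / overall if overall > 0 else 0.0
                mip[a][b] = mi[a][b] - apc
    return mip


if __name__ == "__main__":
    toy = ["ACA", "ACC", "GTA", "GTC"]  # columns 0 and 1 covary perfectly
    for row in mip_matrix(toy):
        print([round(v, 3) for v in row])
```

In the toy alignment the covarying pair (columns 0 and 1) keeps a positive score after correction, while pairs involving the independent third column score zero.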
Finding Molecular Complexes Through Multiple Layer Clustering Of Protein Interaction Networks, Bill Andreopoulos, Aijun An, Xiangji Huang, Xiaogang Wang
William B. Andreopoulos
Bi-Level Clustering Of Mixed Categorical And Numerical Biomedical Data, Bill Andreopoulos, Aijun An, Xiaogang Wang
William B. Andreopoulos