Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Articles 1 - 5 of 5

Full-Text Articles in Life Sciences

Discerning Novel Splice Junctions Derived From RNA-Seq Alignment: A Deep Learning Approach, Yi Zhang, Xinan Liu, James N. MacLeod, Jinze Liu, Dec 2018

Computer Science Faculty Publications

Background: Exon splicing is a regulated cellular process in the transcription of protein-coding genes. Technological advancements and cost reductions in RNA sequencing have made quantitative and qualitative assessments of the transcriptome both possible and widely available. RNA-seq provides unprecedented resolution to identify gene structures and resolve the diversity of splicing variants. However, currently available ab initio aligners are vulnerable to spurious alignments due to random sequence matches and sample-reference genome discordance. As a consequence, a significant number of false-positive exon junction predictions are introduced, which further confound downstream analyses of splice variant discovery and abundance estimation.

Results: …


SeqOthello: Querying RNA-Seq Experiments At Scale, Ye Yu, Jinpeng Liu, Xinan Liu, Yi Zhang, Eamonn Magner, Erik Lehnert, Chen Qian, Jinze Liu, Oct 2018

Computer Science Faculty Publications

We present SeqOthello, an ultra-fast and memory-efficient indexing structure to support arbitrary sequence query against large collections of RNA-seq experiments. SeqOthello takes only 5 min and 19.1 GB of memory to conduct a global survey of 11,658 fusion events against 10,113 TCGA Pan-Cancer RNA-seq datasets. The query recovers 92.7% of tier-1 fusions curated by the TCGA Fusion Gene Database and reveals 270 novel occurrences, all of which are tumor-specific. By providing a reference-free, alignment-free, and parameter-free sequence search system, SeqOthello will enable large-scale integrative studies using sequence-level data, an undertaking not previously practicable for many individual labs.
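To give a feel for what an alignment-free sequence query over many experiments involves, here is a minimal sketch that assumes a plain dictionary from k-mers to experiment names. It is not SeqOthello's indexing structure, which the abstract does not detail; the function names, the experiment names, and the choice of k are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(experiments, k):
    """Map each k-mer to the set of experiment names whose reads contain it.

    `experiments` is a hypothetical input: {experiment name: list of reads}.
    """
    index = defaultdict(set)
    for name, reads in experiments.items():
        for read in reads:
            for km in kmers(read, k):
                index[km].add(name)
    return index


def query(index, seq, k):
    """Return, per experiment, the fraction of the query's k-mers it contains."""
    query_kmers = list(kmers(seq, k))
    hits = defaultdict(int)
    for km in query_kmers:
        # .get avoids inserting missing k-mers into the defaultdict
        for name in index.get(km, ()):
            hits[name] += 1
    return {name: count / len(query_kmers) for name, count in hits.items()}


# Toy usage with made-up reads (assumption: real inputs would be FASTQ files).
experiments = {"exp_a": ["ACGTACGT"], "exp_b": ["TTTTACGA"]}
index = build_index(experiments, k=4)
print(query(index, "ACGTAC", k=4))
```

A naive dictionary like this grows with the number of distinct k-mers and would not scale to thousands of datasets, which is the kind of cost a compact index is meant to avoid.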


Transcriptional Response Of Honey Bee (Apis mellifera) To Differential Nutritional Status And Nosema Infection, Farida Azzouz-Olden, Arthur G. Hunt, Gloria DeGrandi-Hoffman, Aug 2018

Plant and Soil Sciences Faculty Publications

Background: Bees are confronting several environmental challenges, including the intermingled effects of malnutrition and disease. Intuitively, pollen is the healthiest nutritional choice; however, commercial substitutes, such as Bee-Pro and MegaBee, are widely used. Herein we examined how feeding natural and artificial diets shapes transcription in the abdomen of the honey bee, and how transcription shifts in combination with Nosema parasitism.

Results: Gene ontology enrichment revealed that, compared with poor diet (carbohydrates [C]), bees fed pollen (P > C), Bee-Pro (B > C), and MegaBee (M > C) showed a broad upregulation of metabolic processes, especially lipids; however, pollen feeding promoted more functions, and …


iMapSplice: Alleviating Reference Bias Through Personalized RNA-Seq Alignment, Xinan Liu, James N. MacLeod, Jinze Liu, Aug 2018

Computer Science Faculty Publications

Genomic variants in both coding and non-coding sequences can have functionally important and sometimes deleterious effects on exon splicing of gene transcripts. For transcriptome profiling using RNA-seq, the accurate alignment of reads across exon junctions is a critical step. Existing algorithms that utilize a standard reference genome as a template sometimes have difficulty in mapping reads that carry genomic variants. These problems can lead to allelic ratio biases and the failure to detect splice variants created by splice site polymorphisms. To improve RNA-seq read alignment, we have developed a novel approach called iMapSplice that enables personalized mRNA transcriptome profiling. The …


Novel Computational Methods For Sequencing Data Analysis: Mapping, Query, And Classification, Xinan Liu Jan 2018

Theses and Dissertations--Computer Science

Over the past decade, the evolution of next-generation sequencing technology has considerably advanced genomics research. As a consequence, fast and accurate computational methods are needed to analyze large datasets in different applications. The research presented in this dissertation focuses on three areas: RNA-seq read mapping, large-scale data query, and metagenomics sequence classification.

A critical step of RNA-seq data analysis is to map the RNA-seq reads onto a reference genome. This dissertation presents a novel splice alignment tool, MapSplice3. It achieves high read alignment and base mapping yields and is able to detect splice junctions, gene fusions, and circular …