Big Data Proteogenomics And High Performance Computing: Challenges And Opportunities, 2015 Western Michigan University
Big Data Proteogenomics And High Performance Computing: Challenges And Opportunities, Fahad Saeed
Parallel Computing and Data Science Lab Technical Reports
Proteogenomics is an emerging field of systems biology research at the intersection of proteomics and genomics. Two high-throughput technologies, Mass Spectrometry (MS) for proteomics and Next Generation Sequencing (NGS) machines for genomics, are required to conduct proteogenomics studies. Independently, both MS and NGS technologies are afflicted by a data deluge, which creates problems of storage, transfer, analysis and visualization. Integrating these big data sets (NGS+MS) for proteogenomics studies compounds all of the associated computational problems. Existing sequential algorithms for analyzing these proteogenomics datasets are inadequate for big data, and high performance computing (HPC) solutions are almost non-existent. The purpose of ...
A Machine Learning Approach To Post-Market Surveillance Of Medical Devices, 2015 Yale University
A Machine Learning Approach To Post-Market Surveillance Of Medical Devices, Jonathan Bates, Shu-Xia Li, Craig Parzynski, Ronald Coifman, Harlan Krumholz, Joseph Ross
Yale Day of Data
Post-market surveillance is a collection of processes and activities used by product manufacturers and regulators, such as the U.S. Food and Drug Administration (FDA), to monitor the safety and effectiveness of medical devices once they are available for use “on the market”. These activities are designed to generate information to identify poorly performing devices and other safety problems, accurately characterize real-world device performance and clinical outcomes, and facilitate the development of new devices, or new uses for existing devices. Typically, a device is monitored by comparing adverse events in the exposed population to a matched unexposed population. This research ...
K-Mer Analysis On Developmental And Housekeeping Enhancer Peaks, 2015 Yale University
K-Mer Analysis On Developmental And Housekeeping Enhancer Peaks, Yunsi Yang, Anurag Sethi, Mark Gerstein
Yale Day of Data
The regulation of gene expression involves interaction between transcriptional enhancers and core promoters. However, the separation between developmental and housekeeping gene regulation remains unknown. Here, we present a method to detect whether different core promoters exhibit specificity to certain enhancers within massively parallel assays for enhancer detection. We use k-mers of various lengths (3-8 bp) as sequence features and compare k-mer frequencies between developmental and housekeeping enhancers. This method shows promoter specificity of enhancers in D. melanogaster.
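The k-mer comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequences, and the use of a plain frequency difference as the comparison statistic are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequences, k):
    """Return the relative frequency of every DNA k-mer across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing N or any other non-ACGT symbol.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def frequency_differences(set_a, set_b, k):
    """Per-k-mer frequency difference (set_a minus set_b); an assumed, simple statistic."""
    freq_a = kmer_frequencies(set_a, k)
    freq_b = kmer_frequencies(set_b, k)
    return {kmer: freq_a[kmer] - freq_b[kmer] for kmer in freq_a}


# Hypothetical toy sequences standing in for the two enhancer classes.
developmental = ["ACGTACGT", "ACGACG"]
housekeeping = ["TTTTACGT", "TTTTTT"]
diffs = frequency_differences(developmental, housekeeping, k=3)
most_enriched = max(diffs, key=diffs.get)  # k-mer most enriched in the first set
```

A real analysis would repeat this for each k from 3 to 8 and would likely replace the raw difference with a statistical test.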
Inferring Plastid Metabolic Pathways Within The Nonphotosynthetic Free-Living Green Algal Genus Polytomella, 2015 The University of Western Ontario
Inferring Plastid Metabolic Pathways Within The Nonphotosynthetic Free-Living Green Algal Genus Polytomella, Sara Asmail
Electronic Thesis and Dissertation Repository
The advent of photosynthesis facilitated the evolution of aerobic life on Earth. However, species such as Prototheca wickerhamii and Plasmodium falciparum, among many others, have lost photosynthesis and opted for a free-living/parasitic lifestyle. Despite this loss, these species have retained the plastid for its metabolic pathways, without which they would die. Polytomella is a nonphotosynthetic free-living alga, closely related to the photosynthetic model organism Chlamydomonas reinhardtii, and has been shown to lack a plastid genome. I set out to determine Polytomella plastid metabolic pathways using bioinformatics to look for mRNA and DNA homologous sequences matching pathway enzymes in model ...
Designing For Practice Development In A Social Learning System: Communicating Norms And Vicarious Experience, 2015 University of Texas Health Science Center at Houston
Designing For Practice Development In A Social Learning System: Communicating Norms And Vicarious Experience, Claire Loe
UT SBMI Dissertations (Open Access)
Over the past quarter century, the United States has experienced an increase in demand for health services. Expanded use of community health workers (CHWs) has been identified as a strategic response for more effective distribution of healthcare resources by alleviating pressures on clinical personnel and infusing prevention education into the community-to-clinical care continuum. Expansion of the CHW workforce poses many challenges. For CHWs to effectively reduce costs and pressures on the healthcare system, ‘expansion’ implies not only increasing their numbers, but also assuring a workforce that has the capacity to perform in diverse settings. I propose a theoretical framework for ...
Local Sequence Assembly Reveals A High-Resolution Profile Of Somatic Structural Variations In 97 Cancer Genomes, 2015 University of Massachusetts Medical School
Local Sequence Assembly Reveals A High-Resolution Profile Of Somatic Structural Variations In 97 Cancer Genomes, Jiali Zhuang, Zhiping Weng
Open Access Articles
Genomic structural variations (SVs) are pervasive in many types of cancers. Characterizing their underlying mechanisms and potential molecular consequences is crucial for understanding the basic biology of tumorigenesis. Here, we engineered a local assembly-based algorithm (laSV) that detects SVs with high accuracy from paired-end high-throughput genomic sequencing data and pinpoints their breakpoints at single base-pair resolution. By applying laSV to 97 tumor-normal paired genomic sequencing datasets across six cancer types produced by The Cancer Genome Atlas Research Network, we discovered that non-allelic homologous recombination is the primary mechanism for generating somatic SVs in acute myeloid leukemia. This finding contrasts with ...
Constructos Teóricos En Economía Común Informática, 2015 Universidad Nacional de La Matanza
Constructos Teóricos En Economía Común Informática, Rodrigo Lopez-Pablos
Reviewing elements of community, solidarity, and information economics, fundamental theoretical abstractions are constructed toward a proto-explanation of the role of information and time in explaining the digital and conventional economic fact. In info-aggregate terms, informational emission is situated as the individual and collective micro- and macro-informatic ontological expression of being: the safeguarding of generational civilizational infodiversity; next, an argument is made about the computational cognitive philosophical fallacy behind a mistaken conceptual theoretical presumption in the study and application of artificial logics: their real potential for the generation of hybrid knowledge and the creation of unprecedented knowledge ...
Collecting Diverse Microorganisms From Rover Spacecraft, 2015 Chicago State University
Collecting Diverse Microorganisms From Rover Spacecraft, Jennifer I. Jacobs, Arianna Jefferson, Heidi Aronson, James Tan, Wayne Schubert, Parag Vaishampayan
STAR (STEM Teacher and Researcher) Program Posters
The Planetary Protection discipline at NASA’s Jet Propulsion Laboratory develops and implements procedures to prevent both forward and backward contamination between the Earth and solar system bodies. However, there will always be some microorganisms that will be resistant to the strictest of sterilization methods. In order to understand the microorganisms found on spacecraft during assembly, and to rapidly identify them, a mass spectrometry approach was developed. As an experimental approach, a custom database was created for a subset of microorganisms in the Planetary Protection Archive. In order to make the database as accurate and efficient as possible, several different procedures ...
Quantitative And Functional Analysis Pipeline For Label-Free Metaproteomics Data And Its Applications, 2015 University of Tennessee - Knoxville
Quantitative And Functional Analysis Pipeline For Label-Free Metaproteomics Data And Its Applications, Lang Ho Lee
Since the large-scale metaproteome was first reported in 2005, metaproteomics has advanced at a tremendous rate in both its quantitative and qualitative metrics. Furthermore, metaproteomics is now being applied as a general tool in microbial ecology in a large variety of environmental studies. Though metaproteomics is becoming a useful and even a standard tool for the microbial ecologist, standardized bioinformatics pipelines are not readily available. Therefore, we developed a quantitative and functional analysis pipeline for metaproteomics (QFAM) to help analyze large and complicated metaproteomics data in a robust and timely fashion with outputs designed to be simple and clearly understood by ...
Development Of A Comprehensive Massively Parallel Sequencing Panel Of Single Nucleotide Polymorphism And Short Tandem Repeat Markers For Human Identification, 2015 University of North Texas Health Science Center at Fort Worth
Development Of A Comprehensive Massively Parallel Sequencing Panel Of Single Nucleotide Polymorphism And Short Tandem Repeat Markers For Human Identification, David H. Warshauer
Theses and Dissertations
Massively parallel sequencing (MPS) technologies allow for the detection of an unparalleled amount of genetic information with unprecedented speed and relative ease. These qualities make the technology desirable for generating DNA profiles that may be uploaded into forensic offender, arrestee, and family reference database files. This doctoral dissertation research was conducted under the hypothesis that MPS, with its exquisitely high throughput, can provide a system whereby reference samples can be typed for a large battery of markers, providing more discrimination power for forensic DNA typing and offering increased opportunities to develop investigative leads. The design and implementation of large marker ...
A Parallel Algorithm For Compression Of Big Next-Generation Sequencing Datasets, 2015 Western Michigan University
A Parallel Algorithm For Compression Of Big Next-Generation Sequencing Datasets, Sandino N. Vargas Perez, Fahad Saeed
Parallel Computing and Data Science Lab Technical Reports
With the advent of high-throughput next-generation sequencing (NGS) techniques, the amount of data being generated presents challenges including storage, analysis and transport of huge datasets. One solution to storage and transmission of data is compression using specialized compression algorithms. However, these specialized algorithms scale poorly with increasing dataset size, and the best available solutions can take hours to compress gigabytes of data. In this paper we introduce paraDSRC, a parallel implementation of the DSRC algorithm using a message passing model that reduces the compression time complexity by a factor of O(1/p). Our experimental results ...
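The idea of dividing compression work across workers can be sketched as follows. This is not paraDSRC: it swaps in zlib for the DSRC algorithm and a thread pool for message passing, and it assumes that p in O(1/p) denotes the number of workers. The function names and the chunking scheme are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split a byte string into at most n_chunks contiguous pieces of near-equal size."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_compress(data, workers=4):
    """Compress each chunk independently; zlib is a stand-in for DSRC here."""
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so the output can be decompressed in sequence.
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(blobs):
    """Reverse parallel_compress by decompressing and concatenating the chunks."""
    return b"".join(zlib.decompress(blob) for blob in blobs)


# Hypothetical FASTQ-like payload used only to exercise the functions.
payload = b"@read1\nACGT\n+\nIIII\n" * 1000
blobs = parallel_compress(payload, workers=4)
restored = decompress_chunks(blobs)
```

Because each chunk is compressed without reference to the others, the ideal wall-clock time falls roughly in proportion to the number of workers, at the cost of a somewhat lower compression ratio than a single pass over the whole input.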
Darwin Core Archive File, 2015 Eastern Illinois University
Darwin Core Archive File, Stover-Ebinger Herbarium, Eastern Illinois University
Darwin Core Archive Download
The ZIP file contains occurrences.csv, identifications.csv, and images.csv. The meta.xml document describes the content. Fields within the occurrences.csv file are defined by the Darwin Core exchange standard.
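An archive with this layout can be read with the Python standard library alone. The sketch below assumes that occurrences.csv is UTF-8, comma-delimited, and has a header row; in a real archive those details are declared in meta.xml and should be taken from there. The function name and the sample record are hypothetical.

```python
import csv
import io
import zipfile


def read_occurrences(archive, member="occurrences.csv"):
    """Return the rows of the occurrences table as dicts keyed by column header.

    `archive` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            # Assumed encoding and delimiter; meta.xml is the authoritative source.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a tiny in-memory archive with one hypothetical record to show usage.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("occurrences.csv", "id,scientificName\n1,Quercus alba\n")
buffer.seek(0)
rows = read_occurrences(buffer)
```

For a downloaded file, the same function can be called with the path to the ZIP in place of the in-memory buffer.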
Domain Specific Document Retrieval Framework For Real-Time Social Health Data, 2015 Wright State University - Main Campus
Domain Specific Document Retrieval Framework For Real-Time Social Health Data, Swapnil Soni
With the advent of web search and microblogging, the percentage of Online Health Information Seekers (OHIS) using these online services to share and seek real-time health information has increased exponentially. OHIS use web search engines or microblogging search services to seek out the latest, relevant, and reliable health information. When OHIS turn to microblogging search services to search real-time content, trends, breaking news, etc., the search results are not promising. Two major challenges exist in current microblogging search engines: they rely on keyword-based techniques, and their results do not contain real-time information. To address these challenges ...
Evaluating A Potential Commercial Tool For Healthcare Application For People With Dementia, 2015 Wright State University - Main Campus
Evaluating A Potential Commercial Tool For Healthcare Application For People With Dementia, Tanvi Banerjee, Pramod Anantharam, William L. Romine, Larry Wayne Lawhorne
The widespread use of smartphones and sensors has made physiology, environment, and public health notifications amenable to continuous monitoring. Personalized digital health and patient empowerment can become a reality only if the complex multisensory and multimodal data is processed within the patient context, converting relevant medical knowledge into actionable information for better and timely decisions. We apply these principles in the healthcare domain of dementia. Specifically, in this study we validate one of our sensor platforms to ascertain whether it will be suitable for detecting physiological changes that may help us detect changes in people with dementia. This study shows ...
Ferret: A Sentence-Based Literature Scanning System, 2015 University of Iowa
Ferret: A Sentence-Based Literature Scanning System, Padmini Srinivasan, Xiao-Ning Zhang, Roxane Bouten, Caren Chang
Department of Computer Science Publications
The rapid pace of bioscience research makes it very challenging to track relevant articles in one’s area of interest. MEDLINE, a primary source for biomedical literature, offers access to more than 20 million citations with three-quarters of a million new ones added each year. Thus it is not surprising to see active research in building new document retrieval and sentence retrieval systems. We present Ferret, a prototype retrieval system, designed to retrieve and rank sentences (and their documents) conveying gene-centric relationships of interest to a scientist. The prototype has several features. For example, it is designed to handle ...
"Time For Dabs": Analyzing Twitter Data On Butane Hash Oil Use, 2015 Wright State University - Main Campus
"Time For Dabs": Analyzing Twitter Data On Butane Hash Oil Use, Raminta Daniulaityte, Robert G. Carlson, Farahnaz Golroo, Sanjaya Wijeratne, Edward W. Boyer, Silvia S. Martins, Ramzi W. Nahhas, Amit P. Sheth
No abstract provided.
Secretion Of Heat-Labile Enterotoxin By Porcine-Origin Enterotoxigenic Escherichia Coli And Relation To Virulence, 2015 University of Nebraska-Lincoln
Secretion Of Heat-Labile Enterotoxin By Porcine-Origin Enterotoxigenic Escherichia Coli And Relation To Virulence, Prageeth R. Wijemanne
Dissertations & Theses in Veterinary and Biomedical Science
Heat-labile enterotoxin (LT) is an important virulence factor secreted by some strains of porcine-origin enterotoxigenic Escherichia coli (pETEC). The prototypic human-origin strain H10407 secretes LT via a type II secretion system (T2SS), but its presence or importance in pETEC has not been established. Exposure of pETEC to glucose has been shown to result in different secretion levels of LT. Furthermore, the relationship between the level of LT secreted and the virulence potential of the respective pETEC strain has not been established. To determine the relationship between the capacity to secrete LT and virulence in wild-type (WT) pETEC, 16 strains isolated ...
Library Support For Biomedical Research In The Omics Era: 2014-2015 Report, 2015 Yale University
Library Support For Biomedical Research In The Omics Era: 2014-2015 Report, Rolando Garcia-Milian
The decreased cost of high-throughput technologies has enabled their use as the main research methods to study biological processes and disorders. In order to understand the relevance of the data generated by these methods, the researcher needs to mine and integrate the enormous amount of biomedical information and knowledge contained in the text of the scientific literature and biomedical databases. Accordingly, the ability to access and examine molecular data should not be restricted to bioinformaticians or those with exceptional computer skills. In May 2014, the Cushing/Whitney Medical Library began to provide end-user bioinformatics support to the biomedical researchers of the ...
Entity Recommendations Using Hierarchical Knowledge Bases, 2015 Wright State University - Main Campus
Entity Recommendations Using Hierarchical Knowledge Bases, Siva Kumar Cheekula, Pavan Kapanipathi, Derek Doran, Prateek Jain, Amit P. Sheth
Recent developments in recommendation algorithms have focused on integrating Linked Open Data to augment traditional algorithms with background knowledge. These developments recognize that the integration of Linked Open Data may offer better performance, particularly in cold start cases. In this paper, we explore if and how a specific type of Linked Open Data, namely hierarchical knowledge, may be utilized for recommendation systems. We propose a content-based recommendation approach that adapts a spreading activation algorithm over the DBpedia category structure to identify entities of interest to the user. Evaluation of the algorithm over the Movielens dataset demonstrates that our method yields ...
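Spreading activation over an entity-category graph can be illustrated with a generic sketch. It is not the authors' adaptation: the decay factor, the hop limit, the equal split of activation among neighbours, and the toy graph are all assumptions chosen to keep the example small.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, max_hops=2):
    """Propagate activation from seed nodes to their neighbours.

    graph: dict mapping each node to a list of neighbouring nodes.
    seeds: dict mapping seed nodes to their initial activation.
    Each hop passes `decay` times a node's incoming activation, split equally
    among its neighbours (an assumed weighting scheme).
    """
    activation = defaultdict(float)
    for node, value in seeds.items():
        activation[node] += value
    frontier = dict(seeds)
    for _ in range(max_hops):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            neighbours = graph.get(node, [])
            if not neighbours:
                continue
            share = value * decay / len(neighbours)
            for neighbour in neighbours:
                next_frontier[neighbour] += share
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)


# Hypothetical toy graph linking films to categories in both directions.
graph = {
    "Alien": ["Science fiction films"],
    "Blade Runner": ["Science fiction films"],
    "Amelie": ["Romantic comedies"],
    "Science fiction films": ["Alien", "Blade Runner"],
    "Romantic comedies": ["Amelie"],
}
scores = spread_activation(graph, {"Alien": 1.0})
# Entities the user has not seen, ranked by the activation that reached them.
candidates = sorted(
    (n for n in scores if n in ("Blade Runner", "Amelie")),
    key=scores.get,
    reverse=True,
)
```

In this toy run, activation from a film the user liked reaches another film through their shared category, while a film in an unrelated category receives none and is never proposed.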
A Versatile Reporter System For CRISPR-Mediated Chromosomal Rearrangements, 2015 Tongji University
A Versatile Reporter System For CRISPR-Mediated Chromosomal Rearrangements, Yingxiang Li, Angela I. Park, Haiwei Mou, Cansu Colpan, Aizhan Bizhanova, Elliot Akama-Garren, Nik Joshi, Eric A. Hendrickson, David Feldser, Hao Yin, Daniel G. Anderson, Tyler Jacks, Zhiping Weng, Wen Xue
Open Access Articles
Although chromosomal deletions and inversions are important in cancer, conventional methods for detecting DNA rearrangements require laborious indirect assays. Here we develop fluorescent reporters to rapidly quantify CRISPR/Cas9-mediated deletions and inversions. We find that inversion depends on the non-homologous end-joining enzyme LIG4. We also engineer deletions and inversions for a 50 kb Pten genomic region in mouse liver. We discover diverse yet sequence-specific indels at the rearrangement fusion sites. Moreover, we detect Cas9 cleavage at the fourth nucleotide on the non-complementary strand, leading to staggered instead of blunt DNA breaks. These reporters allow mechanisms of chromosomal rearrangements to be ...