Apply Data Clustering To Gene Expression Data, 2015 California State University, San Bernardino
Apply Data Clustering To Gene Expression Data, Abdullah Jameel Abualhamayl Mr.
Electronic Theses, Projects, and Dissertations
Data clustering plays an important role in effective analysis of gene expression. Although DNA microarray technology facilitates expression monitoring, several challenges arise when dealing with gene expression datasets. Some of these challenges are the enormous number of genes, the dimensionality of the data, and the change of data over time. The genetic groups which are biologically interlinked can be identified through clustering. This project aims to clarify the steps to apply clustering analysis of genes involved in a published dataset. The methodology for this project includes the selection of the dataset representation, the selection of gene datasets, Similarity Matrix Selection ...
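The similarity-matrix step mentioned in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure and a simple threshold-based single-linkage grouping; both choices, along with the gene names and expression values, are hypothetical and are not taken from the project itself.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / sqrt(var_x * var_y)


def similarity_matrix(profiles):
    """Nested dict holding the pairwise similarity of every gene pair."""
    genes = list(profiles)
    return {g: {h: pearson(profiles[g], profiles[h]) for h in genes} for g in genes}


def threshold_clusters(profiles, threshold):
    """Single-linkage grouping: genes are joined when similarity >= threshold."""
    sim = similarity_matrix(profiles)
    unassigned = list(profiles)
    clusters = []
    while unassigned:
        cluster = [unassigned.pop(0)]
        frontier = list(cluster)
        while frontier:
            g = frontier.pop()
            linked = [h for h in unassigned if sim[g][h] >= threshold]
            for h in linked:
                unassigned.remove(h)
            cluster.extend(linked)
            frontier.extend(linked)
        clusters.append(sorted(cluster))
    return clusters


# Hypothetical expression profiles (one value per condition or time point).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}
clusters = threshold_clusters(expression, threshold=0.9)
print(clusters)
```

In this toy input the two rising profiles and the two falling profiles end up in separate groups. A different similarity measure or linkage rule would be a drop-in change to `pearson` or `threshold_clusters`.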
Satpdb: A Database Of Structurally Annotated Therapeutic Peptides, 2015 Institute of Microbial Technology, IMTECH
Satpdb: A Database Of Structurally Annotated Therapeutic Peptides, Sandeep Singh
SATPdb (http://crdd.osdd.net/raghava/satpdb/) is a database of structurally annotated therapeutic peptides, curated from 22 public domain peptide databases/datasets, including 9 of our own. The current version holds 19192 unique experimentally validated therapeutic peptide sequences between 2 and 50 amino acids in length. It covers peptides having natural, non-natural and modified residues. These peptides were systematically grouped into 10 categories based on their major function or therapeutic property, such as 1099 anticancer, 10585 antimicrobial, 1642 drug delivery and 1698 antihypertensive peptides. We assigned or annotated the structures of these therapeutic peptides using structural databases (Protein Data Bank) and ...
Bacteriophages Isolated From Lake Michigan Demonstrate Broad Host-Range Across Several Bacterial Phyla, 2015 Loyola University Chicago
Bacteriophages Isolated From Lake Michigan Demonstrate Broad Host-Range Across Several Bacterial Phyla, Kema Malki, Alex Kula, Katherine Bruder, Emily Sible, Thomas Hatzopoulos, Stephanie Steidel, Siobhan C. Watkins, Catherine Putonti
Biology: Faculty Publications and Other Works
The study of bacteriophages continues to generate key information about microbial interactions in the environment. Many phenotypic characteristics of bacteriophages cannot be examined by sequencing alone, further highlighting the necessity for isolation and examination of phages from environmental samples. While much of our current knowledge base has been generated by the study of marine phages, freshwater viruses are understudied in comparison. Our group has previously conducted metagenomics-based studies of samples collected from Lake Michigan; the data presented in this study relate to four phages that were extracted from the same samples.
Four phages were extracted from Lake Michigan on ...
A Survey Of Big Data Research, 2015 University of Massachusetts Medical School
A Survey Of Big Data Research, Hua (Julia) Fang, Zhaoyang Zhang, Chanpaul Jin Wang, Mahmoud Daneshmand, Chonggang Wang, Honggang Wang
Quantitative Health Sciences Publications and Presentations
Big data create value for business and research, but pose significant challenges in terms of networking, storage, management, analytics, and ethics. Multidisciplinary collaborations among engineers, computer scientists, statisticians, and social scientists are needed to tackle, discover, and understand big data. This survey presents an overview of big data initiatives, technologies, and research in industry and academia, and discusses challenges and potential solutions.
Evolution Of Mobile Promoters In Prokaryotic Genomes., 2015 The University of Western Ontario
Evolution Of Mobile Promoters In Prokaryotic Genomes., Mahnaz Rabbani
Electronic Thesis and Dissertation Repository
Mobile genetic elements are important factors in evolution, and greatly influence the structure of genomes, facilitating the development of new adaptive characteristics. The dynamics of these mobile elements can be described using various mathematical and statistical models. In this thesis, we focus on a specific category of mobile genetic elements, i.e. mobile promoters, which are mobile regions of DNA that initiate the transcription of genes. We present a class of mathematical models for the evolution of mobile promoters in prokaryotic genomes, based on data obtained from available sequenced genomes. Our novel location-based model incorporates two biologically meaningful regions of ...
Big Data Proteogenomics And High Performance Computing: Challenges And Opportunities, 2015 Western Michigan University
Big Data Proteogenomics And High Performance Computing: Challenges And Opportunities, Fahad Saeed
Parallel Computing and Data Science Lab Technical Reports
Proteogenomics is an emerging field of systems biology research at the intersection of proteomics and genomics. Two high-throughput technologies, Mass Spectrometry (MS) for proteomics and Next Generation Sequencing (NGS) machines for genomics, are required to conduct proteogenomics studies. Independently, both MS and NGS technologies are afflicted by a data deluge, which creates problems of storage, transfer, analysis and visualization. Integrating these big data sets (NGS+MS) for proteogenomics studies compounds all of the associated computational problems. Existing sequential algorithms for analyzing these proteogenomics datasets are inadequate for big data, and high performance computing (HPC) solutions are almost non-existent. The purpose of ...
A Machine Learning Approach To Post-Market Surveillance Of Medical Devices, 2015 Yale University
A Machine Learning Approach To Post-Market Surveillance Of Medical Devices, Jonathan Bates, Shu-Xia Li, Craig Parzynski, Ronald Coifman, Harlan Krumholz, Joseph Ross
Yale Day of Data
Post-market surveillance is a collection of processes and activities used by product manufacturers and regulators, such as the U.S. Food and Drug Administration (FDA), to monitor the safety and effectiveness of medical devices once they are available for use “on the market.” These activities are designed to generate information to identify poorly performing devices and other safety problems, accurately characterize real-world device performance and clinical outcomes, and facilitate the development of new devices, or new uses for existing devices. Typically, a device is monitored by comparing adverse events in the exposed population to a matched unexposed population. This research ...
K-Mer Analysis On Developmental And Housekeeping Enhancer Peaks, 2015 Yale University
K-Mer Analysis On Developmental And Housekeeping Enhancer Peaks, Yunsi Yang, Anurag Sethi, Mark Gerstein
Yale Day of Data
The regulation of gene expression involves interaction between transcriptional enhancers and core promoters. However, the separation between developmental and housekeeping gene regulation remains unknown. Here, we present a method to detect whether different core promoters exhibit specificity to certain enhancers within massively parallel assays for enhancer detection. We use k-mers of various lengths (3-8 bp) as sequence features and compare k-mer frequencies between developmental and housekeeping enhancers. This method shows promoter specificity of enhancers in D. melanogaster.
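The k-mer comparison described in this abstract can be sketched roughly as follows. This is an illustrative sketch only: the function names, the pooling of counts across sequences, and the toy sequences are assumptions, and the authors' actual pipeline and statistics are not given in the abstract.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequences, k):
    """Relative k-mer frequencies pooled over a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def frequency_differences(set_a, set_b, k):
    """Frequency in set_a minus frequency in set_b for every possible k-mer."""
    freq_a = kmer_frequencies(set_a, k)
    freq_b = kmer_frequencies(set_b, k)
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: freq_a.get(kmer, 0.0) - freq_b.get(kmer, 0.0) for kmer in all_kmers}


# Hypothetical toy sequences, not data from the study.
developmental = ["ACGTACGTGA", "TTACGTAC"]
housekeeping = ["GGGCGCGCAT", "CGCGGGTA"]

diffs = frequency_differences(developmental, housekeeping, k=3)
# k-mers with the largest absolute difference between the two enhancer classes.
top = sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)[:5]
for kmer, delta in top:
    print(kmer, round(delta, 3))
```

Running the same comparison for each k from 3 to 8 would cover the range of k-mer lengths mentioned above. A real analysis would also need a statistical test to decide which differences are meaningful.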
Inferring Plastid Metabolic Pathways Within The Nonphotosynthetic Free-Living Green Algal Genus Polytomella, 2015 The University of Western Ontario
Inferring Plastid Metabolic Pathways Within The Nonphotosynthetic Free-Living Green Algal Genus Polytomella, Sara Asmail
Electronic Thesis and Dissertation Repository
The advent of photosynthesis facilitated the evolution of aerobic life on Earth. However, species such as Prototheca wickerhamii and Plasmodium falciparum, among many others, have lost photosynthesis and opted for a free-living/parasitic lifestyle. Despite this loss, these species have retained the plastid for its metabolic pathways, without which they would die. Polytomella is a nonphotosynthetic free-living alga, closely related to the photosynthetic model organism Chlamydomonas reinhardtii, and has been shown to lack a plastid genome. I set out to determine Polytomella plastid metabolic pathways using bioinformatics to look for mRNA and DNA homologous sequences matching pathway enzymes in model ...
An Exploration Of The Phylogenetic Placement Of Recently Discovered Ultrasmall Archaeal Lineages, 2015 University of Connecticut - Storrs
An Exploration Of The Phylogenetic Placement Of Recently Discovered Ultrasmall Archaeal Lineages, Jeffrey M. O'Brien
Honors Scholar Theses
In recent years, several new clades within the domain Archaea have been discovered. This is due in part to microbiological sampling of novel environments, and the increasing ability to detect and sequence uncultivable organisms through metagenomic analysis. These organisms share certain features, such as small cell size and streamlined genomes. Reduction in genome size can present difficulties to phylogenetic reconstruction programs. Since there is less genetic data to work with, these organisms often have missing genes in concatenated multiple sequence alignments. Evolutionary biologists have not reached a consensus on the placement of these lineages in the archaeal evolutionary tree. There ...
Designing For Practice Development In A Social Learning System: Communicating Norms And Vicarious Experience, 2015 University of Texas Health Science Center at Houston
Designing For Practice Development In A Social Learning System: Communicating Norms And Vicarious Experience, Claire Loe
UT SBMI Dissertations (Open Access)
Over the past quarter century, the United States has experienced an increase in demand for health services. Expanded use of community health workers (CHWs) has been identified as a strategic response for more effective distribution of healthcare resources by alleviating pressures on clinical personnel and infusing prevention education into the community-to-clinical care continuum. Expansion of the CHW workforce poses many challenges. For CHWs to effectively reduce costs and pressures on the healthcare system, ‘expansion’ implies not only increasing their numbers, but also assuring a workforce that has the capacity to perform in diverse settings. I propose a theoretical framework for ...
Local Sequence Assembly Reveals A High-Resolution Profile Of Somatic Structural Variations In 97 Cancer Genomes, 2015 University of Massachusetts Medical School
Local Sequence Assembly Reveals A High-Resolution Profile Of Somatic Structural Variations In 97 Cancer Genomes, Jiali Zhuang, Zhiping Weng
Open Access Articles
Genomic structural variations (SVs) are pervasive in many types of cancers. Characterizing their underlying mechanisms and potential molecular consequences is crucial for understanding the basic biology of tumorigenesis. Here, we engineered a local assembly-based algorithm (laSV) that detects SVs with high accuracy from paired-end high-throughput genomic sequencing data and pinpoints their breakpoints at single base-pair resolution. By applying laSV to 97 tumor-normal paired genomic sequencing datasets across six cancer types produced by The Cancer Genome Atlas Research Network, we discovered that non-allelic homologous recombination is the primary mechanism for generating somatic SVs in acute myeloid leukemia. This finding contrasts with ...
Constructos Teóricos En Economía Común Informática, 2015 Universidad Nacional de La Matanza
Constructos Teóricos En Economía Común Informática, Rodrigo Lopez-Pablos
Reviewing elements of community, solidarity, and information economics, this work constructs fundamental theoretical abstractions toward a proto-explanation of the role of information and time in explaining the digital and conventional economic fact. In info-aggregate terms, informational emission is situated as the individual and collective micro- and macro-informatic ontological expression of being: the safeguarding of generational civilizational infodiversity. It then argues about the cognitive computational philosophical fallacy behind a mistaken conceptual theoretical presumption in the study and application of artificial logics: their real potential for the generation of hybrid knowledge and the creation of unprecedented knowledge ...
Survey Of Viral Populations Within Lake Michigan Nearshore Waters At Four Chicago Area Beaches, 2015 Loyola University Chicago
Survey Of Viral Populations Within Lake Michigan Nearshore Waters At Four Chicago Area Beaches, Emily Sible, Alexandria Cooper, Kema Malki, Katherine Bruder, Siobhan C. Watkins, Yuriy Fofanov, Catherine Putonti
Biology: Faculty Publications and Other Works
In comparison to the oceans, freshwater environments represent a more diverse community of microorganisms, exhibiting comparatively high levels of variability both temporally and spatially (Maranger and Bird, Microb. Ecol. 31 (1996) 141-151). This level of variability is likely to extend to the world of viruses as well, in particular bacteria-infecting viruses (bacteriophages). Phages are known to influence bacterial diversity, and therefore key processes, in environmental niches across the globe (Clokie et al., Bacteriophage 1 (2011) 31-45; Jacquet et al., Adv. Ocean Limn. 1 (2010) 97-141; Wilhelm and Suttle, Bioscience 49 (1999) 781-788; Bratback et al., Microb. Ecol. 28 (1994) 209-221) ...
Collecting Diverse Microorganisms From Rover Spacecraft, 2015 Chicago State University
Collecting Diverse Microorganisms From Rover Spacecraft, Jennifer I. Jacobs, Arianna Jefferson, Heidi Aronson, James Tan, Wayne Schubert, Parag Vaishampayan
STAR (STEM Teacher and Researcher) Program Posters
The Planetary Protection discipline at NASA’s Jet Propulsion Laboratory develops and implements procedures to prevent both forward and backward contamination between the Earth and solar system bodies. However, there will always be some microorganisms that are resistant to even the strictest sterilization methods. In order to understand the microorganisms found on spacecraft during assembly, and to rapidly identify them, a mass spectrometry approach was developed. As an experimental approach, a custom database was created for a subset of microorganisms in the Planetary Protection Archive. In order to make the database as accurate and efficient as possible, several different procedures ...
Quantitative And Functional Analysis Pipeline For Label-Free Metaproteomics Data And Its Applications, 2015 University of Tennessee - Knoxville
Quantitative And Functional Analysis Pipeline For Label-Free Metaproteomics Data And Its Applications, Lang Ho Lee
Since the first large-scale metaproteome was reported in 2005, metaproteomics has advanced at a tremendous rate in both its quantitative and qualitative metrics. Furthermore, metaproteomics is now being applied as a general tool in microbial ecology in a large variety of environmental studies. Though metaproteomics is becoming a useful and even a standard tool for the microbial ecologist, standardized bioinformatics pipelines are not readily available. Therefore, we developed a quantitative and functional analysis pipeline for metaproteomics (QFAM) to help analyze large and complicated metaproteomics data in a robust and timely fashion with outputs designed to be simple and clearly understood by ...
Development Of A Comprehensive Massively Parallel Sequencing Panel Of Single Nucleotide Polymorphism And Short Tandem Repeat Markers For Human Identification, 2015 University of North Texas Health Science Center at Fort Worth
Development Of A Comprehensive Massively Parallel Sequencing Panel Of Single Nucleotide Polymorphism And Short Tandem Repeat Markers For Human Identification, David H. Warshauer
Theses and Dissertations
Massively parallel sequencing (MPS) technologies allow for the detection of an unparalleled amount of genetic information with unprecedented speed and relative ease. These qualities make the technology desirable for generating DNA profiles that may be uploaded into forensic offender, arrestee, and family reference database files. This doctoral dissertation research was conducted under the hypothesis that MPS, with its exquisitely high throughput, can provide a system whereby reference samples can be typed for a large battery of markers, providing more discrimination power for forensic DNA typing and offering increased opportunities to develop investigative leads. The design and implementation of large marker ...
A Parallel Algorithm For Compression Of Big Next-Generation Sequencing Datasets, 2015 Western Michigan University
A Parallel Algorithm For Compression Of Big Next-Generation Sequencing Datasets, Sandino N. Vargas Perez, Fahad Saeed
Parallel Computing and Data Science Lab Technical Reports
With the advent of high-throughput next-generation sequencing (NGS) techniques, the amount of data being generated presents challenges including storage, analysis and transport of huge datasets. One solution to storage and transmission of data is compression using specialized compression algorithms. However, these specialized algorithms suffer from poor scalability as dataset size increases, and the best available solutions can take hours to compress gigabytes of data. In this paper we introduce paraDSRC, a parallel implementation of the DSRC algorithm using a message passing model that presents a reduction of the compression time complexity by a factor of O(1/p). Our experimental results ...
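The general idea of compressing independent chunks in parallel can be illustrated with a short sketch. This is not paraDSRC: it substitutes `zlib` for the DSRC algorithm and a thread pool for the message passing model, and the chunk size, worker count, and sample records are arbitrary assumptions. With p workers and evenly sized chunks, the ideal compression time is roughly the sequential time divided by p.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_chunks(data, chunk_size):
    """Cut a byte string into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_parallel(data, chunk_size=1 << 20, workers=4):
    """Compress each chunk independently; pool.map keeps the chunks in order."""
    chunks = split_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, chunks))


def decompress(blocks):
    """Reverse compress_parallel by decompressing and concatenating the blocks."""
    return b"".join(zlib.decompress(block) for block in blocks)


# Hypothetical FASTQ-like records used only to exercise the functions.
data = b"@read1\nACGT\n+\nIIII\n" * 1000
blocks = compress_parallel(data, chunk_size=4096, workers=4)
restored = decompress(blocks)
print(len(data), sum(len(block) for block in blocks), restored == data)
```

Compressing chunks independently usually costs some compression ratio compared with one pass over the whole file, which is the trade-off made for parallelism in this sketch.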
Darwin Core Archive File, 2015 Eastern Illinois University
Darwin Core Archive File, Stover-Ebinger Herbarium, Eastern Illinois University
Darwin Core Archive Download
The ZIP file contains occurrences.csv, identifications.csv, and images.csv. The meta.xml document describes the content. Fields within the occurrences.csv file are defined by the Darwin Core exchange standard.
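An archive laid out this way could be read with a short sketch like the one below. It assumes comma-separated, UTF-8 encoded files with a header row; in a real archive those details are declared in meta.xml and should be checked there. The helper names and the in-memory demo archive are hypothetical.

```python
import csv
import io
import zipfile


def read_archive_table(archive, member):
    """Read one CSV member of an open ZipFile into a list of dict rows."""
    with archive.open(member) as raw:
        # Assumption: UTF-8 text with a header row; meta.xml states the real format.
        text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
        return list(csv.DictReader(text))


def load_archive(path_or_file):
    """Return ({csv name: rows}, meta.xml text or None) for an archive."""
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()
        tables = {
            name: read_archive_table(archive, name)
            for name in names
            if name.endswith(".csv")
        }
        meta = archive.read("meta.xml").decode("utf-8") if "meta.xml" in names else None
    return tables, meta


# Hypothetical in-memory archive standing in for the downloaded ZIP file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("occurrences.csv", "id,scientificName\n1,Quercus alba\n")
    demo.writestr("meta.xml", "<archive/>")
buffer.seek(0)

tables, meta = load_archive(buffer)
print(tables["occurrences.csv"][0]["scientificName"])
```

Passing the path of the downloaded ZIP file to `load_archive` instead of the in-memory buffer would work the same way, since `zipfile.ZipFile` accepts either.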
Domain Specific Document Retrieval Framework For Real-Time Social Health Data, 2015 Wright State University - Main Campus
Domain Specific Document Retrieval Framework For Real-Time Social Health Data, Swapnil Soni
With the advent of web search and microblogging, the percentage of Online Health Information Seekers (OHIS) using these online services to share and seek real-time health information has increased exponentially. OHIS use web search engines or microblogging search services to seek out the latest, relevant, and reliable health information. When OHIS turn to microblogging search services to search real-time content, trends, breaking news, etc., the search results are not promising. Two major challenges in current microblogging search engines are their reliance on keyword-based techniques and results that do not contain real-time information. To address these challenges ...