Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 8 of 8

Full-Text Articles in Physical Sciences and Mathematics

Word Sense Disambiguation In Biomedical Ontologies With Term Co-Occurrence Analysis And Document Clustering, Bill Andreopoulos, Dimitra Alexopoulou, Michael Schroeder Sep 2008

Faculty Publications, Computer Science

With more and more genomes being sequenced, a lot of effort is devoted to their annotation with terms from controlled vocabularies such as the Gene Ontology. Manual annotation based on relevant literature is tedious, but automation of this process is difficult. One particularly challenging problem is word sense disambiguation. Terms such as "development" can refer to developmental biology or to the more general sense. Here, we present two approaches to address this problem by using term co-occurrences and document clustering. To evaluate our method we defined a corpus of 331 documents on development and developmental biology. Term co-occurrence analysis achieves an …
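The co-occurrence idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the cue-term sets, the sense labels, and the function name `disambiguate` are all hypothetical, chosen only to show how terms that co-occur with "development" might vote for one sense or the other.

```python
import re

# Hypothetical cue terms per sense; a real system would learn these from a corpus.
SENSE_CUES = {
    "developmental biology": {"embryo", "morphogenesis", "larva", "differentiation"},
    "general": {"software", "method", "tool", "drug"},
}

def disambiguate(text):
    """Pick the sense whose cue terms co-occur most often with the ambiguous term."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    # Score each sense by how many of its cue terms appear in the text.
    scores = {sense: len(tokens & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)  # ties resolve to the first sense listed
    return best, scores

biology_sense, _ = disambiguate("Embryo development requires cell differentiation")
general_sense, _ = disambiguate("Development of a new software tool")
```

In this toy setup the first sentence matches two biology cues and the second matches two general cues, so each is assigned the corresponding sense.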


Word Sense Disambiguation In Biomedical Ontologies With Term Co-Occurrence Analysis And Document Clustering, Bill Andreopoulos, Dimitra Alexopoulou, Michael Schroeder Sep 2008

William B. Andreopoulos

With more and more genomes being sequenced, a lot of effort is devoted to their annotation with terms from controlled vocabularies such as the Gene Ontology. Manual annotation based on relevant literature is tedious, but automation of this process is difficult. One particularly challenging problem is word sense disambiguation. Terms such as "development" can refer to developmental biology or to the more general sense. Here, we present two approaches to address this problem by using term co-occurrences and document clustering. To evaluate our method we defined a corpus of 331 documents on development and developmental biology. Term co-occurrence analysis achieves an …


Semantics And Services Enabled Problem Solving Environment For Trypanosoma Cruzi, Amit P. Sheth, Rick L. Tarleton, Mark Musen, Satya S. Sahoo, Prashant Doshi, Natasha Noy Jan 2008

Kno.e.sis Publications

No abstract provided.


On The Tradeoff Between Speedup And Energy Consumption In High Performance Computing – A Bioinformatics Case Study, Sachin Pawaskar, Hesham Ali Jan 2008

Computer Science Faculty Proceedings & Presentations

High Performance Computing has been very useful to researchers in the Bioinformatics, Medical and related fields. The bioinformatics domain is rich in applications that require extracting useful information from very large and continuously growing sequence databases. Automated techniques such as DNA sequencers, DNA microarrays and others are continually growing the datasets stored in large public databases such as GenBank and the Protein Data Bank. Most methods used for analyzing genetic/protein data have been found to be extremely computationally intensive, providing motivation for the use of powerful computers or systems with high throughput characteristics. In this paper, we provide a …


Graphics Processor Based Implementation Of Bioinformatics Codes, Andrew Bellenir, Christian Trefftz, Greg Wolffe Jan 2008

Student Summer Scholars Manuscripts

We created a powerful computing platform based on video cards with the goal of accelerating the performance of bioinformatics codes. To satisfy the demands of the video gaming industry, modern graphics processing units (GPUs) have become very advanced computational devices, using a large set of stream processors to render multiple pixels in parallel. Recently, computer scientists have taken interest in a GPU's ability to execute a single instruction on multiple data (SIMD computation) for general applications, as opposed to graphics processing only. This is known as general purpose computation on a graphics processing unit, or GPGPU.

Our project was comprised …


The Impact Of Directionality In Predications On Text Mining, Gondy Leroy, Marcelo Fiszman, Thomas C. Rindflesch Jan 2008

CGU Faculty Publications and Research

The number of publications in biomedicine is increasing enormously each year. To help researchers digest the information in these documents, text mining tools are being developed that present co-occurrence relations between concepts. Statistical measures are used to mine interesting subsets of relations. We demonstrate how directionality of these relations affects interestingness. Support and confidence, simple data mining statistics, are used as proxies for interestingness metrics. We first built a test bed of 126,404 directional relations extracted from biomedical abstracts, which we represent as graphs containing a central starting concept and 2 rings of associated relations. We manipulated directionality in four …
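To make the role of direction concrete, here is a small sketch using the standard association-rule definitions of support and confidence; the abstract does not spell out the exact formulas used, so these definitions are an assumption. The relation pairs and the function name `support_confidence` are invented for illustration only.

```python
from collections import Counter

def support_confidence(relations):
    """Compute (support, confidence) for each directed (subject, object) pair.

    support    = count(subject -> object) / total number of relations
    confidence = count(subject -> object) / count(relations starting at subject)
    """
    total = len(relations)
    pair_counts = Counter(relations)
    subject_counts = Counter(subject for subject, _ in relations)
    return {
        (s, o): (n / total, n / subject_counts[s])
        for (s, o), n in pair_counts.items()
    }

# Toy directed relations (hypothetical, not taken from the paper's test bed).
relations = [
    ("aspirin", "headache"),
    ("aspirin", "headache"),
    ("aspirin", "fever"),
    ("headache", "aspirin"),
]
stats = support_confidence(relations)
```

Because the pairs are ordered, `("aspirin", "headache")` and `("headache", "aspirin")` are scored separately: here the first has support 0.5 and confidence 2/3, while the reverse has support 0.25 and confidence 1.0. Ignoring direction would merge them into a single count.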


Algorithmic Techniques Employed In The Isolation Of Codon Usage Biases In Prokaryotic Genomes, Douglas W. Raiford III Jan 2008

Browse all Theses and Dissertations

While genomic sequencing projects are an abundant source of information for biological studies ranging from the molecular to the ecological in scale, much of the information present may yet be hidden from casual analysis. One such information domain, trends in codon usage, can provide a wealth of information about an organism's genes and their expression. Degeneracy in the genetic code allows more than one triplet codon to code for the same amino acid, and usage of these codons is often biased such that one or more of these synonymous codons is preferred. Isolation of translational efficiency bias can have important …
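As a rough illustration of synonymous codon bias, the sketch below counts how often each codon in a synonymous family is used within a coding sequence. It is not the thesis's technique: it covers only two amino acid families (leucine and lysine, whose codons are standard), and the toy sequence and the function name `codon_fractions` are hypothetical.

```python
from collections import Counter

# Two synonymous codon families from the standard genetic code (DNA alphabet).
SYNONYMOUS = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Lys": ["AAA", "AAG"],
}

def codon_fractions(seq):
    """Return, per amino acid, the fraction of its occurrences encoded by each codon."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    fractions = {}
    for amino_acid, family in SYNONYMOUS.items():
        total = sum(counts[c] for c in family)
        if total:  # skip amino acids absent from the sequence
            fractions[amino_acid] = {c: counts[c] / total for c in family}
    return fractions

toy_gene = "CTGCTGCTGTTAAAAAAG"  # made-up in-frame sequence
fractions = codon_fractions(toy_gene)
```

In the toy sequence, leucine is encoded by CTG three times out of four, a strong preference, while the two lysine codons are used equally. A real analysis would cover all amino acids and compare genes against a reference set.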


Medical Language Processing For Patient Diagnosis Using Text Classification And Negation Labelling, Brian Mac Namee, John D. Kelleher, Sarah Jane Delany Jan 2008

Conference papers

This paper describes the approach of the DIT AIGroup to the i2b2 Obesity Challenge to build a system to diagnose obesity and related co-morbidities from narrative, unstructured patient records. Based on experimental results, a system was developed that used knowledge-light text classification with decision trees, together with negation labelling.