Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Masters Theses

2004

Articles 1 - 10 of 10

Full-Text Articles in Physical Sciences and Mathematics

Maximal Clique Enumeration And Related Tools For Microarray Data Analysis, Nicole E. Baldwin Dec 2004

The purpose of this study was to investigate the utility of exact maximal clique enumeration in DNA microarray analysis, to analyze and improve upon existing exact maximal clique enumeration algorithms, and to develop new clique-based algorithms to assist in the analysis as indicated during the course of the study. As a first test, microarray data sets comprised of pre-classified human lung tissue samples were obtained through the Critical Assessment of Microarray Data Analysis (CAMDA) conference. A combination of exact maximal clique enumeration and approximate dominating set was used to attempt to classify the samples.

In another test, maximal clique enumeration …
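As an illustration of what exact maximal clique enumeration involves, here is a minimal sketch of the well-known Bron–Kerbosch procedure with pivoting. The excerpt above does not say which enumeration algorithm the thesis uses or improves, so this is an assumption chosen for illustration only; the function name `maximal_cliques` and the adjacency-dictionary input format are hypothetical.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph.

    `adj` maps each vertex to the set of its neighbors (assumed input format).
    Illustrative Bron-Kerbosch with pivoting, not the thesis's own algorithm.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates that can extend r,
        # x: vertices already processed (used to reject non-maximal cliques)
        if not p and not x:
            yield frozenset(r)
            return
        # Pick the pivot with the most neighbors in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Toy graph: a triangle 0-1-2 with a pendant edge 2-3.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = set(maximal_cliques(example))
```

On the toy graph the enumeration returns the triangle {0, 1, 2} and the edge {2, 3}. In a microarray setting the vertices would typically be genes and the edges thresholded correlations, but that mapping is not spelled out in the excerpt.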


Crown Reductions And Decompositions: Theoretical Results And Practical Methods, William Henry Suters, III Dec 2004

Two kernelization schemes for the vertex cover problem, an NP-hard problem in graph theory, are compared. The first, crown reduction, is based on the identification of a graph structure called a crown and is relatively new, while the second, LP-kernelization, has been used for some time. A proof of the crown reduction algorithm is presented, the algorithm is implemented, and theorems are proven concerning its performance. Experiments are conducted comparing the performance of crown reduction and LP-kernelization on real-world biological graphs. Next, theorems are presented that provide a logical connection between the crown structure and LP-kernelization. Finally, an …


Probabilistic Suffix Models For Windows Application Behavior Profiling: Framework And Initial Results, Geoffrey Alan Mazeroff Dec 2004

Developing statistical/structural models of code execution behavior is of considerable practical importance. This thesis describes a framework for employing probabilistic suffix models as a means of constructing behavior profiles from code-traces of Windows XP applications. Emphasis is placed on the inference and use of probabilistic suffix trees and automata with new contributions in the area of auxiliary symbol distributions. An initial real-time classification system is discussed and preliminary results of detecting known benign and viral applications are presented.
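To make the idea of a suffix-based behavior profile concrete, the following is a simplified sketch of a variable-order model that predicts the next trace symbol from the longest previously seen suffix of the history. It is not the thesis's framework: it omits the pruning and smoothing that a full probabilistic suffix tree uses, and the names `train_suffix_model`, `next_symbol_distribution`, and `max_order` are hypothetical.

```python
from collections import Counter, defaultdict


def train_suffix_model(trace, max_order=2):
    """Count next-symbol occurrences for every context of length 0..max_order.

    `trace` is a list of symbols (for example, call names); this input
    format is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for i, symbol in enumerate(trace):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            context = tuple(trace[i - k:i])
            counts[context][symbol] += 1
    return counts


def next_symbol_distribution(counts, history, max_order=2):
    """Return P(next symbol) conditioned on the longest known suffix of history."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in counts:
            total = sum(counts[context].values())
            return {s: c / total for s, c in counts[context].items()}
    return {}


# Toy trace alternating between two symbols.
model = train_suffix_model(list("abababab"))
prediction = next_symbol_distribution(model, ["a", "b"])
```

For the toy trace, the context ("a", "b") is always followed by "a", so the predicted distribution puts all mass on "a". A classifier could compare how well several such profiles predict an observed trace, though the excerpt does not describe the thesis's classification rule.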


A Framework For Downloading Wide-Area Files, Rebecca Lynn Collins Dec 2004

The challenge of efficiently retrieving files that are broken into segments and replicated across the wide area is of prime importance to wide-area, peer-to-peer, and Grid file systems. Two different algorithms addressing this challenge have been proposed and evaluated. While both have been successful in different performance scenarios, there has been no unifying work that can view both algorithms under a single framework. In this thesis, we define such a framework, where download algorithms are defined in terms of the four dimensions that the client always controls: the number of simultaneous downloads, the degree of work replication, the failover strategy, and …


A Clustering Method Based On Nonnegative Matrix Factorization For Text Mining, Farial Shahnaz Aug 2004

This study presents a methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection. The methodology involves encoding the text data using a low-rank nonnegative matrix factorization algorithm to retain natural data nonnegativity, thereby eliminating the need to use subtractive basis vector and encoding calculations present in other techniques such as principal component analysis for semantic feature abstraction. Existing techniques for nonnegative matrix factorization are reviewed and a new hybrid technique for nonnegative matrix factorization is proposed. Performance evaluations of the proposed method are conducted on a few benchmark text collections used in standard …


A Whole Genome Phylogeny Using Truncated Pivoted QR Decomposition, Shakhina Abdimajidovna Pulatova Aug 2004

The increasing availability of whole genome sequences in public databases has stimulated the development of new methods to automatically compare and categorize genes and species. Recently developed methods based on the singular value decomposition (SVD) allow for the simultaneous identification and definition of well-conserved motifs and gene families using very large whole genome datasets. In contrast, this work discusses the use of a truncated pivoted QR factorization as a scalable alternative to the SVD for comparing whole genomes in a phylogenetic context. This algorithm computes the R factor of the decomposition without forming the Q factor or altering the …


Finding Functional Gene Relationships Using The Semantic Gene Organizer (SGO), Kevin Erich Heinrich Aug 2004

Understanding functional gene relationships is a major challenge in bioinformatics and computational biology. Currently, many approaches extract gene relationships via term co-occurrence models from the biomedical literature. Unfortunately, however, many genes that are experimentally identified to be related have not been previously studied together. As a result, many automated models fail to help researchers understand the nature of the relationships. In this work, the particular schema used to mine genomic data is called Latent Semantic Indexing (LSI). LSI performs a singular-value decomposition (SVD) to produce a low-rank approximation of the data set. Effectively, it allows queries to be interpreted in a …


Productivity Analysis And Use Of Sequence-Based Specification In A Web-Development Environment, Carla Renee Sparks Aug 2004

This study evaluates the productivity of a software team in a web-development company and assesses the effects of the sequence-based specifications process on productivity and software accuracy in this environment. This study compares two software projects completed at GoTrain Corporation in 2001 and 2002. GoTrain is an application service provider and delivers environmental, safety and health (ES&H) training courses to a variety of clients through an Internet-based learning management system (LMS), called the Academy.

GoTrain was established in 1999 through the merger of two small companies – a training services organization and a web design group. Because neither of the …


Software Reconfigurability For Heterogeneous Robot Cooperation, Maureen Chandra May 2004

Previous work in multi-robot cooperation has aimed at gaining autonomy and fault tolerance in the robot team. Most attempt to accomplish this by dynamically assigning roles or tasks to the robots and pre-designing the solution for heterogeneous robot teams with known sensing capabilities. However, pre-designed solutions fail when changes occur in the robot team composition or in the available environmental sensors at run-time. Automated solution design is thus needed to accomplish autonomy and fault tolerance in multi-agent systems. Very little work has been done in automating robot solutions at run-time due to the difficulty of adapting to many unexpected events …


A Similarity Based Concordance Approach To Word Sense Disambiguation, Ramakrishnan B. Guru Jan 2004

This study attempts to solve the problem of Word Sense Disambiguation using a combination of statistical, probabilistic and word matching algorithms. These algorithms consider that words and sentences have some hidden similarities and that the polysemous words in any context should be assigned to a sense after each execution of the algorithm. The algorithm was tested on sample data, and disambiguation performance improved significantly after the concordance methodology was included.