Open Access. Powered by Scholars. Published by Universities.®

Electrical and Computer Engineering Commons

Boise State University

Biology computing

Articles 1 - 7 of 7

Full-Text Articles in Electrical and Computer Engineering

Computation Intelligence Method To Find Generic Non-Coding RNA Search Models, Jennifer A. Smith, May 2010

Electrical and Computer Engineering Faculty Publications and Presentations

Fairly effective methods exist for finding new noncoding RNA (ncRNA) genes using search models based on known families of ncRNA genes (for example, covariance models). However, these models only find new members of the existing families and are not useful in finding potential members of novel ncRNA families. Other problems with family-specific search include large processing requirements, ambiguity in defining which sequences form a family, and a lack of sufficient numbers of known sequences to properly estimate model parameters. An ncRNA search model is proposed which includes a collection of non-overlapping RNA hairpin structure covariance models. The hairpin models are chosen from …


RNA Gene Finding With Biased Mutation Operators, Jennifer A. Smith, Apr 2007

Electrical and Computer Engineering Faculty Publications and Presentations

The use of genetic algorithms for non-coding RNA gene finding has previously been investigated and found to be a potentially viable method for accelerating covariance-model-based database search relative to full dynamic-programming methods. The mutation operators in previous work chose new alignment insertion and deletion locations uniformly over the length of the model consensus sequence. Since the covariance models are estimated from multiple known members of a non-coding RNA family, information is available as to the likelihood of insertions or deletions at the individual model positions. This information is implicit in the state-transition parameters of the estimated covariance models. In the …
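The contrast between uniform and biased selection of insertion/deletion locations can be illustrated with a minimal sketch. It is not the paper's operator: the function name `sample_indel_position` and the per-position probabilities in `indel_probs` are hypothetical, standing in for values that would be derived from a covariance model's state-transition parameters.

```python
import random

def sample_indel_position(indel_probs, rng=random):
    """Pick a consensus position with probability proportional to its
    (hypothetical) insertion/deletion likelihood."""
    total = sum(indel_probs)
    if total <= 0:
        # No usable bias: fall back to a uniform choice over positions.
        return rng.randrange(len(indel_probs))
    positions = range(len(indel_probs))
    return rng.choices(positions, weights=indel_probs, k=1)[0]

# Illustrative values only; a real model would supply these.
indel_probs = [0.01, 0.01, 0.30, 0.02, 0.01, 0.25, 0.01]

rng = random.Random(0)
counts = [0] * len(indel_probs)
for _ in range(10000):
    counts[sample_indel_position(indel_probs, rng)] += 1
```

Under these assumed weights, mutations concentrate on positions 2 and 5, where indels are most likely, instead of being spread evenly along the consensus.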


A Genetic Algorithms Approach To Non-Coding RNA Gene Searches, Jennifer A. Smith, Jul 2006

Electrical and Computer Engineering Faculty Publications and Presentations

A genetic algorithm is proposed as an alternative to the traditional dynamic programming method for scoring covariance models in non-coding RNA (ncRNA) gene searches. The standard method is guaranteed to find the best score, but it is too slow for general use. The observation that most of the search space investigated by the dynamic programming method does not even remotely resemble any observed sequence in real sequence data can be used to motivate the use of genetic algorithms (GAs) to quickly reject regions of the search space. A search space with many local minima makes gradient descent an unattractive alternative. …


Accelerated Non-Coding RNA Searches With Covariance Model Approximations, Jennifer A. Smith, Jul 2006

Electrical and Computer Engineering Faculty Publications and Presentations

Covariance models (CMs) are a very sensitive tool for finding non-coding RNA (ncRNA) genes in DNA sequence data. However, CMs are extremely slow, in part because they allow all possible combinations of insertions and deletions relative to the consensus model even though the vast majority of these are never seen in practice. In this paper we examine reducing the number of states in covariance models. A simplified CM with reduced states, which can be scored much faster, is introduced. A comparison of the results of a full CM versus a reduced-state model found …


Searching For Protein Classification Features, Jennifer A. Smith, Jan 2005

Electrical and Computer Engineering Faculty Publications and Presentations

A genetic algorithm is used to search for a set of classification features for a protein superfamily which is as unique as possible to the superfamily. These features may then be used for very fast classification of a query sequence into a protein superfamily. The features are based on windows onto modified consensus sequences of multiple aligned members of a training set for the protein superfamily. The efficacy of the method is demonstrated using receiver operating characteristic (ROC) values, and the performance of the resulting algorithm is compared with other database search algorithms.


An Asynchronous GALS Interface With Applications, Jennifer A. Smith, Jan 2004

Electrical and Computer Engineering Faculty Publications and Presentations

A low-latency asynchronous interface for use in globally-asynchronous locally-synchronous (GALS) integrated circuits is presented. The interface is compact and does not alter the local clocks of the interfaced local clock domains in any way (unlike many existing GALS interfaces). Two applications of the interface to GALS systems are shown. The first is a single-chip shared-memory multiprocessor for generic supercomputing use. The second is an application-specific coprocessor for hardware acceleration of the Smith-Waterman algorithm. This is a bioinformatics algorithm used for sequence alignment (similarity searching) between DNA or amino acid (protein) sequences and sequence databases such as the recently completed human …
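For readers unfamiliar with the Smith-Waterman algorithm mentioned above, a minimal software sketch of its local-alignment scoring recurrence follows. It says nothing about the paper's hardware coprocessor; the function name and the scoring values (`match`, `mismatch`, and a linear `gap` penalty) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local-alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming matrix, since only the score is needed.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # Clamping at 0 lets a local alignment restart anywhere.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best

score = smith_waterman_score("XXACGTXX", "YYACGTYY")
```

Each matrix cell depends only on its upper, left, and upper-left neighbours, so cells along an anti-diagonal are independent of one another; that regular dependency pattern is what generally makes the algorithm a candidate for hardware acceleration.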


Protein Family Classification Using Structural And Sequence Information, Jennifer A. Smith, Jan 2004

Electrical and Computer Engineering Faculty Publications and Presentations

Protein family classification usually relies on sequence information (as in the case of hidden Markov models and position-specific scoring matrices) or on structural information where some sort of average positional error between the atomic locations is used. The positional error method requires that the structures of all the proteins to be classified are known. Sequence methods have the advantage that a much larger number of proteins can be classified (since far more sequences are known than structures). However, sequence methods discard a large amount of useful information contained in the structures of the subset of proteins in the family for …