Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

2009

Selected Works

PDF

Proteins

Full-Text Articles in Engineering

Protein Family Classification Using Structural And Sequence Information, Jennifer A. Smith May 2009

Protein family classification usually relies either on sequence information (as with hidden Markov models and position-specific scoring matrices) or on structural information, where some form of average positional error between atomic locations is used. The positional error method requires that the structures of all the proteins to be classified be known. Sequence methods have the advantage that a much larger number of proteins can be classified, since far more sequences are known than structures. However, sequence methods discard a large amount of useful information contained in the structures of the subset of proteins in the family for …
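
As an illustration of what an "average positional error between the atomic locations" can look like, the sketch below computes a root-mean-square deviation (RMSD) between two equal-length lists of 3D atom coordinates. This is only one common choice of positional error and is not necessarily the measure used in the paper. The function name `rmsd` is hypothetical, and the sketch assumes the two structures are already superimposed and that atoms correspond one-to-one by index.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Assumption: the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between corresponding atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

A measure of this kind needs coordinates for every protein being compared, which is why the structural approach is limited to proteins with known structures.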


Searching For Protein Classification Features, Jennifer A. Smith May 2009

A genetic algorithm is used to search for a set of classification features for a protein superfamily that is as unique as possible to the superfamily. These features may then be used for very fast classification of a query sequence into a protein superfamily. The features are based on windows onto modified consensus sequences of multiple aligned members of a training set for the protein superfamily. The efficacy of the method is demonstrated using receiver operating characteristic (ROC) values, and the performance of the resulting algorithm is compared with other database search algorithms.
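
The evaluation idea in this abstract can be sketched roughly as follows. The scoring function `window_hits` is a hypothetical stand-in (a simple count of exact window matches) and is not the paper's feature definition or its genetic algorithm. `roc_auc` computes the area under the ROC curve from scores and true labels using the standard pairwise-ranking identity, with ties counted as one half.

```python
def window_hits(query, windows):
    """Hypothetical score: number of feature windows found verbatim in the query."""
    return sum(1 for w in windows if w in query)


def roc_auc(scores, labels):
    """Area under the ROC curve.

    Equals the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means every superfamily member outscores every non-member, while 0.5 corresponds to random ranking.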