Comparative homology agreement search: an effective combination of homology-search methods

Proc Natl Acad Sci U S A. 2004 Sep 21;101(38):13814-9. doi: 10.1073/pnas.0405612101. Epub 2004 Sep 14.

Abstract

Many methods have been developed to search databases for homologous members of a protein family, and the reliability of results and conclusions may be compromised if only one method is used and the others are neglected. Here we introduce a general scheme for combining such methods. Based on this scheme, we implemented a tool called comparative homology agreement search (chase) that integrates different search strategies to obtain a combined "E value." Our results show that a consensus method integrating distinct strategies easily outperforms any of its component algorithms. More specifically, an evaluation based on the Structural Classification of Proteins (SCOP) database reveals that, on average, a coverage of 47% can be obtained in searches for distantly related homologues (i.e., members of the same superfamily but not the same family, a very difficult task) when only 10 false positives are accepted, whereas the individual methods obtain a coverage of 28-38%.
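
The coverage figures quoted in the abstract can be read as "the fraction of true homologues recovered before more than 10 false positives have been accepted." The following is a minimal sketch of that metric for a single query, not the authors' evaluation code. It assumes the hits are already ranked from best to worst and labelled as true or false homologues; the function name, its parameters, and the toy data are hypothetical.

```python
from typing import Sequence


def coverage_at_false_positives(
    ranked_hits: Sequence[bool],
    total_true: int,
    max_false_positives: int = 10,
) -> float:
    """Fraction of true homologues found while accepting a limited
    number of false positives.

    ranked_hits: hit labels sorted best score first
                 (True = true homologue, False = false positive).
    total_true:  number of true homologues of the query in the database.
    """
    if total_true <= 0:
        raise ValueError("total_true must be positive")

    true_found = 0
    false_found = 0
    for is_true in ranked_hits:
        if is_true:
            true_found += 1
        else:
            false_found += 1
            # Stop at the first false positive beyond the accepted budget.
            if false_found > max_false_positives:
                break
    return true_found / total_true


# Toy ranking (assumed data): 4 true homologues retrieved out of 8 in total.
toy_hits = [True, True, False, True, False, False, True]
print(coverage_at_false_positives(toy_hits, total_true=8, max_false_positives=2))
```

Averaging this value over many queries would give a mean coverage comparable in spirit to the 47% and 28-38% figures above, although the exact averaging procedure is not specified in the abstract.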

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein*
  • Enzymes / chemistry
  • Enzymes / genetics
  • Evolution, Molecular*
  • False Positive Reactions
  • Phylogeny
  • Proteins / chemistry
  • Proteins / genetics
  • Reproducibility of Results

Substances

  • Enzymes
  • Proteins