An information-theoretic approach to the prediction of protein structural class

J Comput Chem. 2010 Apr 30;31(6):1201-6. doi: 10.1002/jcc.21406.

Abstract

An information-theoretic approach, which combines a sequence decomposition technique with a fuzzy clustering algorithm, is proposed for the prediction of protein structural class. This approach can bypass the selection and comparison of sequence features required by previous methods. First, the distance between each pair of protein sequences is estimated using a conditional decomposition technique from information theory. Then, the fuzzy k-nearest neighbor algorithm is used to identify the structural class of a protein given a set of sample sequences. To verify the strength of our method, we chose three widely used datasets constructed by Chou and Zhou. The jackknife test shows that our approach improves prediction accuracy over existing methods.
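The classification step can be illustrated with a minimal Python sketch of a fuzzy k-nearest neighbor vote evaluated by a jackknife (leave-one-out) test. It assumes the pairwise distance matrix has already been computed by some sequence distance; the conditional decomposition distance itself is not reproduced here. The function names, the fuzzifier `m`, the choice of `k`, and the toy distances are illustrative assumptions, not values taken from the paper.

```python
from collections import defaultdict


def fuzzy_knn(dist_to_train, train_labels, k=3, m=2.0, eps=1e-12):
    """Return fuzzy class memberships for one query sequence.

    dist_to_train[j] is the distance from the query to training item j.
    Training labels are treated as crisp (an assumption of this sketch).
    """
    # Indices of the k training items closest to the query.
    nearest = sorted(range(len(train_labels)), key=lambda j: dist_to_train[j])[:k]
    # Inverse-distance weights; eps guards against division by zero.
    exponent = 2.0 / (m - 1.0)
    weights = {j: 1.0 / (dist_to_train[j] ** exponent + eps) for j in nearest}
    total = sum(weights.values())
    memberships = defaultdict(float)
    for j, w in weights.items():
        memberships[train_labels[j]] += w / total
    return dict(memberships)


def jackknife_accuracy(dist, labels, k=3, m=2.0):
    """Leave each item out in turn, classify it from the rest, and score."""
    n = len(labels)
    correct = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        d = [dist[i][j] for j in others]
        lab = [labels[j] for j in others]
        u = fuzzy_knn(d, lab, k=k, m=m)
        predicted = max(u, key=u.get)  # class with the highest membership
        correct += predicted == labels[i]
    return correct / n


# Toy symmetric distance matrix, invented purely for illustration.
TOY_DIST = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.90, 0.20, 0.00],
]
TOY_LABELS = ["alpha", "alpha", "beta", "beta"]
```

Calling `jackknife_accuracy(TOY_DIST, TOY_LABELS, k=3)` scores each toy sequence against the remaining three. On real data, the matrix would hold the information-theoretic distances between all pairs of sequences in the dataset.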

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Information Theory
  • Models, Chemical*
  • Proteins / chemistry*
  • Proteins / classification
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins