Experimental data of a single promoter can be used for in silico detection of genes with related regulation in the absence of sequence similarity

Mamm Genome. 2001 Jan;12(1):67-72. doi: 10.1007/s003350010219.

Abstract

Gene expression is presently a major focus in genome analysis, and the experimental data on regulatory mechanisms and functional transcription factor binding sites are steadily growing. However, the annotation of transcriptional regulation of sequences cannot keep pace with the exponential growth of sequence databases. Employing detailed experimental data on a single promoter or enhancer to predict genes with similar regulation would provide a powerful method to link the literature about transcriptional regulation with sequence databases. To this end, we used information on individual functional transcription factor binding sites to compose in silico promoter and enhancer models of muscle-specific genes and to analyze the rodents section of EMBL with these models. Exhaustive evaluation of all hits revealed every second to third match to be a muscle-associated gene. Moreover, functionally related regulatory regions were detected by our model-based approach even in the absence of sequence similarity. We believe that this new approach is a substantial extension to database analysis by BLAST or FASTA, which are restricted to sequence similarity.
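
The idea of a model built from individual binding sites can be illustrated with a minimal sketch. It is not the authors' software or their actual muscle models. It assumes that a model is an ordered list of binding-site consensus patterns with allowed spacings, and that a sequence matches when all sites occur in order within those spacings. The function names, the two-element example model (a GC box followed by an E-box) and the spacing limits are hypothetical, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(seq, consensus):
    """Return all start positions (overlaps included) of a consensus in seq."""
    pattern = "(?=(" + "".join(IUPAC[c] for c in consensus) + "))"
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def match_model(seq, model):
    """Return start positions of the first site for every full model match.

    model is a list of (name, consensus, min_gap, max_gap) tuples. The gap is
    the distance from the end of the previous site to the start of this one;
    it is ignored for the first element. All values here are illustrative.
    """
    def extend(index, prev_end):
        if index == len(model):
            return True
        _, consensus, min_gap, max_gap = model[index]
        for pos in find_sites(seq, consensus):
            if min_gap <= pos - prev_end <= max_gap:
                if extend(index + 1, pos + len(consensus)):
                    return True
        return False

    first_consensus = model[0][1]
    return [
        pos for pos in find_sites(seq, first_consensus)
        if extend(1, pos + len(first_consensus))
    ]


def scan_database(records, model):
    """Return the identifiers of all records that match the model."""
    return [name for name, seq in records.items() if match_model(seq, model)]


# Hypothetical two-element model: a GC box followed within 0-20 bp by an E-box.
EXAMPLE_MODEL = [
    ("GC box", "GGGCGG", 0, 0),
    ("E-box", "CANNTG", 0, 20),
]

# Toy records: seq_a and seq_b share the site arrangement but little else.
EXAMPLE_RECORDS = {
    "seq_a": "TTTGGGCGGAAAAACAGCTGTTT",
    "seq_b": "GGGCGG" + "C" * 12 + "CATATG",
    "seq_c": "GGGCGG" + "A" * 30 + "CAGCTG",  # spacing exceeds the limit
}

matching_ids = scan_database(EXAMPLE_RECORDS, EXAMPLE_MODEL)
```

In this toy setup, seq_a and seq_b match the model although their flanking and spacer sequences differ, which is the kind of relationship an alignment-based search would not be expected to report. A realistic model would presumably use weight matrices rather than fixed consensus strings and would scan both strands.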

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Binding Sites
  • Calcium-Transporting ATPases / genetics
  • Computational Biology*
  • Databases, Factual*
  • Enhancer Elements, Genetic
  • Gene Expression Regulation*
  • Models, Genetic
  • Molecular Sequence Data
  • Muscles / metabolism
  • Promoter Regions, Genetic*
  • Rodentia / genetics
  • Sarcoplasmic Reticulum Calcium-Transporting ATPases
  • Sp1 Transcription Factor / metabolism
  • Transcription Factors / metabolism*

Substances

  • Sp1 Transcription Factor
  • Transcription Factors
  • Sarcoplasmic Reticulum Calcium-Transporting ATPases
  • Calcium-Transporting ATPases

Associated data

  • GENBANK/AF042092
  • GENBANK/AF051909
  • GENBANK/D00618
  • GENBANK/D17553
  • GENBANK/J04971
  • GENBANK/L21905
  • GENBANK/M13756
  • GENBANK/M33834
  • GENBANK/M36684
  • GENBANK/M37984
  • GENBANK/M57409
  • GENBANK/U49920
  • GENBANK/X73887