Mapping the C. elegans noncoding transcriptome with a whole-genome tiling microarray

Genome Res. 2007 Oct;17(10):1471-7. doi: 10.1101/gr.6611807. Epub 2007 Sep 4.

Abstract

The number of annotated protein-coding genes in the genome of Caenorhabditis elegans is similar to that of other animals, but the extent of its non-protein-coding transcriptome remains unknown. Expression profiling on whole-genome tiling microarrays applied to a mixed-stage C. elegans population verified the expression of 71% of all annotated exons. Only a small fraction (11%) of the polyadenylated transcription is non-annotated and appears to consist of approximately 3200 missed or alternative exons and 7800 small transcripts of unknown function (TUFs). Almost half (44%) of the detected transcriptional output is non-polyadenylated and probably not protein coding, and of this, 70% overlaps the boundaries of protein-coding genes in a complex manner. Specific analysis of small non-polyadenylated transcripts verified 97% of all annotated small ncRNAs and suggested that the transcriptome contains approximately 1200 small (<500 nt) unannotated noncoding loci. After combining overlapping transcripts, we estimate that at least 70% of the total C. elegans genome is transcribed.
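
The final estimate depends on combining overlapping transcripts so that no base is counted twice. The sketch below illustrates that idea only; it is not the authors' pipeline. The function name `merged_coverage`, the half-open interval convention, and the toy coordinates are all assumptions made for the example.

```python
def merged_coverage(intervals, genome_length):
    """Return the fraction of a genome covered by the union of intervals.

    Hypothetical helper: intervals are half-open (start, end) pairs on a
    single coordinate axis. Overlapping or touching intervals are merged
    first, so each base contributes to the total at most once.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap (or adjacency): extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Toy data, not taken from the study: three transcripts on a 100-base genome.
toy_transcripts = [(0, 50), (30, 80), (90, 100)]
fraction = merged_coverage(toy_transcripts, 100)
```

In this toy case, summing the transcript lengths directly would give 110 bases on a 100-base genome, whereas merging first gives 90 bases covered, which is why overlapping transcripts have to be combined before a genome-wide fraction is reported.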

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Caenorhabditis elegans / genetics*
  • Caenorhabditis elegans Proteins / genetics
  • Chromosome Mapping / methods*
  • Exons
  • Gene Expression Profiling
  • Genome, Helminth
  • Genomics
  • Oligonucleotide Array Sequence Analysis / methods*
  • RNA, Helminth / genetics
  • RNA, Untranslated / genetics
  • Transcription, Genetic

Substances

  • Caenorhabditis elegans Proteins
  • RNA, Helminth
  • RNA, Untranslated