Sequence assembly with CAFTOOLS

Genome Res. 1998 Mar;8(3):260-7. doi: 10.1101/gr.8.3.260.

Abstract

Large-scale genomic sequencing requires a software infrastructure to support and integrate applications that are not directly compatible. We describe a suite of software tools built around the Common Assembly Format (CAF), a comprehensive representation of a sequence assembly as a text file. These tools form the backbone of sequencing informatics at the Sanger Centre and the Genome Sequencing Center. CAF is intentionally flexible, and our Perl and C libraries, which parse and manipulate it, provide powerful tools for creating new applications as well as wrappers to incorporate other software. The tools are available free by anonymous FTP from ftp://ftp.sanger.ac.uk/pub/badger/.
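
To make the idea of an assembly stored as a parseable text file concrete, the following is a minimal Python sketch. It is not the CAFTOOLS Perl or C library, and the abstract does not specify the file layout. The sketch assumes records are blank-line-separated blocks, each opening with a "Type : name" header. The function name parse_caf_like, the sample record, and the keywords in it (Sequence, DNA, BaseQuality, Is_read, Unpadded) are illustrative assumptions, not taken from the abstract.

```python
import re
from collections import defaultdict


def parse_caf_like(text):
    """Group blank-line-separated blocks by the name in their header line.

    Assumes (hypothetically) that each block starts with 'Type : name'
    and that the remaining lines are the body of that block.
    Returns {name: {type: [body lines]}}.
    """
    objects = defaultdict(dict)
    # Split on one or more blank lines between blocks.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        kind, sep, name = lines[0].partition(" : ")
        if not sep:
            # Not a 'Type : name' header; skip the block in this sketch.
            continue
        objects[name.strip()][kind.strip()] = lines[1:]
    return dict(objects)


# Illustrative input only; the layout and keywords are assumptions.
SAMPLE = """\
Sequence : read1
Is_read
Unpadded

DNA : read1
ACGTACGT

BaseQuality : read1
20 20 30 30 30 30 20 20
"""

if __name__ == "__main__":
    parsed = parse_caf_like(SAMPLE)
    print(parsed["read1"]["DNA"])
```

Keying the result by object name keeps the sequence, its bases, and its quality values together, which is one plausible way a wrapper could pass a single read or contig to another program.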

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Base Sequence*
  • Computational Biology / methods
  • Databases, Factual
  • Gene Library
  • Genome*
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*