Development and public release of a comprehensive hepatitis virus database

Hepatol Res. 2008 Mar;38(3):234-43. doi: 10.1111/j.1872-034X.2007.00262.x. Epub 2007 Sep 17.

Abstract

Aim: Currently, approximately 44 000 hepatitis C virus (HCV), 11 000 hepatitis B virus (HBV), and 1600 hepatitis E virus (HEV) sequences are available at the International Nucleotide Sequence Database Collaboration (INSDC, previously known as DDBJ/EMBL/GenBank), and the number of these virus sequences is growing rapidly. However, because INSDC is not specialized for hepatitis viruses, it is difficult to retrieve information of virological or clinical interest from it. It is therefore worthwhile to construct a specialized database for hepatitis virus sequences and to make it accessible to researchers worldwide.

Methods: We developed a web-based database, the Hepatitis Virus Database (HVDB), which contains all the HCV, HBV, and HEV sequences available at INSDC. In the HVDB, all partial sequences obtained from INSDC are mapped onto the genome sequence of each virus. The database also provides, for each virus, the phylogenetic relationships among variants at each locus on the genome.

Results: Users of the database can easily retrieve entries (sequences with annotations) of a specific genotype by referring to the phylogenetic relationships, or entries covering specific loci by referring to the genome map information. HVDB also provides users with a tool for phylogenetic analysis that can be used in combination with the data retrieval tools.
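
The two retrieval modes described above (by genotype and by genome locus) can be illustrated with a minimal Python sketch. This is not the HVDB implementation: the `Entry` record, the `retrieve` function, the accession strings, and all coordinates are hypothetical and chosen only for illustration. It assumes each entry has already been mapped to 1-based inclusive coordinates on a reference genome and assigned a genotype.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """A sequence entry mapped onto a reference genome (hypothetical schema)."""
    accession: str  # made-up identifier, not a real INSDC accession
    virus: str      # "HCV", "HBV", or "HEV"
    genotype: str   # genotype label, assumed to be assigned beforehand
    start: int      # 1-based inclusive start on the reference genome
    end: int        # 1-based inclusive end on the reference genome


def overlaps(entry: Entry, locus_start: int, locus_end: int) -> bool:
    """Return True if the entry shares at least one position with the locus."""
    return entry.start <= locus_end and entry.end >= locus_start


def retrieve(entries: List[Entry], virus: str,
             genotype: Optional[str] = None,
             locus: Optional[Tuple[int, int]] = None) -> List[Entry]:
    """Filter entries by virus, then optionally by genotype and by locus."""
    hits = []
    for e in entries:
        if e.virus != virus:
            continue
        if genotype is not None and e.genotype != genotype:
            continue
        if locus is not None and not overlaps(e, locus[0], locus[1]):
            continue
        hits.append(e)
    return hits


# Invented example data; coordinates do not describe real sequences.
ENTRIES = [
    Entry("EX0001", "HCV", "1b", 1, 9400),
    Entry("EX0002", "HCV", "1b", 300, 900),
    Entry("EX0003", "HCV", "2a", 6000, 7000),
    Entry("EX0004", "HBV", "C", 1, 3200),
]

if __name__ == "__main__":
    for e in retrieve(ENTRIES, "HCV", genotype="1b", locus=(500, 800)):
        print(e.accession, e.start, e.end)
```

Combining both filters in one call mirrors the idea of using the genome map and the genotype information together; a real system would likely index entries by coordinate rather than scan them linearly.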

Conclusion: The latest release is publicly accessible at the HVDB website: http://s2as02.genes.nig.ac.jp.