An integrated model for detecting significant chromatin interactions from high-resolution Hi-C data

Nat Commun. 2017 May 17:8:15454. doi: 10.1038/ncomms15454.

Abstract

Here we present HiC-DC, a principled method to estimate the statistical significance (P values) of chromatin interactions from Hi-C experiments. HiC-DC uses hurdle negative binomial regression to account for systematic sources of variation in Hi-C read counts (for example, distance-dependent random polymer ligation, GC content and mappability bias) and to model zero inflation and overdispersion. Applied to high-resolution Hi-C data in a lymphoblastoid cell line, HiC-DC detects significant interactions at the sub-topologically associating domain level, identifying potential structural and regulatory interactions supported by CTCF binding sites, DNase accessibility, and/or active histone marks. CTCF-associated interactions are most strongly enriched in the middle genomic distance range (∼700 kb to 1.5 Mb), while interactions involving actively marked DNase-accessible elements are enriched at both short (<500 kb) and longer (>1.5 Mb) genomic distances. There is a striking enrichment of longer-range interactions connecting replication-dependent histone genes on chromosome 6, potentially representing the chromatin architecture at the histone locus body.
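To make the statistical idea concrete, the following is a minimal Python sketch of how a hurdle negative binomial model can assign a P value to an observed read count. It is an illustration under stated assumptions, not the HiC-DC implementation: the function names, the log-linear form of `expected_mean`, and all coefficient values are hypothetical, and the model fitting step is omitted entirely. The hurdle part sets the probability of a zero count directly (zero inflation), while positive counts follow a zero-truncated negative binomial whose dispersion parameter absorbs overdispersion.

```python
import math


def nb_pmf(k, mu, theta):
    """Negative binomial PMF with mean mu and dispersion (size) theta."""
    log_p = (
        math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def hurdle_nb_pmf(k, p_zero, mu, theta):
    """Hurdle NB PMF: zeros have probability p_zero; positive counts
    follow a zero-truncated negative binomial scaled by (1 - p_zero)."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * nb_pmf(k, mu, theta) / (1.0 - nb_pmf(0, mu, theta))


def upper_tail_p_value(observed, p_zero, mu, theta):
    """P(Y >= observed) under the hurdle NB background model."""
    if observed <= 0:
        return 1.0
    below = sum(hurdle_nb_pmf(k, p_zero, mu, theta) for k in range(observed))
    return max(0.0, 1.0 - below)


def expected_mean(distance_bp, gc, mappability,
                  b0=8.0, b_dist=-0.6, b_gc=0.5, b_map=1.0):
    """Hypothetical log-link mean for a bin pair. The covariates mirror
    those named in the abstract; the functional form and the default
    coefficients are made up for illustration, not fitted values."""
    return math.exp(b0 + b_dist * math.log(distance_bp)
                    + b_gc * gc + b_map * mappability)


if __name__ == "__main__":
    # Toy bin pair 500 kb apart; p_zero and theta are assumed, not estimated.
    mu = expected_mean(distance_bp=500_000, gc=0.45, mappability=0.9)
    print(upper_tail_p_value(observed=12, p_zero=0.6, mu=mu, theta=2.0))
```

In a real analysis the zero probability, the mean and the dispersion would all be estimated by regression on the covariates, and the resulting P values would then be corrected for multiple testing across bin pairs.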

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Binding Sites / genetics
  • Cell Line, Tumor
  • Chromatin / genetics
  • Chromatin / metabolism*
  • Chromosome Mapping / methods
  • Chromosomes, Human, Pair 6 / genetics
  • Chromosomes, Human, Pair 6 / metabolism
  • Computational Biology / methods*
  • CpG Islands / genetics
  • Datasets as Topic
  • Genome / genetics*
  • Genomics / methods*
  • Histone Code / genetics
  • Humans
  • Mice
  • Models, Genetic*
  • Promoter Regions, Genetic / genetics
  • Software

Substances

  • Chromatin