Estimating haplotype frequencies in pooled DNA samples when there is genotyping error

BMC Genet. 2005 May 19:6:25. doi: 10.1186/1471-2156-6-25.

Abstract

Background: Maximum likelihood estimates of haplotype frequencies can be obtained from pooled DNA using the expectation maximization (EM) algorithm. Through simulation, we investigate the effect of genotyping error on the accuracy of haplotype frequency estimates obtained using this algorithm. We explore model parameters including allele frequency, inter-marker linkage disequilibrium (LD), genotyping error rate, and pool size.
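The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of an EM algorithm for pooled data, not the authors' implementation. It assumes two biallelic markers, that each pool reports the count of the "1" allele at each marker among its k chromosomes, and that haplotypes within a pool are independent draws from the population. The names em_pooled, compatible_configs, and multinomial_prob are hypothetical.

```python
from math import factorial

# Haplotype order used throughout: 00, 01, 10, 11 (marker A allele, marker B allele).

def multinomial_prob(counts, freqs):
    """Multinomial probability of the haplotype counts given haplotype frequencies."""
    coef = factorial(sum(counts))
    prob = 1.0
    for n, p in zip(counts, freqs):
        coef //= factorial(n)
        prob *= p ** n
    return coef * prob

def compatible_configs(a, b, k):
    """Yield every (n00, n01, n10, n11) consistent with allele counts a and b
    observed among k chromosomes in one pool."""
    for n11 in range(max(0, a + b - k), min(a, b) + 1):
        yield (k - a - b + n11, b - n11, a - n11, n11)

def em_pooled(pools, k, n_iter=200, tol=1e-10):
    """Estimate the four haplotype frequencies from a list of (a, b) pool counts.

    Assumption: every pool contains exactly k chromosomes (k = 2 * pool size).
    """
    freqs = [0.25] * 4  # uniform starting values
    for _ in range(n_iter):
        # E-step: expected haplotype counts, averaging over compatible configurations.
        expected = [0.0] * 4
        for a, b in pools:
            configs = list(compatible_configs(a, b, k))
            weights = [multinomial_prob(c, freqs) for c in configs]
            total = sum(weights)
            for c, w in zip(configs, weights):
                for h in range(4):
                    expected[h] += c[h] * w / total
        # M-step: expected counts divided by the total number of chromosomes.
        new_freqs = [e / (k * len(pools)) for e in expected]
        converged = max(abs(x - y) for x, y in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs
```

For example, em_pooled([(1, 1), (2, 1), (0, 0)], k=4) treats the input as three pools of 2 individuals each. The number of compatible configurations per pool grows with k, which is one reason larger pools carry more haplotype ambiguity.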

Results: Pools of 2, 5, and 10 individuals yielded comparable accuracy in the estimation procedure. Estimates were less accurate when marker alleles were common and when there was no inter-marker LD. This pattern was observed regardless of the amount of genotyping error simulated.
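The abstract does not describe the error model used in the simulations. As an illustration only, the sketch below generates pooled allele counts under one simple assumed model: each allele call is flipped independently with probability error_rate before the pool counts are summed. The function simulate_pools and all of its parameters are hypothetical, and the two-marker, biallelic setup is likewise an assumption.

```python
import random

# Haplotype order: 00, 01, 10, 11 (marker A allele, marker B allele).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def simulate_pools(freqs, pool_size, n_pools, error_rate, seed=0):
    """Simulate (a, b) allele counts for n_pools pools of pool_size individuals.

    Assumed error model (not taken from the paper): each allele call is
    flipped independently with probability error_rate.
    """
    rng = random.Random(seed)
    k = 2 * pool_size  # chromosomes per pool
    pools = []
    for _ in range(n_pools):
        a = b = 0
        for hap in rng.choices(HAPLOTYPES, weights=freqs, k=k):
            observed = [x ^ 1 if rng.random() < error_rate else x for x in hap]
            a += observed[0]
            b += observed[1]
        pools.append((a, b))
    return pools
```

Calling simulate_pools([0.4, 0.1, 0.1, 0.4], pool_size=5, n_pools=100, error_rate=0.01) would produce 100 pools of 10 chromosomes each. Comparing frequencies estimated from such data with the generating frequencies, across pool sizes and error rates, is one way to gauge accuracy.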

Conclusion: Genotyping error slightly decreases the accuracy of haplotype frequency estimates. However, the EM algorithm performs well even in the presence of genotyping error. Overall, pools of 2, 5, and 10 individuals yield similarly accurate haplotype frequency estimates while reducing genotyping costs.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • DNA / genetics*
  • Diagnostic Errors*
  • Gene Frequency*
  • Genotype
  • Haplotypes*
  • Humans
  • Likelihood Functions
  • Sample Size

Substances

  • DNA