Inference of the haplotype effect in a matched case-control study using unphased genotype data

Int J Biostat. 2008 May 8;4(1):Article 6. doi: 10.2202/1557-4679.1079.

Abstract

Typically, locus-specific genotype data do not contain information regarding the gametic phase of haplotypes, especially when an individual is heterozygous at more than one locus among a large number of linked polymorphic loci. Thus, studying disease-haplotype association using unphased genotype data is essentially a problem of handling a missing covariate in a case-control design. There are several methods for estimating a disease-haplotype association parameter in a matched case-control study. Here we propose a conditional likelihood approach for inference regarding the disease-haplotype association using unphased genotype data arising from a matched case-control study design. The proposed method relies on a logistic disease risk model and assumes Hardy-Weinberg equilibrium (HWE) in the control population only. We develop an expectation and conditional maximization (ECM) algorithm for jointly estimating the haplotype frequencies and the disease-haplotype association parameter(s). We apply the proposed method to data from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study and from a matched case-control study of breast cancer patients conducted in Israel. The performance of the proposed method is evaluated via simulation studies.
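To make the missing-phase problem concrete, the following is a minimal sketch of the classical EM algorithm for haplotype frequencies under HWE, using unphased genotypes at biallelic loci. It is not the ECM algorithm of the paper: it ignores the logistic disease model, the matching, and the conditional likelihood, and it illustrates only how phase ambiguity is averaged out. The function names (`compatible_pairs`, `em_haplotype_freqs`) and the genotype coding (count of the variant allele, 0, 1, or 2, at each locus) are assumptions made for illustration.

```python
from itertools import combinations_with_replacement, product


def compatible_pairs(genotype):
    """Return all unordered haplotype pairs consistent with an unphased genotype.

    genotype: tuple of variant-allele counts (0, 1, or 2), one per locus.
    A haplotype is a tuple of 0/1 alleles, one per locus.
    """
    haps = list(product((0, 1), repeat=len(genotype)))
    return [
        (a, b)
        for a, b in combinations_with_replacement(haps, 2)
        if all(x + y == g for x, y, g in zip(a, b, genotype))
    ]


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming HWE (illustrative sketch)."""
    n_loci = len(genotypes[0])
    haps = list(product((0, 1), repeat=n_loci))
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting values

    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        # E-step: expected haplotype counts given current frequencies.
        for g in genotypes:
            pairs = compatible_pairs(g)
            # Under HWE, P(pair) = f_a * f_b, doubled when the two differ.
            weights = [
                (2.0 if a != b else 1.0) * freq[a] * freq[b] for a, b in pairs
            ]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: each individual carries two haplotypes.
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq
```

With two loci, only the double heterozygote `(1, 1)` is ambiguous: it may arise from the haplotype pair `(0, 0)`/`(1, 1)` or from `(0, 1)`/`(1, 0)`. The E-step splits such an individual between the two resolutions in proportion to their HWE probabilities. In the setting described in the abstract, a step of this kind would be combined with conditional maximization over the association parameter(s), with HWE assumed for controls only.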

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Anticarcinogenic Agents / administration & dosage
  • Biostatistics / methods*
  • Breast Neoplasms / genetics
  • Case-Control Studies
  • Female
  • Genotype*
  • Haplotypes*
  • Humans
  • Likelihood Functions
  • Logistic Models
  • Male
  • Neoplasms / prevention & control
  • alpha-Tocopherol / administration & dosage
  • beta Carotene / administration & dosage

Substances

  • Anticarcinogenic Agents
  • beta Carotene
  • alpha-Tocopherol