Trajectory clustering: a non-parametric method for grouping gene expression time courses, with applications to mammary development

Pac Symp Biocomput. 2003:351-62. doi: 10.1142/9789812776303_0033.

Abstract

Trajectory clustering is a novel, statistically well-founded method for clustering time series data from gene expression arrays. Because it uses non-parametric statistics, it is not sensitive to the particular distributions underlying gene expression data. Each cluster is defined by the direction of change in expression between successive time points (its 'trajectory') and therefore has a readily interpretable biological meaning. Applying the method to a dataset from mouse mammary gland development, we demonstrate that it produces clusters different from those of hierarchical, K-means, and jackknife clustering, even when those methods are applied to differences between successive time points. Compared with all of the other methods, trajectory clustering better matched a manual clustering by a domain expert and better grouped genes with known related functions.
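
The idea of defining a cluster by its sequence of up, down, or flat steps can be illustrated with a minimal Python sketch. The abstract does not state which non-parametric test the method uses, so the rule below is an assumption made only for illustration: a step counts as a change when the replicate values at the two time points are completely separated. The function names, the '+', '-', '0' encoding, and the data are all hypothetical.

```python
from collections import defaultdict


def step_direction(before, after):
    """Return '+', '-', or '0' for one pair of successive time points.

    Assumption (not from the paper): call a change only when the
    replicates are completely separated, i.e. every value at one time
    point exceeds every value at the other. This is a distribution-free,
    rank-based rule chosen for simplicity.
    """
    if min(after) > max(before):
        return "+"
    if max(after) < min(before):
        return "-"
    return "0"


def trajectory(series):
    """Encode a time course as a string of step directions.

    `series` is a list of replicate lists, one per time point.
    """
    return "".join(step_direction(a, b) for a, b in zip(series, series[1:]))


def cluster_by_trajectory(expression):
    """Group gene names that share the same trajectory string."""
    clusters = defaultdict(list)
    for gene, series in expression.items():
        clusters[trajectory(series)].append(gene)
    return dict(clusters)


# Hypothetical data: three replicates at each of four time points.
expression = {
    "geneA": [[1.0, 1.2, 1.1], [2.0, 2.3, 2.1], [2.1, 2.2, 2.0], [0.5, 0.6, 0.4]],
    "geneB": [[3.0, 3.1, 2.9], [4.0, 4.2, 4.1], [4.1, 3.9, 4.0], [1.0, 1.1, 0.9]],
    "geneC": [[5.0, 5.1, 5.2], [4.0, 4.1, 3.9], [3.0, 3.1, 2.9], [2.0, 2.1, 1.9]],
}
clusters = cluster_by_trajectory(expression)
```

In this sketch geneA and geneB fall into the same cluster even though their absolute expression levels differ, because only the direction of each step matters. A formal rank test such as Mann-Whitney U could be substituted for the separation rule without changing the grouping logic.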

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Cluster Analysis
  • Female
  • Gene Expression Profiling / statistics & numerical data*
  • Mammary Glands, Animal / embryology
  • Mammary Glands, Animal / growth & development*
  • Mammary Glands, Animal / metabolism*
  • Mice
  • Models, Biological
  • Models, Genetic
  • Pregnancy
  • Statistics, Nonparametric