A robust meta-classification strategy for cancer diagnosis from gene expression data

Proc IEEE Comput Syst Bioinform Conf. 2005:322-5. doi: 10.1109/csb.2005.7.

Abstract

One of the major challenges in cancer diagnosis from microarray data is to develop robust classification models that are independent of the analysis techniques used and can combine data from different laboratories. We propose a meta-classification scheme which uses a robust multivariate gene selection procedure and integrates the results of several machine learning tools trained on raw and pattern data. We validate our method by applying it to distinguish diffuse large B-cell lymphoma (DLBCL) from follicular lymphoma (FL) on two independent datasets: the HuGeneFL Affymetrix dataset of Shipp et al. (www.genome.wi.mit.edu/MPR/lymphoma) and the Hu95Av2 Affymetrix dataset (Dalla-Favera's laboratory, Columbia University). Our meta-classification technique achieves higher predictive accuracies than each of the individual classifiers trained on the same dataset and is robust against various data perturbations. We also find that combinations of p53 responsive genes (e.g., p53, PLK1 and CDK2) are highly predictive of the phenotype.
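The abstract does not specify how the outputs of the individual classifiers are integrated. As a rough illustration of the general idea only, the sketch below combines several base classifiers by simple majority vote. The voting rule, the nearest-centroid base classifier, the function names (`train_nearest_centroid`, `meta_predict`) and the toy expression values are all assumptions made for illustration, not the authors' procedure or data.

```python
from collections import Counter


def train_nearest_centroid(X, y, genes):
    """Train a nearest-centroid classifier restricted to the given gene indices.

    Hypothetical base learner: stands in for the "several machine learning
    tools" mentioned in the abstract, which are not named there.
    """
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Mean expression of each selected gene within this class.
        centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]

    def predict(x):
        # Assign the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda lab: sum((x[g] - c) ** 2 for g, c in zip(genes, centroids[lab])),
        )

    return predict


def meta_predict(classifiers, x):
    """Combine base predictions by majority vote (an assumed integration rule)."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy, made-up expression values for three genes (not real microarray data).
X = [[5.0, 1.0, 3.0], [4.5, 1.2, 2.8], [1.0, 4.0, 3.1], [1.2, 4.4, 2.9]]
y = ["DLBCL", "DLBCL", "FL", "FL"]

# Three base classifiers, each seeing a different subset of genes.
classifiers = [
    train_nearest_centroid(X, y, [0]),
    train_nearest_centroid(X, y, [1]),
    train_nearest_centroid(X, y, [0, 1, 2]),
]

sample = [4.8, 3.9, 3.0]
print(meta_predict(classifiers, sample))
```

With an odd number of base classifiers and two classes, the vote cannot tie. In this toy case the classifier that sees only the second gene disagrees with the other two, and the combined prediction follows the majority, which is the kind of robustness to a single misleading model that a meta-classifier aims for.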

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Biomarkers, Tumor / analysis*
  • Diagnosis, Computer-Assisted / methods*
  • Discriminant Analysis
  • Gene Expression Profiling / methods*
  • Humans
  • Meta-Analysis as Topic*
  • Neoplasm Proteins / analysis*
  • Neoplasms / classification
  • Neoplasms / diagnosis*
  • Neoplasms / metabolism*
  • Oligonucleotide Array Sequence Analysis / methods
  • Pattern Recognition, Automated / methods
  • Reproducibility of Results
  • Sensitivity and Specificity

Substances

  • Biomarkers, Tumor
  • Neoplasm Proteins