MtPAN(3): site-class specific amino acid replacement matrices for mitochondrial proteins of Pancrustacea and Collembola

Mol Phylogenet Evol. 2014 Jun:75:239-44. doi: 10.1016/j.ympev.2014.02.001. Epub 2014 Feb 10.

Abstract

Phylogenetic analyses of Pancrustacea have generally relied on empirical models of amino acid substitution that are estimated from large reference datasets and applied to the entire alignment. More recently, following the observation that different sites, or groups of sites, may evolve under different evolutionary constraints, methods have been developed to handle site-specific or site-class-specific models. Here, a set of three matrices was developed from an alignment of complete pancrustacean mitochondrial genomes, partitioned using an unsupervised clustering procedure acting on per-site physicochemical properties. The performance of the proposed matrix set, named MtPAN(3), was compared with that of relevant single-matrix models (MtZOA, MtART, MtPAN) under maximum likelihood (ML) and Bayesian inference (BI). Although the new model does not resolve some of the topological problems frequently encountered in pancrustacean mitogenomic phylogenetic analyses, MtPAN(3) largely outperforms its competitors based on the Akaike information criterion (AIC) and Bayes factors, indicating a significantly improved fit to the empirical data. The applicability of the new model, and of multiple-matrix models in general, is discussed, and an R/BioPerl script that implements the procedure is provided.
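
The partitioning step described above (k-means clustering of alignment sites on per-site properties) can be illustrated with a minimal Python sketch. This is not the authors' R/BioPerl script. The two per-site features used here (mean Kyte-Doolittle hydropathy and the number of distinct residues in a column), the toy alignment, the function names, and the deterministic farthest-point initialization are all assumptions made for illustration; the abstract does not state which physicochemical properties were used.

```python
from math import dist

# Kyte-Doolittle hydropathy scale. ASSUMPTION: an illustrative property only;
# the abstract does not say which physicochemical properties were clustered.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def site_features(alignment):
    """Return one (mean hydropathy, distinct residue count) pair per column."""
    features = []
    for column in zip(*alignment):
        # Ignore gaps and ambiguity codes when summarising a column.
        residues = [aa for aa in column if aa in HYDROPATHY]
        if not residues:
            features.append((0.0, 0.0))
            continue
        mean_h = sum(HYDROPATHY[aa] for aa in residues) / len(residues)
        features.append((mean_h, float(len(set(residues)))))
    return features


def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Deterministic farthest-point initialisation (hypothetical choice,
    # used here so the example is reproducible without a random seed).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        # Assignment step: each site joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


# Toy alignment (hypothetical): three sequences, six aligned sites.
alignment = ["IVLDKS", "ILLEKT", "VVIDRG"]
labels = kmeans(site_features(alignment), k=3)
print(labels)  # one site-class label per alignment column
```

In this sketch each resulting site class would then receive its own replacement matrix, which is the idea behind a three-matrix set. On real data the features would normally be standardized before clustering so that no single property dominates the distance.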

Keywords: Amino acid matrices; Collembola; MtPAN(3); Pancrustacea; k-Means clustering.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Substitution
  • Animals
  • Arthropods / classification*
  • Arthropods / genetics
  • Bayes Theorem
  • Cluster Analysis
  • Computational Biology
  • Genome, Mitochondrial*
  • Likelihood Functions
  • Mitochondrial Proteins / genetics*
  • Models, Genetic*
  • Phylogeny
  • Sequence Alignment
  • Sequence Analysis, DNA

Substances

  • Mitochondrial Proteins