Phylogenetic analyses of Pancrustacea have generally relied on empirical models of amino acid substitution estimated from large reference datasets and applied to the entire alignment. More recently, following the observation that different sites, or groups of sites, may evolve under different evolutionary constraints, methods have been developed to accommodate site-specific or site-class-specific models. Here, a set of three matrices was developed from an alignment of complete pancrustacean mitochondrial genomes, partitioned using an unsupervised clustering procedure applied to per-site physicochemical properties. The performance of the proposed matrix set, named MtPAN(3), was compared with that of relevant single-matrix models (MtZOA, MtART, MtPAN) under maximum likelihood (ML) and Bayesian inference (BI). Although the new model does not resolve some of the topological problems frequently encountered in pancrustacean mitogenomic phylogenetic analyses, MtPAN(3) largely outperforms its competitors based on AIC and Bayes factors, indicating a significantly improved fit to the empirical data. The applicability of the new model, and of multiple-matrix models in general, is discussed, and an R/BioPerl script that implements the procedure is provided.
Keywords: Amino acid matrices; Collembola; MtPAN(3); Pancrustacea; k-Means clustering.
Copyright © 2014 Elsevier Inc. All rights reserved.
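
The partitioning step summarized in the abstract, in which alignment sites are grouped by unsupervised (k-means) clustering of per-site physicochemical properties, can be illustrated with a minimal Python sketch. This is not the R/BioPerl script provided with the paper. The choice of properties (mean and spread of the Kyte-Doolittle hydropathy index per column), the function names `site_features`, `kmeans` and `partition_sites`, and the toy alignment are all assumptions made for illustration; only the use of three clusters follows the text.

```python
import random
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index. Using hydropathy as the per-site
# property is an assumption of this sketch, not the paper's property set.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def site_features(alignment):
    """Return one (mean, spread) hydropathy vector per alignment column."""
    features = []
    for column in zip(*alignment):
        # Gaps and ambiguity codes are skipped.
        values = [HYDROPATHY[aa] for aa in column if aa in HYDROPATHY]
        if values:
            features.append((mean(values), pstdev(values)))
        else:
            features.append((0.0, 0.0))
    return features


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=3, max_iter=100, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    distinct = sorted(set(points))
    if len(distinct) < k:
        raise ValueError("need at least k distinct feature vectors")
    # Initial centroids are k distinct observed points (hypothetical choice).
    centroids = random.Random(seed).sample(distinct, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each site goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


def partition_sites(labels):
    """Map each cluster label to the list of column indices it contains."""
    partitions = {}
    for site, label in enumerate(labels):
        partitions.setdefault(label, []).append(site)
    return partitions


if __name__ == "__main__":
    # Toy alignment (hypothetical data): four sequences, six columns.
    toy = ["IIRRIR", "IIRRRI", "IIRRIR", "IIRRRI"]
    print(partition_sites(kmeans(site_features(toy), k=3)))
```

In a workflow of the kind the abstract describes, each resulting group of columns would then be used to estimate its own substitution matrix; that estimation step is outside the scope of this sketch.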