Exploring phylogenomic relationships within Myriapoda: should high matrix occupancy be the goal?


Myriapods are one of the dominant terrestrial arthropod groups, including the diverse and familiar centipedes and millipedes. Although molecular evidence has shown that Myriapoda is monophyletic, its internal phylogeny remains contentious and understudied, especially when compared to those of Chelicerata and Hexapoda. Until now, efforts have focused on taxon sampling (e.g., by including a handful of genes in many species) or on maximizing matrix occupancy (e.g., by including hundreds or thousands of genes in just a few species), but a phylogeny maximizing sampling at both levels remains elusive. In this study, we analyzed forty Illumina transcriptomes representing three myriapod classes (Diplopoda, Chilopoda and Symphyla); twenty-five transcriptomes were newly sequenced to maximize representation at the ordinal level in Diplopoda and at the family level in Chilopoda. Eight supermatrices were constructed to explore the effect of several potential phylogenetic biases (e.g., rate of evolution, heterotachy) at three levels of mean gene occupancy per taxon (50%, 75% and 90%). Analyses based on maximum likelihood and Bayesian mixture models retrieved monophyly of each myriapod class and yielded two alternative phylogenetic positions for Symphyla: as sister group to Diplopoda + Chilopoda, or closer to Diplopoda, the latter hypothesis having been traditionally supported by morphology. Within centipedes, all orders were well supported, but two nodes remained in conflict among the different analyses despite dense taxon sampling at the family level; a subset of analyses placed the order Scolopendromorpha as sister group to a morphologically anomalous grouping of Lithobiomorpha + Geophilomorpha.
Interestingly, this anomalous result was obtained in all analyses conducted with the most complete matrix (90% occupancy), putting it at odds not only with the sparser but more gene-rich supermatrices (75% and 50% occupancy) and with the matrices optimizing phylogenetic informativeness and the most conserved genes, but also with previous hypotheses based on morphology, development or other molecular data sets. We discuss the implications of these findings in the context of the ever more prevalent quest for completeness in phylogenomic studies.
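
To make the trade-off between occupancy and gene number concrete, the following is a minimal sketch of how occupancy thresholds shrink a gene set. It is not the pipeline used in the study: the function names, the toy taxa and genes, and the definition of a gene's occupancy as the fraction of taxa in which it is present are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def gene_occupancy(presence: Dict[str, Set[str]], taxa: List[str]) -> Dict[str, float]:
    """Fraction of the given taxa in which each gene is present (assumed definition)."""
    taxon_set = set(taxa)
    return {gene: len(found & taxon_set) / len(taxa) for gene, found in presence.items()}


def filter_genes(presence: Dict[str, Set[str]], taxa: List[str], min_occupancy: float) -> List[str]:
    """Keep genes whose occupancy is at least min_occupancy."""
    occupancy = gene_occupancy(presence, taxa)
    return sorted(g for g, frac in occupancy.items() if frac >= min_occupancy)


def mean_occupancy_per_taxon(presence: Dict[str, Set[str]], taxa: List[str], genes: List[str]) -> float:
    """Average, over taxa, of the fraction of selected genes each taxon has."""
    if not genes:
        return 0.0
    per_taxon = [sum(1 for g in genes if t in presence[g]) / len(genes) for t in taxa]
    return sum(per_taxon) / len(per_taxon)


# Hypothetical toy data: four taxa, four genes with decreasing completeness.
taxa = ["taxon_A", "taxon_B", "taxon_C", "taxon_D"]
presence = {
    "gene1": {"taxon_A", "taxon_B", "taxon_C", "taxon_D"},  # present in 4 of 4 taxa
    "gene2": {"taxon_A", "taxon_B", "taxon_C"},             # present in 3 of 4 taxa
    "gene3": {"taxon_A", "taxon_B"},                        # present in 2 of 4 taxa
    "gene4": {"taxon_A"},                                   # present in 1 of 4 taxa
}

# The three thresholds mirror the occupancy levels named in the abstract.
for threshold in (0.50, 0.75, 0.90):
    kept = filter_genes(presence, taxa, threshold)
    mean_occ = mean_occupancy_per_taxon(presence, taxa, kept)
    print(f"min occupancy {threshold:.2f}: {len(kept)} genes, mean per-taxon occupancy {mean_occ:.3f}")
```

In this toy case, raising the threshold leaves fewer genes but a more complete matrix, which is the tension the abstract describes: the most complete matrix is also the one built from the least data.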


Last updated on 12/23/2015