from character congruence when all available
data are considered simultaneously. Alternative
approaches seek taxonomic congruence between
different data sets or data partitions, analyzed
separately (25). Either way, standard phylogenomic
practice entails some form of “majority rule,” based
on the assumption that as the amount of data
increases so will the probability of converging on
the correct species branching order [apart from
exceptional cases (26)]. Because of both ILS and
introgression, the historical species branching
order for the An. gambiae complex is represented
by only 1.9% of 50-kb windows across the entire
genome (Fig. 2). As a result, when we inferred a
ML tree for the An. gambiae complex on the
basis of whole-genome alignments, we recovered
the wrong species branching order supported by
100% of the bootstrap replicates at each node
(Fig. 1B, supplementary text S3, and figs. S17A
and S18A). The extent of autosomal introgression
in the An. gambiae complex has the paradoxical
effect that, as an increasing amount of the genome
is sampled, support for the incorrect species
branching order is maximized.
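The contrast between the per-window view and a "majority rule" summary can be illustrated with a minimal sketch. It assumes each genomic window has already been assigned a topology label by some upstream tree-inference step; the function names and the toy labels below are hypothetical and are not taken from the study.

```python
from collections import Counter


def topology_fractions(window_topologies):
    """Return {topology label: fraction of windows} for a list of labels."""
    if not window_topologies:
        raise ValueError("no windows supplied")
    counts = Counter(window_topologies)
    total = sum(counts.values())
    return {topo: n / total for topo, n in counts.items()}


def majority_topology(window_topologies):
    """Return the most frequent topology label across windows."""
    if not window_topologies:
        raise ValueError("no windows supplied")
    return Counter(window_topologies).most_common(1)[0][0]


# Toy data (hypothetical): one label per window.
windows = ["introgressed"] * 95 + ["other"] * 3 + ["species_tree"] * 2

fractions = topology_fractions(windows)
print(majority_topology(windows))   # most common label across windows
print(fractions["species_tree"])    # share of windows with the rarer label
```

In this toy case the most frequent topology is not the one labeled as the species tree, which is the situation described above: adding more windows only reinforces the majority label.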
Autosomal permeability of
Early cytotaxonomic evidence (10, 12, 27), as well
as more recent ribosomal DNA–based evidence
(28, 29), supports rare occurrences (<0.1%) of natural female F1 hybrids between An. arabiensis–
An. gambiae + An. coluzzii, An. arabiensis–An.
quadriannulatus, and An. melas–An. gambiae +
An. coluzzii, although there are no reports of hybrids
involving An. merus. The evolutionary importance
of these rare hybrids as bridges to interspecific
gene flow has remained controversial. Inference
of the correct species branching order for the An.
gambiae complex (Fig. 1B) allowed a systematic
analysis of introgression across the genomes of
six members of this complex using the D (1, 4)
and DFOIL (30) statistics (supplementary text S4).
As expected, such tests revealed pervasive introgression across all autosomes between An. arabiensis
and the ancestor of An. gambiae + An. coluzzii
(figs. S24 and S25). Although introgression was
detected in both directions, the majority involved
genetic transfer from An. arabiensis into the ancestor of An. gambiae + An. coluzzii. This recent
and massive episode of introgression impedes our
ability to detect older introgression events between these species. Unexpectedly, we also found
evidence of extensive autosomal introgression
between another species pair, An. merus and An.
quadriannulatus (Fig. 4, supplementary text S3,
and figs. S24 and S25). One of the most striking
of the introgressed regions was a contiguous
block of genes coincident with the ~22-Mb 3La
chromosomal inversion (31). The corresponding
sequence originally present in ancestral populations of An. quadriannulatus has been entirely
replaced by its counterpart from An. merus, a
conclusion supported by the clustering of An.
merus with An. quadriannulatus in gene trees
constructed from sequences in the 3La inversion
(figs. S19, C and D, and S21B). Extant populations
of both of these species, and indeed all recognized species in the An. gambiae complex, are
fixed for the standard (3L+a) orientation except
An. melas and its putative sister species An.
bwambae, both fixed for the 3La orientation (31).
Considering that the exact ~22-Mb 3La region
was replaced between species whose contemporary
populations are collinear for 3L+a, it is conceivable
that ancestral An. quadriannulatus populations
originally carried 3La before the 3L+a
introgression. The expected reduced recombination between
1258524-4 2 JANUARY 2015 • VOL 347 ISSUE 6217 sciencemag.org SCIENCE
Fig. 4. Introgression between An. merus and An. quadriannulatus. Chromoplots for all five chromosomal
arms show a highly spatially heterogeneous distribution of phylogenies inferred from 50-kb genomic regions,
particularly on 3L. The three possible rooted phylogenetic relationships for An. quadriannulatus (Q), An. melas
(L), and An. merus (R), with out-group An. christyi, are shown, colored according to the key in the lower right.
The region on 3L corresponding to the 3La inversion shows strong evidence of R-Q introgression and a
strong negative deviation of the D statistic. The region on 2L corresponding to the 2La inversion is highly
enriched for the R(LQ) relationship, as expected, given that L and Q both have the 2L+a orientation, whereas
R has 2La (see Fig. 5). Across all the autosomes, the D statistic generally trends toward negative values,
which indicates that weak or ancient R-Q introgression may have occurred across the autosomes (see supplementary text S3).
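For readers unfamiliar with the D statistic referred to in this caption, the following is a minimal sketch of the widely used ABBA-BABA site-count form, D = (ABBA − BABA) / (ABBA + BABA), for an assumed tree (((P1, P2), P3), O). It is not the authors' pipeline (the supplementary text describes their analysis); the function name and toy sequences are hypothetical. Under this convention, positive values reflect excess allele sharing between P2 and P3 and negative values between P1 and P3, so the sign depends on which taxa are assigned to P1 and P2.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Patterson's D from four aligned sequences, assuming (((P1,P2),P3),O).

    The out-group allele is treated as ancestral (A); any other allele is
    derived (B). Sites with gaps or ambiguous bases are skipped.
    Returns None when no ABBA or BABA sites are found.
    """
    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        if not {a, b, c, o} <= valid:
            continue
        if a == o and b == c and b != o:      # pattern A B B A
            abba += 1
        elif b == o and a == c and a != o:    # pattern B A B A
            baba += 1
    if abba + baba == 0:
        return None
    return (abba - baba) / (abba + baba)


# Toy alignment (hypothetical): two ABBA sites and one BABA site.
p1 = "ACGTAG"
p2 = "ATGTCA"
p3 = "ATGACG"
og = "ACGTAA"

d = d_statistic(p1, p2, p3, og)
print(d)
```

Swapping the first two arguments flips the sign of the result, which is why the taxon ordering must be stated when interpreting positive or negative D values.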
[Fig. 5 labels: 2La/+ haplotype divergence, 1.98 Ma (±0.22); further dates 1.85 Ma (±0.47) and 0.89 Ma (±0.45); panels "2La phylogeny" and "2L+a phylogeny"; taxa mer, mel, qua, ara, col, gam.]
Fig. 5. Ancient trans-specific polymorphism of an inversion predates radiation of the An. gambiae
complex. (A) All species in the complex are fixed for either the 2La (An. arabiensis, ara; An. merus, mer)
or 2L+a (An. melas, mel) orientation of the 2La rearrangement except An. gambiae (gam) and An.
coluzzii (col), which remain polymorphic for both orientations. The unique phylogenies observed in the
2La region (Fig. 2) are the result of differential loss (×) of the 2La and 2L+a orientations and the introgression of 2La from the ancestral population of An. gambiae + An. coluzzii into An. arabiensis (dotted
arrow). The overall divergence between the 2La and 2L+a orientations inferred from sequences inside
the inversion breakpoints is higher, on average, than the predicted divergence time of the species complex. (B) ML phylogeny inferred from sequences of the two different orientations of the 2La region (2La
and 2L+a) shows that the divergence between opposite orientations is greater than the divergence
between the same orientation present in different species (all nodes, 100% bootstrap support). Particularly notable is the separation of An. gambiae-2L+a and An. coluzzii-2La, as these are
known sister taxa. The scale bar denotes nucleotide divergence as calculated by RAxML.