Figure 1—figure supplement 1.
The amount of sequence data populating the BigWenDB is shown together with its phylogenetic distribution. The coloured bars at the perimeter (red, green, blue) document the contributions of three different sequence sources (bar height is proportional to the number of sequences; see ruler at top left): (1) sequences from 204 opisthokonts (animals, choanoflagellates, and fungi) with >8000 entries in the NCBI database (downloaded on May 25, 2015; red); (2) sequences derived from the transcriptomes of 64 species under-represented at NCBI (non-bilaterian animals, lophotrochozoans, and representatives of additional phyla; green); and (3) ORFs derived from the genome sequences of 25 representative metazoans, including 8 non-bilaterian species (blue). In total, 124,031,501 sequences from 273 species cover the eukaryotic tree of life in the most comprehensive way so far (see text for details). Phylogenetic relationships follow the NCBI taxonomy.