Important changes to data availability in Ensembl gene trees and BioMart

From Ensembl 103 onwards, we will be limiting the number of species represented in our gene trees and in the BioMart data export tool. These changes are needed so that we can continue to support the rapid growth in the number of available species effectively.

A common set of 200 species will be used to construct the gene trees and will have data available for download from BioMart. The list of species can be found on the following GitHub page. These species represent the most frequently accessed and highest quality datasets across the breadth of the taxonomic space within Ensembl.

As part of our strategy to support an increasing number of species, new genome assemblies and annotations are being continuously released via the Ensembl Rapid Release site, which has presented annotation for 185 genomes since launching in June 2020. We plan to make initial homology data available for species on Rapid Release in the next few months and to gradually increase the types of comparative data we provide as we redesign our pipelines over the next two years.

In the meantime, BioMart and the previously constructed gene trees will remain available through the archive sites, each covering all species that were available at the time of the corresponding release. Otherwise, you may be able to download the data you need from our FTP site or, more flexibly, from our APIs.
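As an illustration of the API route, the sketch below retrieves a gene tree as JSON from the public Ensembl REST server using its genetree/id/:id endpoint. It is a minimal example under stated assumptions rather than a recommended workflow: the helper names genetree_url and fetch_genetree are hypothetical, the tree identifier in the usage note is purely illustrative, and which species appear in the returned tree depends on the release that the server is running.

```python
import json
import urllib.parse
import urllib.request

# Public Ensembl REST server. It serves the current release, so the species
# covered by a gene tree will follow the changes described above.
REST_SERVER = "https://rest.ensembl.org"


def genetree_url(tree_id: str, server: str = REST_SERVER) -> str:
    """Build the URL for the REST endpoint GET genetree/id/:id (hypothetical helper)."""
    # The response format is requested through the content-type query parameter.
    query = urllib.parse.urlencode({"content-type": "application/json"})
    return f"{server.rstrip('/')}/genetree/id/{urllib.parse.quote(tree_id)}?{query}"


def fetch_genetree(tree_id: str, server: str = REST_SERVER, timeout: float = 30.0) -> dict:
    """Download one gene tree and return the decoded JSON document (hypothetical helper)."""
    with urllib.request.urlopen(genetree_url(tree_id, server), timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_genetree("ENSGT00390000003602") would request that tree over the network, assuming the identifier exists in the release being served; genetree_url can be used on its own to inspect the request before sending it.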

If you have any questions relating to the upcoming changes in data availability, please do not hesitate to contact our helpdesk.