Ensembl 104 and Ensembl Genomes 51 are out! This release features updates to human and mouse genes, GRCh37 variation and regulation, new assemblies and variation for vertebrates, new plant species and a large update of the available metazoa data. We also said bye-bye to clone-based gene names and welcomed the new Ensembl Canonical transcripts.
Major data updates for human
Ensembl 104 brings an update of the human gene set to GENCODE 38. We’ve also got a major upgrade of the Ensembl Canonical transcript definition, particularly affecting the human Canonical transcript selection. The new Ensembl Canonical prioritises well-supported, biologically representative, highly expressed and highly conserved transcripts. MANE Select will be used as the canonical transcript for human protein-coding genes where available. The Ensembl Canonical can be found at the top of the transcript table and is easily searchable with BioMart’s brand new filter, ‘Ensembl Canonical’. We’ve also updated the GTF and GFF3 files on our FTP site to include these transcript flags. Importantly, these changes don’t affect the previous human genome assembly, GRCh37, and you can learn more about them here. There are more treats coming your way, as GRCh37 has received a revamp of its regulatory build and variation data, including the latest data from dbSNP build 154, ClinVar and COSMIC.
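If you work from the GTF files rather than BioMart, the canonical flag can be picked out with a few lines of Python. The snippet below is a minimal sketch, not an official Ensembl tool: it assumes the flag appears in the attribute column as `tag "Ensembl_canonical";` on transcript rows, so check a few lines of your downloaded file before relying on it. The function names and the sample IDs (`G1`, `T1`, `T2`) are made up for illustration.

```python
import re

# Assumed tag value; confirm it against the attribute column of your GTF file.
CANONICAL_TAG = "Ensembl_canonical"

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(field):
    """Parse the GTF attribute column into a dict of key -> list of values.

    A list is used because keys such as `tag` can occur several times.
    """
    attrs = {}
    for key, value in ATTR_RE.findall(field):
        attrs.setdefault(key, []).append(value)
    return attrs


def canonical_transcripts(lines):
    """Yield (gene_id, transcript_id) for transcript rows with the canonical tag."""
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # GTF rows have nine tab-separated columns; column 3 is the feature type.
        if len(cols) != 9 or cols[2] != "transcript":
            continue
        attrs = parse_attributes(cols[8])
        if CANONICAL_TAG in attrs.get("tag", []):
            yield attrs["gene_id"][0], attrs["transcript_id"][0]


# Tiny made-up example in GTF layout (IDs are placeholders, not real records).
sample = [
    "#!example header line\n",
    '1\thavana\ttranscript\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tag "basic"; tag "Ensembl_canonical";\n',
    '1\thavana\ttranscript\t100\t180\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; tag "basic";\n',
    '1\thavana\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tag "Ensembl_canonical";\n',
]

for gene_id, transcript_id in canonical_transcripts(sample):
    print(gene_id, transcript_id)
```

For a real file, pass an open file handle instead of `sample`, for example `canonical_transcripts(gzip.open(path, "rt"))` for a gzipped download.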
Other vertebrates
We’ve got new variation data and new assemblies for several existing species; each new assembly also comes with an updated gene set.
New assemblies and updated genes:
- Anole lizard (Anolis carolinensis)
- Turkey (Meleagris gallopavo)
- Collared flycatcher (Ficedula albicollis)
- Turbot (Scophthalmus maximus)
Updated genes:
- Update of mouse (Mus musculus) genes to GENCODE M27
New variation:
- Nile tilapia (Oreochromis niloticus)
- American mink (Neovison vison)
- Great tit (Parus major)
- Rabbit (Oryctolagus cuniculus)
Plants
You will find a few newcomers among the plant species, including walnut and sesame, as well as new barley and potato cultivars. We’ve got some updates for the familiar old-timers too.
New species/cultivars:
- Persian walnut (Juglans regia)
- Sesame (Sesamum indicum)
- TRITEX barley (Hordeum vulgare cv. Morex) assembly
- Diploid potato (Solanum tuberosum) cultivar rh8903916
New assemblies and updated genes:
Other updates:
- Moss species Physcomitrella patens was renamed Physcomitrium patens
- Updated cross-references for Arabidopsis thaliana
Metazoa
We’ve got a major data update for 43 non-vertebrate animals, including updates to the community-based gene annotations from release 49 of VEuPathDB and a recalculation of the variant effects for the updated genes across all species with variation data.
Other updates
- Human, mouse, rat and zebrafish clone-based gene names were replaced by Ensembl stable IDs, in line with practices adopted for all other vertebrate species. You can find out more about it here.
- The VEP REST response has been updated for SpliceAI and DisGeNET to improve clarity. This change is not backwards compatible.
- Retirement of Ensembl 84 archive site
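The switch from clone-based gene names to stable IDs can trip up scripts that use the gene name as a label or dictionary key. The snippet below is a small sketch of a defensive fallback. It assumes that affected genes simply have no (or an empty) `gene_name` in your parsed annotation; `gene_label` is a hypothetical helper and the IDs shown are placeholders, not real records.

```python
def gene_label(attrs):
    """Return the gene name if one is present, otherwise the stable gene ID.

    `attrs` is a plain dict of annotation attributes for one gene. The
    `gene_name` key is assumed to be missing or empty for genes that
    previously carried a clone-based name.
    """
    name = attrs.get("gene_name")
    return name if name else attrs["gene_id"]


# Placeholder records for illustration only.
named = {"gene_id": "GENE_ID_1", "gene_name": "SYMBOL1"}
unnamed = {"gene_id": "GENE_ID_2"}

print(gene_label(named))    # falls through to the name
print(gene_label(unnamed))  # falls back to the stable ID
```

Keying downstream tables on the stable ID and using the name only for display avoids the problem altogether.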