Ensembl 90 is now live and it’s absolutely massive! Read on to find out why:
New species and annotation
Ensembl 90 is our biggest release ever in terms of species.
We’ve got 19 new or updated rodent genomes. Of these, sixteen were annotated with our new clade-based system, which makes use of the similarity between species’ genomes to automatically annotate genes onto the homologous regions:
- Kangaroo rat (update)
- Guinea pig (update)
- Golden hamster
- Damara mole rat
- Brazilian Guinea pig
- Algerian mouse
- Lesser Egyptian jerboa
- Prairie vole
- Naked mole rat (new; two assemblies available, one male and one female)
- Northern American deer mouse
- Long-tailed chinchilla
- Upper Galilee mountains blind mole rat
- Squirrel (update)
- Chinese hamster
We have also imported three new rodent genomes and their annotation:
- Ryukyu mouse – annotated by the UCSC Comparative Annotation Toolkit.
- Chinese hamster ovary – imported from Horizon Eagle.
- Shrew mouse – annotated by the UCSC Comparative Annotation Toolkit.
This is the first Ensembl release in which we’ve imported annotation from external resources, but our rigorous quality control makes us confident that these species’ annotation will meet the high standard expected of Ensembl genes. It’s also the first time we’ve supported more than one genome assembly per species (naked mole rat and Chinese hamster) in one Ensembl database, so you can continue to work with your preferred assembly within the Ensembl framework. We plan to continue importing high-quality gene sets, where available, and to use our quicker clade-based annotation, so expect many more new genomes in future Ensembl releases.
Aside from rodents, we’ve got a new pig genome assembly, Sscrofa11.1 from the Swine Genome Sequencing Consortium. The assembly was created from a single Duroc sow, named TJ Tabasco. The genome was annotated using species-specific RNA-Seq data, PacBio long reads and cDNAs, as well as proteins from related vertebrates.
We also have updates to our human, mouse and zebrafish gene sets. Human moves to GENCODE 27 on the genome patch version GRCh38.p10, with the latest updates from the Ensembl automatic and Havana manual annotation. Mouse moves to GENCODE M15, with the latest Ensembl and Havana genes. Zebrafish annotation incorporates new gene models from RNA-Seq and adds pri-miRNAs to the other features database.
We have updates to our variation data for human: COSMIC 81 somatic variants, HGMD 2016.4, dbSNP 150 and DGVa structural variants. For both our main database and our GRCh37 database, we now have allele frequencies from TOPMed (Trans-Omics for Precision Medicine) and UK10K, and, for pre-existing dbSNP variants, gnomAD. We also have DGVa structural variant updates for Cow, Dog and Mouse.
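If you want to look at population allele frequencies programmatically, one option is the Ensembl REST API's variation endpoint with its optional `pops` parameter. The sketch below only builds the request URL for either the main (GRCh38) or the GRCh37 server. The helper name `variation_url` and the example identifier `rs699` are our own illustrative choices, and which populations appear in a response depends on the variant and the release.

```python
from urllib.parse import urlencode

# Public Ensembl REST servers for the main and GRCh37 databases.
REST_SERVERS = {
    "GRCh38": "https://rest.ensembl.org",
    "GRCh37": "https://grch37.rest.ensembl.org",
}


def variation_url(rsid, assembly="GRCh38", species="human"):
    """Build a variation lookup URL that asks for population data (pops=1).

    Hypothetical helper for illustration; it does not send the request.
    """
    query = urlencode({"content-type": "application/json", "pops": 1})
    return f"{REST_SERVERS[assembly]}/variation/{species}/{rsid}?{query}"


# rs699 is an arbitrary example identifier, not one taken from the release notes.
url = variation_url("rs699", assembly="GRCh37")
print(url)
```

Fetching the URL with any HTTP client should return JSON describing the variant. Check the REST documentation for the exact response fields before relying on them.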
We have updated phenotype data in Human (NHGRI-EBI GWAS Catalog, OMIM and MIM morbid, ClinVar, Cosmic Gene Census, DDG2P and Orphanet), Mouse (IMPC, MGI), Cat, Chicken, Chimpanzee, Cow, Dog, Horse, Macaque, Pig, Rat (RGD), Sheep, Turkey and Zebrafish (ZFIN).
Microarray probe mapping
Ensembl provide probe mapping for a number of popular commercially available microarrays, mapping probesets to genomic loci and Ensembl genes. You can access the mappings to genes through the transcript pages in the browser and through BioMart. We have updated our probe mappings for:
- Ciona intestinalis
- Caenorhabditis elegans
- Mouse strains: 129S1/SvImJ, A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6NJ, CAST/EiJ, CBA/J, DBA/2J, FVB/NJ, LP/J, NOD/ShiLtJ, NZO/HlLtJ, PWK/PhJ, SPRET/EiJ and WSB/EiJ
- Saccharomyces cerevisiae
You’ll now be able to adjust the y-axis scale of custom wiggle tracks in the browser. We’ll be releasing a blog post about this soon.
File format updates
We’ve updated the sequence ontology terms in our GFF3 files to improve consistency and fix bugs. Read more in this blog post.
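If you have scripts that depend on particular feature types, a quick tally of the type column (column 3, which GFF3 reserves for sequence ontology terms) can show which terms a file uses. The following is a minimal sketch: the function name `count_feature_types` and the sample records are made up for illustration and are not taken from an Ensembl file.

```python
from collections import Counter


def count_feature_types(gff3_lines):
    """Count the values found in column 3 (feature type) of GFF3 lines."""
    counts = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives (##...) and comments (#...).
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(columns) != 9:
            continue
        counts[columns[2]] += 1
    return counts


# Invented sample records, used only to exercise the function.
sample = [
    "##gff-version 3",
    "1\texample\tgene\t100\t900\t.\t+\t.\tID=gene:G1",
    "1\texample\tmRNA\t100\t900\t.\t+\t.\tID=transcript:T1;Parent=gene:G1",
    "1\texample\texon\t100\t300\t.\t+\t.\tParent=transcript:T1",
    "1\texample\texon\t500\t900\t.\t+\t.\tParent=transcript:T1",
]
type_counts = count_feature_types(sample)
print(type_counts.most_common())
```

To run it on a real file, pass an open file handle instead of the list, for example `count_feature_types(open(path))`, where `path` is wherever you saved the GFF3 file.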
Find out more
We’ll be holding a release webinar on Wednesday 6th September at 4pm BST. Register here to learn more about the exciting updates to Ensembl and to put your questions to the team.