Ensembl 103 has been released

Squeak squeak, we’ve got a new release out 🐭. In Ensembl 103 we’ve got the latest mouse genome assembly, updates to human genes, variation and regulation, new and updated genomes for vertebrates, metazoa and plants, and a new tool for converting variant descriptions.

🐭  Mouse

The latest mouse genome from the Genome Reference Consortium, GRCm39, is now available in Ensembl. This means a change in chromosome coordinates, but it also means that 370 reported issues with the assembly have been resolved. If you have data on the old assembly, GRCm38, you can still access this via our release 102 archive site.
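If you need to move individual regions from GRCm38 to the new coordinates, the Ensembl REST API has an assembly mapping endpoint (`map/:species/:asm_one/:region/:asm_two`). The snippet below is a minimal sketch of how a request could be put together. It assumes that GRCm38 to GRCm39 mappings are served by the public REST server and that the species alias `mouse` is accepted; the helper names `build_map_url` and `map_region` are ours for illustration, not part of any Ensembl library.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def build_map_url(species, chromosome, start, end,
                  source="GRCm38", target="GRCm39", strand=1):
    """Build the assembly-mapping URL for one region (hypothetical helper)."""
    # Regions are written as chromosome:start..end:strand
    region = f"{chromosome}:{start}..{end}:{strand}"
    return f"{REST_SERVER}/map/{species}/{source}/{region}/{target}"


def map_region(species, chromosome, start, end):
    """Fetch the mapping as parsed JSON (needs network access)."""
    # Assumption: the server holds mappings between these two assemblies.
    request = urllib.request.Request(
        build_map_url(species, chromosome, start, end),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build the URL without contacting the server.
print(build_map_url("mouse", "1", 1000000, 1000100))
```

Calling `map_region("mouse", "1", 1000000, 1000100)` would send the request itself. For whole files, a dedicated conversion tool is likely a better fit than one request per region.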

The new assembly means lots of things have been updated, including mapping all our genes onto the new assembly and updating the annotation to GENCODE M26. We’ve also re-run our Murinae multiple whole genome alignment using EPO, which includes 16 Mus musculus strains, along with Steppe mouse, Algerian mouse, Ryukyu mouse, shrew mouse and rat.

😀  Human

We continue to develop our MANE (Matched Annotation from NCBI and EBI) collaboration with RefSeq to identify a single representative transcript for each protein-coding gene that matches 100% between the two databases, allowing consistent clinical reporting of genetic variants. To ensure all clinically important coding regions are covered by this initiative, we have added MANE Plus Clinical annotation for a small number of genes that have mutually exclusive exons containing phenotype-associated genetic variants.

The Variant Recoder tool facilitates variant data reuse by converting a variety of different variant descriptions and identifiers to VCF and other standard naming conventions. It now has an online interface, available for all species, to allow you to convert between different notations for variant positions. The output of the Variant Recoder REST API endpoints has also been altered for improved precision.
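For scripted use, a Variant Recoder lookup through the REST API might look like the sketch below. It uses the `variant_recoder/:species/:id` GET endpoint on the public REST server; the helper names `build_recoder_url` and `recode` are ours for illustration, and rs56116432 is just an example identifier. The structure of the returned JSON is not assumed here, since the output changed in this release, so check the REST documentation before relying on particular fields.

```python
import json
import urllib.parse
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def build_recoder_url(species, variant):
    """Build the Variant Recoder GET URL for one variant (hypothetical helper)."""
    # Percent-encode characters such as '>' in HGVS notation; ':' is kept as-is.
    encoded = urllib.parse.quote(variant, safe=":")
    return f"{REST_SERVER}/variant_recoder/{species}/{encoded}"


def recode(species, variant):
    """Fetch the recoded notations as parsed JSON (needs network access)."""
    request = urllib.request.Request(
        build_recoder_url(species, variant),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build the URL without contacting the server.
print(build_recoder_url("human", "rs56116432"))
```

Calling `recode("human", "rs56116432")` would send the request and return whatever the server reports for that variant.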

There’s also new variation data for human, including an update to dbSNP 154. If you’re interested in population frequencies, you’ll be pleased to see the addition of the GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel, made up of whole genome variant calling from 7609 Japanese individuals.

We’ve had the Mastermind Genomic Search Engine plugin available for the offline VEP for a while now, and have now integrated it into the online and REST interfaces. In web VEP, we provide links to this database which reports gene, variant, CNV, disease, phenotype, and therapy evidence from millions of scientific articles.

Finally, the Regulatory build, which annotates promoters, enhancers, etc. onto genomic regions based on experimental data, has been updated. This includes major additional annotation on the Y chromosome.

Other vertebrates

We’ve got new genomes for some of our existing species, which means we’ve updated all the genes. The updated species:

Metazoa

We’re excited to release the results of our collaboration with the African Cassava Whitefly Project, with six newly assembled and annotated whitefly genomes, collected from strains in Nigeria, Uganda and Asia:

These whitefly are a major pest in sub-Saharan Africa, affecting cassava, a staple food crop in the region. This project aims to increase sustainable productivity by understanding the biology of these pests and the causes of outbreaks and resistance. Another new species is the Greenhouse whitefly, Trialeurodes vaporariorum, included as a comparative species.

We have updated genomes for the Beadlet anemone (Actinia equina) and for 12 fruit flies of the Drosophila genus:

🌱 Plants

For the agricultural community, we have nine new wheat 🍞 cultivars, sequenced as part of the 10+ Genome Project, added to the pangenome set:

Other new genomes from your kitchen or medicine cabinet are:

There is also a change to the species included in our gene tree and homology analysis. We now include a maximum of 100 genomes in this analysis, and the selection may change with each release depending on the data available. You can find the list of included genomes in our repository.

Other

We will no longer be dumping data in RDF format to our FTP site.
