Squeak squeak, we’ve got a new release out 🐭. In Ensembl 103 we’ve got the latest mouse genome assembly, updates to human genes, variation and regulation, new and updated genomes for vertebrates, metazoa and plants and a new tool for converting variant descriptions.
The latest mouse genome from the Genome Reference Consortium, GRCm39, is now available in Ensembl. This means a change in chromosome coordinates, but it also means that 370 reported issues with the assembly have been resolved. If you have data on the old assembly, GRCm38, you can still access this via our release 102 archive site.
The new assembly means lots of things have been updated, including mapping all our genes onto the new assembly and updating the annotation to GENCODE M26. We’ve also re-run our murinae multiple whole genome alignment using EPO, which includes 16 Mus musculus strains, along with Steppe mouse, Algerian mouse, Ryukyu mouse, shrew mouse and rat.
We continue to develop our MANE (Matched Annotation from NCBI and EBI) collaboration with RefSeq to identify a single representative transcript for each protein coding gene, which matches 100% between the two databases, allowing consistent clinical reporting of genetic variants. To ensure all clinically important coding regions are covered by this initiative, we have added in MANE Plus Clinical annotation for a small number of genes which have mutually exclusive exons containing phenotype-associated genetic variants.
The Variant Recoder tool facilitates variant data reuse by converting a variety of different variant descriptions and identifiers to VCF and other standard naming conventions. It now has an online interface, available for all species, to allow you to convert between different notations for variant positions. The output of the Variant Recoder REST API endpoints has also been altered for improved precision.
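If you’d rather script this than use the web form, here is a minimal sketch of calling the Variant Recoder REST endpoint. It assumes the public server at https://rest.ensembl.org and the `variant_recoder/:species/:id` GET route; the helper names `variant_recoder_url` and `recode_variant`, and the example rsID, are ours for illustration and not part of the API. Since the output format has changed in this release, check the REST documentation before relying on particular fields in the response.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def variant_recoder_url(species, variant, server=REST_SERVER):
    """Build the GET URL for one variant description or identifier."""
    # Percent-encode both parts so HGVS characters such as ':' and '>' are URL-safe.
    return f"{server}/variant_recoder/{quote(species, safe='')}/{quote(variant, safe='')}"


def recode_variant(species, variant):
    """Fetch the alternative descriptions of a variant as parsed JSON (needs network access)."""
    request = urllib.request.Request(
        variant_recoder_url(species, variant),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `recode_variant("homo_sapiens", "rs56116432")` should return the equivalent descriptions of that variant as parsed JSON; we have not reproduced the output here because its exact shape depends on the release.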
There’s also new variation data for human, including an update to dbSNP 154. If you’re interested in population frequencies, you’ll be pleased to see the addition of the GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel, made up of whole genome variant calling from 7609 Japanese individuals.
The Mastermind Genomic Search Engine plugin has been available for the offline VEP for a while, and we have now integrated it into the online and REST interfaces. In web VEP, we provide links to this database, which reports gene, variant, CNV, disease, phenotype and therapy evidence from millions of scientific articles.
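For the REST interface, a request might look like the sketch below. It assumes the `vep/:species/id/:id` GET route on https://rest.ensembl.org and that the plugin is switched on with an optional `mastermind=1` query parameter; that parameter name is our reading of the REST documentation, so please confirm it there. The helper names `vep_id_url` and `fetch_vep` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Assumption: the public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def vep_id_url(species, variant_id, mastermind=True, server=REST_SERVER):
    """Build a VEP GET URL for a variant identifier, optionally requesting Mastermind data."""
    path = f"{server}/vep/{quote(species, safe='')}/id/{quote(variant_id, safe='')}"
    # Assumption: 'mastermind' is the optional parameter that enables the plugin.
    params = {"mastermind": 1} if mastermind else {}
    return f"{path}?{urlencode(params)}" if params else path


def fetch_vep(species, variant_id, mastermind=True):
    """Run VEP on one variant identifier and return the parsed JSON (needs network access)."""
    request = urllib.request.Request(
        vep_id_url(species, variant_id, mastermind),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Keeping the URL construction separate from the network call makes it easy to check the request you are about to send before running it against the live server.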
Finally, the Regulatory build, which annotates promoters, enhancers and other regulatory features onto genomic regions based on experimental data, has been updated. This includes major additional annotation on the Y chromosome.
We’ve got new genomes for some of our existing species, which means we’ve updated all the genes. The updated species are:
- 🐼 Panda, Ailuropoda melanoleuca
- 🐟 Cod, Gadus morhua
- 🐟 Climbing perch, Anabas testudineus
- 🐟 Zig-zag eel, Mastacembelus armatus
- 🐵 Crab-eating macaque, Macaca fascicularis
We’re excited to release the results of our collaboration with the African Cassava Whitefly Project, with six newly assembled and annotated whitefly genomes, collected from strains in Nigeria, Uganda and Asia:
- Bemisia tabaci AsiaII-5
- Bemisia tabaci SSA1-SG1 Nig
- Bemisia tabaci SSA1-SG1 Ug
- Bemisia tabaci SSA2 Nig
- Bemisia tabaci SSA3 Nig
- Bemisia tabaci Sweetpotato Ug
These whiteflies are a major pest in sub-Saharan Africa, affecting cassava, a staple food crop in the region. This project aims to increase sustainable productivity by understanding the biology of these pests and the causes of outbreaks and resistance. Another new species is the Greenhouse whitefly, Trialeurodes vaporariorum, included as a comparative species.
We have updated genomes for the Beadlet anemone (Actinia equina) and for 12 fruit flies of the Drosophila genus:
- Drosophila ananassae
- Drosophila erecta
- Drosophila grimshawi
- Drosophila melanogaster
- Drosophila mojavensis
- Drosophila persimilis
- Drosophila pseudoobscura
- Drosophila sechellia
- Drosophila simulans
- Drosophila virilis
- Drosophila willistoni
- Drosophila yakuba
There are also new genomes for the following wheat (Triticum aestivum) cultivars:
- Triticum aestivum Arinalrfor
- Triticum aestivum Jagger
- Triticum aestivum Julius
- Triticum aestivum Lancer
- Triticum aestivum Landmark
- Triticum aestivum Mace
- Triticum aestivum Norin61
- Triticum aestivum Stanley
- Triticum aestivum Sy Mattis
Other new genomes from your kitchen or medicine cabinet are:
- Spelt, Triticum spelta
- Asparagus, Asparagus officinalis
- Quinoa, Chenopodium quinoa
- 🐨 Eucalyptus, Eucalyptus grandis
- Opium or breadseed poppy, Papaver somniferum
There is also a change to the species included in our gene tree and homology analysis. The analysis now includes a maximum of 100 genomes, and the selection will change with each release depending on the data available. You can find the list of included genomes in our repository.
We will no longer be dumping data in RDF format to our FTP site.