Ensembl 105 has been released

Hooray! Ensembl 105 and Ensembl Genomes 52 have now been released. This release brings updates to the human frequency data, plus new and updated genomes for vertebrates, fungi and plants. We are also very excited to integrate AlphaFold data for Arabidopsis thaliana in Ensembl Plants.

AlphaFold data

EMBL-EBI and DeepMind have partnered to create the freely available AlphaFold DB, which provides protein structure predictions for 21 species. Ensembl 105 integrates AlphaFold data for Arabidopsis thaliana. The data is available on the transcript tab, where it is displayed in an AlphaFoldDB 3D widget. The same page uses data from the 1001G project to highlight the position of deleterious variants on the 3D structure.

Ensembl already displays PDB 3D structures for human and many other vertebrate species, so it's even more exciting to see AlphaFoldDB 3D structures integrated for plants, starting with Arabidopsis thaliana. This species has rich variation data from the 1001G project, which lets us showcase variant highlighting on its 3D structures. Future releases will add AlphaFold prediction models for maize and soybean.

View the Arabidopsis thaliana AlphaFold 3D structure in action here.

Interactive AlphaFold 3D structure for Ensembl protein: AT3G52430.1 in Arabidopsis thaliana 

Human

Vertebrates

We’ve got new genomes for some of our existing species, which means we’ve updated all the genes. The updated species are:

  • Marmoset (Callithrix jacchus) has been updated from ASM275486v1 to mCalJac1.pat.X
  • Rat (Rattus norvegicus) has been updated from Rnor 6.0 to mRatBN7.1
  • Orangutan (Pongo abelii) has been updated from PPYG2 to Susie PABv2
  • Dog (Canis lupus familiaris) reference genome changed from CanFam3.1 to ROS Cfam 1.0 Labrador retriever
  • Dog boxer (Canis lupus familiaris) has been updated from CanFam3.1 to Dog10K Boxer Tasha 
  • Olive baboon (Papio anubis) has been updated from Panu 3.0 to Panubis1.0

Fungi

We’re excited to release a major update to Ensembl Fungi. We have added 477 new fungal assemblies and gene annotations imported from the European Nucleotide Archive (ENA), expanding our representation of ascomycetes, basidiomycetes and others. We also have 15 new genomes originating from VEuPathDB’s fungal database (FungiDB), a new fungal gene tree, updated BioMarts and a re-mapping of pathogen-host interaction phenotypes from PHI-base onto fungal genes. More details are available on the Ensembl Fungi blog.

Plants

Ensembl Plants has a few new assemblies and some updates to existing assemblies:

New assemblies
  • European olive tree (Olea europaea)
  • Yellow sarson (Brassica rapa R-o-18: new reference)
  • Hazel (Corylus avellana)
  • Common fig (Ficus carica)
  • Lettuce (Lactuca sativa)

Other Plant updates
  • Sunflower (Helianthus annuus) has been updated to HanXRQr2.0 SUNRISE assembly
  • Barley reference genome has been updated to Morex V3 assembly with IPK and BaRT gene models
  • Masking of repeated sequences in many plant genomes using Red
  • Brassica rapa Chiifu401 42 has been downgraded to extra cultivar in strain set
  • Gene name synonyms from Phytozome have been added for Chlamydomonas reinhardtii
  • Plant Reactome cross-references have been added for Zea mays
  • Centromere data has been added to Triticum aestivum 
  • SIFT predictions have been added to VEP output data for several plant species with variation data
  • De novo genes have been added for chromosome-level Triticum aestivum cultivars from the wheat 10+ genome project

Other Updates and Highlights

  • Retirement of Ensembl 67 and Ensembl 85 archive sites
  • Decommissioning of REST API eQTL endpoint (More information here)
  • VariantRecoder REST service will now return: variant names in other databases (UniProt, ClinVar, ClinGen Allele Archive), GA4GH VRS representation and MANE annotation
  • VariantRecoder web tool will now return: variant names in other databases (UniProt, ClinVar, ClinGen Allele Archive) and MANE annotation
  • Support for BCF files is now available
  • Taxon ID has been added to the JSON dump files
  • Bed bug (Cimex lectularius) assembly has been updated from ClecH1 to Clec_2.1 (annotation from RefSeq)
  • New BLAST option controlling the number of High-scoring Segment Pairs (HSPs)
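
For readers who want to try the VariantRecoder REST service listed above, here is a minimal Python sketch of how a request might be put together. It assumes the public server at https://rest.ensembl.org and the `variant_recoder/:species/:id` endpoint; the `fields` and `var_synonyms` query parameters, the helper names and the example variant rs56116432 are illustrative assumptions, so please check them against the REST API documentation before relying on them.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumed public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def build_recoder_url(species, variant_id, fields=None, include_synonyms=False):
    """Build a variant_recoder URL (hypothetical helper, not part of any Ensembl library).

    The query parameter names "fields" and "var_synonyms" are assumptions;
    verify them in the REST API documentation.
    """
    path = f"/variant_recoder/{quote(species)}/{quote(variant_id, safe='')}"
    params = {}
    if fields:
        # Comma-separated list of the output fields to return.
        params["fields"] = ",".join(fields)
    if include_synonyms:
        # Ask for variant names in other databases.
        params["var_synonyms"] = 1
    query = f"?{urlencode(params, safe=',')}" if params else ""
    return f"{REST_SERVER}{path}{query}"


def recode_variant(species, variant_id, **options):
    """Send the request and return the decoded JSON response."""
    request = Request(
        build_recoder_url(species, variant_id, **options),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Needs network access; prints whatever the service returns.
    result = recode_variant("human", "rs56116432", include_synonyms=True)
    print(json.dumps(result, indent=2))
```

Keeping the URL construction separate from the network call makes it easy to inspect the exact request before sending it.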