Ensembl 111

Subset of genes omitted from protein trees in Metazoa Compara

Affects: Live site, Ensembl 100, Ensembl 101, Ensembl 102, Ensembl 103, Ensembl 104, Ensembl 105, Ensembl 106, Ensembl 107, Ensembl 108, Ensembl 109, Ensembl 110, Ensembl 111

Expected fix: Ensembl 112

Description:

During a cross-check of gene members used in comparative analyses such as protein trees, the gene sets of a number of genomes in Plants, Metazoa and Pan Compara were found to be incomplete in their respective Compara databases relative to the corresponding core databases. As a result, the affected genes have been inadvertently omitted from inference of protein trees and homologies.

The Plants species Nicotiana attenuata and the Metazoa species Exaiptasia diaphana (Sea anemone, CC7) are particularly badly affected by this issue, with 12,351 (37%) and 13,315 (59%) of their genes, respectively, omitted from comparative analyses.

To address this, the gene sets of most of the affected genomes will be reloaded from their respective core databases in Ensembl 112, as follows.

Plants species:

  • Chondrus crispus
  • Nicotiana attenuata

Pan Compara species:

  • Amphimedon queenslandica (Demosponge)
  • Anopheles gambiae (African malaria mosquito, PEST)
  • Apis mellifera (Honey bee, DH4)
  • Daphnia pulex (Common water flea, KAP4)
  • Drosophila melanogaster (Fruit fly)
  • Gigantopelta aegis (Deep sea snail, Gae_Host)
  • Hydra vulgaris (Swiftwater hydra, 105)
  • Lingula anatina (Lamp shell, Amm_Jpn)
  • Lytechinus variegatus (Green sea urchin, NC3)
  • Parasteatoda tepidariorum (Common house spider, Goettingen)

Metazoa species:

  • Acanthaster planci (Crown-of-thorns starfish)
  • Acropora millepora (Stony coral, JS-1)
  • Amphibalanus amphitrite (Acorn barnacle, SeventyFive)
  • Anopheles gambiae (African malaria mosquito, PEST)
  • Apis mellifera (Honey bee, DH4)
  • Aplysia californica (California sea hare, F4 #8)
  • Bombyx mori (Domestic silkworm, p50T)
  • Centruroides sculpturatus (Bark scorpion, CEXI.00-Female)
  • Cimex lectularius (Bed bug, Harlan)
  • Crassostrea virginica (Eastern oyster, RU13XGHG1-28)
  • Eurytemora affinis (Calanoid copepod, Atlantic clade)
  • Exaiptasia diaphana (Sea anemone, CC7)
  • Gigantopelta aegis (Deep sea snail, Gae_Host)
  • Homarus americanus (American lobster, GMGI-L3)
  • Hyalella azteca (Amphipod, HAZT.00-mixed)
  • Hydra vulgaris (Swiftwater hydra, 105)
  • Hypsibius exemplaris (Water bear tardigrade, Z151)
  • Limulus polyphemus (Atlantic horseshoe crab)
  • Lingula anatina (Lamp shell, Amm_Jpn)
  • Lytechinus variegatus (Green sea urchin, NC3)
  • Mercenaria mercenaria (Hard clam (quahog), YKG-2019)
  • Orbicella faveolata (Mountainous star coral, FL)
  • Parasteatoda tepidariorum (Common house spider, Goettingen)
  • Penaeus japonicus (Kuruma shrimp, Ginoza2017)
  • Penaeus monodon (Black tiger shrimp, SGIC_2016)
  • Pollicipes pollicipes (Goose neck barnacle, AB1234)
  • Priapulus caudatus (Penis worm)
  • Strongylocentrotus purpuratus (Purple sea urchin, Spur 01)
  • Stylophora pistillata (Hood coral, CSM Monaco)

Gene sets of the remaining affected species — all of which are in Ensembl Metazoa — will be reloaded in Ensembl 113:

  • Atta cephalotes (Leaf-cutter ant)
  • Bombus impatiens (Common eastern bumblebee)
  • Culex quinquefasciatus (Southern house mosquito, JHB)
  • Glossina fuscipes (Tsetse fly, IAEA_lab_2018)
  • Musca domestica (House fly, aabys)
  • Nasonia vitripennis (Jewel wasp, AsymCx)
  • Solenopsis invicta (Red fire ant, M01_SB)
  • Stomoxys calcitrans (Stable fly, USDA)

Workaround:

There is currently no workaround.

Canonical sequence discrepancy affecting 471 genes in Plants Compara

Affects: Live site, Ensembl 100, Ensembl 101, Ensembl 102, Ensembl 103, Ensembl 104, Ensembl 105, Ensembl 106, Ensembl 107, Ensembl 108, Ensembl 109, Ensembl 110, Ensembl 111

Expected fix: Ensembl 112

Description:

During routine pre-release checks of the Ensembl site, it was found that different canonical sequences had been used in Plants and Pan Compara for 3,299 of 34,310 genes (9.6%) in Brachypodium distachyon, and 471 of 17,743 genes (2.7%) in Chlamydomonas reinhardtii. Further investigation confirmed that the Plants Compara gene members were inconsistent with their corresponding core genes.
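
The percentages quoted above follow directly from the gene counts. As a quick sanity check, a minimal sketch (the helper name `percent_affected` is illustrative and not part of any Ensembl tool) reproduces them:

```python
def percent_affected(affected: int, total: int) -> float:
    """Return the share of affected genes as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * affected / total, 1)


# (affected genes, total genes) as quoted in the description above.
counts = {
    "Brachypodium distachyon": (3299, 34310),
    "Chlamydomonas reinhardtii": (471, 17743),
}

for species, (affected, total) in counts.items():
    print(f"{species}: {percent_affected(affected, total)}%")
# Brachypodium distachyon: 9.6%
# Chlamydomonas reinhardtii: 2.7%
```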

This discrepancy will be fixed by reloading the affected gene members from their core database. Chlamydomonas reinhardtii gene members will be reloaded in Ensembl 112, while Brachypodium distachyon gene members are expected to be reloaded in Ensembl 113.

Workaround:

There is currently no workaround.

Rice genomes have missing/incongruent data

Affects: Live site, Ensembl 111

Expected fix: Ensembl 112

Description:

During processing of protein features for plant genomes, three rice genomes were found to have missing or problematic data. The affected databases, and the errors reported for each, are:

  • oryza_sativa_ir64_core_58_111_1
      • Error: Orphaned object_xref entries
      • New keys for existing identity_xref
      • Error: Orphaned ontology_xref entries
      • New keys for existing xref entries
      • Error: Orphaned object_xref entries
      • New keys for existing analysis
  • oryza_sativa_n22_core_58_111_1
      • Error: Missing InterPro term names and descriptions
      • Error: Accessions do not match the sequences' protein features
  • oryza_sativa_arc_core_58_111_1
      • Error: Missing data from DNA table
      • Error: Orphaned object_xref entries
      • New keys for existing analysis

The ‘New keys’ entries indicate that an existing record was given a new ID when no change should have been made.
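
To illustrate what an "orphaned object_xref" error means, here is a minimal sketch using an in-memory SQLite database. It assumes that "orphaned" refers to object_xref rows whose xref_id has no matching row in xref. The two toy tables carry only a few columns of the real core schema, and the row values are invented for illustration. This is not the check that produced the errors listed above.

```python
import sqlite3

# Toy stand-ins for the core-schema tables; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xref (xref_id INTEGER PRIMARY KEY, dbprimary_acc TEXT);
CREATE TABLE object_xref (
    object_xref_id INTEGER PRIMARY KEY,
    ensembl_id     INTEGER,
    xref_id        INTEGER
);
-- Invented example rows: object_xref 12 points at xref 3, which does not exist.
INSERT INTO xref VALUES (1, 'ACC_A'), (2, 'ACC_B');
INSERT INTO object_xref VALUES (10, 100, 1), (11, 101, 2), (12, 102, 3);
""")

# An anti-join: keep object_xref rows that find no partner in xref.
ORPHAN_QUERY = """
SELECT ox.object_xref_id
FROM object_xref AS ox
LEFT JOIN xref AS x ON ox.xref_id = x.xref_id
WHERE x.xref_id IS NULL
ORDER BY ox.object_xref_id
"""


def find_orphaned_object_xrefs(connection: sqlite3.Connection) -> list:
    """Return the IDs of object_xref rows that reference a missing xref."""
    return [row[0] for row in connection.execute(ORPHAN_QUERY)]


orphans = find_orphaned_object_xrefs(conn)
print(orphans)  # [12]
```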

Workaround:

The proposed workaround is to use the data from release 110. Unfortunately, this means that no AlphaFold or updated protein features will be available. This is not expected to affect Compara analyses.

Misspelled common name hinders search for two bivalve molluscs

Affects: Live site, Ensembl 111

Expected fix: Ensembl 112

Description:

Two species of mussel (Limnoperna fortunei and Dreissena polymorpha) have a misspelled name in their species.common_name and species.display_name fields. As a result, these species will not be found in a web search for their common name ‘mussel’.

Workaround:

Search for the full scientific name of your species of interest (Limnoperna fortunei or Dreissena polymorpha).

Missing Minor Allele Frequency (MAF) data

Affects: Live site, Ensembl 111, Ensembl 112, GRCh37

Expected fix: Ensembl 112 for GRCh37; Ensembl 113 for the live site (GRCh38)

Description:

During the update to dbSNP156 in Ensembl 111, Minor Allele Frequency (MAF) data was not imported directly to the database. As a result, variants do not have an associated MAF displayed on the variant summary page, or in the variant table in the gene and transcript tabs of the browser. This issue also affects the display of variation data on the sequence views when filtering by MAF.

This data is also missing from BioMart.

Workaround:

Use the Ensembl 110 archive, which is unaffected: https://jul2023.archive.ensembl.org/index.html
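
When looking up many variants, it may help to generate archive links programmatically. The sketch below builds a variant page link on the Ensembl 110 archive. It assumes the usual `/<Species>/Variation/Explore?v=<id>` page path, and the helper name and the example identifier rs699 are illustrative only.

```python
from urllib.parse import urlencode

# Base URL of the Ensembl 110 archive given in the workaround above.
ARCHIVE_BASE = "https://jul2023.archive.ensembl.org"


def archive_variant_url(variant_id: str, species: str = "Homo_sapiens") -> str:
    """Build a link to a variant page on the Ensembl 110 archive.

    The page path is an assumption based on the usual Ensembl URL layout.
    """
    query = urlencode({"v": variant_id})
    return f"{ARCHIVE_BASE}/{species}/Variation/Explore?{query}"


print(archive_variant_url("rs699"))
# https://jul2023.archive.ensembl.org/Homo_sapiens/Variation/Explore?v=rs699
```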