What’s coming in Ensembl release 112 / Ensembl Genomes 59?

Ensembl 112 and Ensembl Genomes 59 are expected in April 2024. Check out what we’re up to, although we can’t guarantee everything listed here will make it into the final release.

Regulation

We are transitioning our regulatory annotation over the next few releases to be based on open chromatin, rather than genomic segmentation of histone marks. As a necessary step, we will be removing segmentation data and tracks from human and mouse regulatory annotation in release 112. These tracks are not heavily used, and we have found them hard to interpret because they don’t match our regulatory annotation or peaks in an obvious way.

In addition, we continue to refine our current annotation. Most notably for 112, our promoters will align with the 5’ ends of known transcripts (specifically 10 bp downstream).
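
As a rough illustration of what "10 bp downstream of the 5’ end" means in genomic coordinates, the sketch below computes that position for a transcript on either strand. The function name and the example coordinates are hypothetical, and this is only coordinate arithmetic under a 1-based, inclusive convention. It does not describe how the regulatory annotation pipeline itself works.

```python
def downstream_of_five_prime(start: int, end: int, strand: int, offset: int = 10) -> int:
    """Return the genomic position `offset` bp downstream of a transcript's 5' end.

    Assumes 1-based, inclusive coordinates with start <= end,
    and strand given as +1 (forward) or -1 (reverse).
    """
    if strand not in (1, -1):
        raise ValueError("strand must be +1 or -1")
    if start > end:
        raise ValueError("start must not exceed end")
    # Forward strand: the 5' end is `start`, and downstream means higher coordinates.
    # Reverse strand: the 5' end is `end`, and downstream means lower coordinates.
    five_prime = start if strand == 1 else end
    return five_prime + strand * offset


# Hypothetical transcript coordinates, for illustration only.
print(downstream_of_five_prime(1000, 5000, 1))   # 1010
print(downstream_of_five_prime(1000, 5000, -1))  # 4990
```

The strand check matters because "downstream" follows the direction of transcription, so the same offset moves in opposite directions on the two strands.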

New Assemblies and/or Annotation

Vertebrates

Amphiprion ocellaris (Clown anemone fish) – (GCA_022539595.1)

Anabas testudineus (Climbing perch) – (GCA_900324465.3)

Astatotilapia calliptera (Eastern happy) – (GCA_900246225.5)

Clupea harengus (Atlantic herring) – (GCA_900700415.2)

Denticeps clupeoides (Denticle herring) – (GCA_900700375.2)

Electrophorus electricus (Electric eel) – (GCA_013358815.1)

Esox lucius (Northern pike) – (GCA_011004845.1)

Gasterosteus aculeatus (Three-spined stickleback) – (GCA_016920845.1)

Ictalurus punctatus (Channel catfish) – (GCA_004006655.3)

Oncorhynchus tshawytscha (Chinook salmon) – (GCA_018296145.1)

Oreochromis aureus (Guangdong) – (GCA_013358895.1)

Parambassis ranga (Indian glassy fish) – (GCA_900634625.2)

Periophthalmus magnuspinnatus (Bony fishes) – (GCA_009829125.3)

Pygocentrus nattereri (Red-bellied piranha) – (GCA_015220715.1)

Additional strains added for the following fish species:

Gadus morhua (Atlantic cod):

  • GCA_010882105.1 (Celtic sea)

Salmo salar (Atlantic salmon):

  • GCA_021399835.1 (North American Atlantic salmon)
  • GCA_923944775.1 (Brian)
  • GCA_931346935.2 (European origin)

Gasterosteus aculeatus (Three-spined stickleback):

  • GCA_006232285.1 (Marine)
  • GCA_006232265.1 (Marine)
  • GCA_006229185.1 (Freshwater)

Non-Vertebrates

Plants:

New Genomes

Vicia faba (Faba bean)

Aegilops umbellulata (Umbel goatgrass)

Updated species

Manihot esculenta (Cassava)

Medicago truncatula (Barrelclover)

Metazoa:

New Drosophila Pangenome

We will introduce a new Drosophila genus-wide pangenome, which will incorporate resources from the main Metazoa site.

This pangenome will cover 36 species of Drosophila and 4 outgroup species. These species are currently hosted on both Ensembl Metazoa and Rapid Release.

New species:

Bactrocera neohumeralis (Pestiferous fruit fly) – (GCA_024586455.2) 

Cherax quadricarinatus (Australian freshwater crayfish) – (GCA_026875155.2) 

Coremacera marginata (March fly or Snail killer fly) – (GCA_914767935.1) 

Ctenocephalides felis (Cat flea) – (GCA_003426905.1) 

Daphnia carinata (Water flea) – (GCA_022539665.3) 

Diaphorina citri (Asian citrus psyllid) – (GCA_000475195.1) 

Drosophila albomicans (Fly) – (GCA_009650485.2) 

Drosophila arizonae (Fly) – (GCA_001654025.1) 

Drosophila biarmipes (Fly) – (GCA_025231255.1) 

Drosophila bipectinata (Fly) – (GCA_000236285.2) 

Drosophila busckii (Fly) – (GCA_011750605.1) 

Drosophila elegans (Fly) – (GCA_000224195.2) 

Drosophila eugracilis (Fly) – (GCA_018153835.1) 

Drosophila ficusphila (Fly) – (GCA_018152265.1) 

Drosophila guanche (Fly) – (GCA_900245975.1) 

Drosophila gunungcola (Fly) – (GCA_025200985.1)

Drosophila hydei (Fly) – (GCA_003285905.2) 

Drosophila innubila (Fly) – (GCA_004354385.1) 

Drosophila kikkawai (Fly) – (GCA_018152535.1) 

Drosophila mauritiana (Fly) – (GCA_004382145.1) 

Drosophila miranda (Fly) – (GCA_003369915.2) 

Drosophila navojoa (Fly) – (GCA_001654015.2) 

Drosophila obscura (Fly) – (GCA_018151105.1) 

Drosophila rhopaloa (Fly) – (GCA_000236305.2) 

Drosophila santomea (Fly) – (GCA_016746245.2) 

Drosophila subobscura (Fly) – (GCA_008121235.1) 

Drosophila subpulchrella (Fly) – (GCA_014743375.2) 

Drosophila suzukii (Fly) – (GCA_013340165.1)

Drosophila takahashii (Fly) – (GCA_018152695.1) 

Drosophila teissieri (Fly) – (GCA_016746235.2) 

Eriocheir sinensis (Chinese mitten crab) – (GCA_024679095.1) 

Halyomorpha halys (Brown marmorated stink bug) – (GCA_000696795.2) 

Homarus gammarus (European lobster) – (GCA_958450375.1) 

Hydractinia symbiolongicarpus (Colonial hydrozoan cnidarians) – (GCA_029227915.2) 

Lytechinus pictus (Painted urchin) – (GCA_015342785.2)

Machimus atricapillus (Robber fly) – (GCA_933228815.1)  

Melanaphis sacchari (Aphid) – (GCA_002803265.2) 

Microctonus aethiopoides (Wasp) – (GCA_030272655.1)  

Microctonus aethiopoides (Wasp) – (GCA_030272935.1)  

Microctonus aethiopoides (Wasp) – (GCA_030347275.1)  

Microctonus hyperodae (Wasp) – (GCA_030347285.1) 

Myopa tessellatipennis (Thick headed fly) – (GCA_943737955.1) 

Octopus bimaculoides (California two-spot octopus) – (GCA_001194135.2) 

Paramacrobiotus metropolitanus – (GCA_019649055.1) 

Pecten maximus (Great scallop) – (GCA_902652985.1) 

Tribolium madens (Flour beetle) – (GCA_015345945.1) 

Uloborus diversus (Spider: cribellate orb weave) – (GCA_026930045.1) 

Updated genomes:

Drosophila ananassae (Fly) – (GCA_017639315.2) 

Drosophila erecta (Fly) – (GCA_003286155.2)

Drosophila grimshawi (Fly) – (GCA_018153295.1)

Drosophila mojavensis (Fly) – (GCA_018153725.1) 

Drosophila persimilis (Fly) – (GCA_003286085.2) 

Drosophila pseudoobscura (Fly) – (GCA_009870125.2) 

Drosophila sechellia (Fly) – (GCA_004382195.2)

Drosophila simulans (Fly) – (GCA_016746395.2) 

Drosophila virilis (Fly) – (GCA_003285735.2) 

Drosophila willistoni (Fly) – (GCA_018902025.2) 

Drosophila yakuba (Fly) – (GCA_016746365.2) 

The following outdated genomes will be removed:

Daphnia pulex (Water flea) – (GCA_000187875.1)

Hydra vulgaris (Fresh-water polyp) – (GCA_000004095.1)

Octopus bimaculoides (California two-spot octopus) – (GCA_001194135.1)

Rhipicephalus sanguineus (Brown dog tick) – (GCA_013339695.1). We will retain the V2 assembly version (GCA_013339695.2).

Other updates and changes

  • Population frequency data will be available for more species in the Ensembl VEP web tool, including chicken, dog, goat and sheep.
  • A new Ensembl VEP option will predict the molecular consequences of human GRCh38 variants on open reading frames found in long noncoding RNAs (lncRNAs) and untranslated regions (UTRs) of protein-coding genes, as described in Mudge et al.
  • The Ensembl VEP web and REST interfaces will be updated to use the dbNSFP commercial data release.
  • New GENCODE Primary tag: The GENCODE Basic tag is losing its utility because it contains all full-length GENCODE transcripts, and all new transcripts added using long-read transcriptomic data will be full-length. GENCODE Primary uses a new pipeline to identify a set of transcripts that have novel features with high functional potential when compared to the MANE Select or Ensembl Canonical transcript. The tag is already available in files.
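
For readers who want to pick out tagged transcripts from an annotation file, here is a minimal sketch that filters GTF transcript records by a `tag` attribute. It assumes the standard nine-column GTF layout with `tag "…";` entries in the attribute column. The spelling `GENCODE_Primary` used below is an assumption, so check the tag name in the file you download. The function name and the example records are made up for illustration.

```python
import re
from typing import Iterable, Iterator

TAG_PATTERN = re.compile(r'tag "([^"]+)"')
TRANSCRIPT_ID_PATTERN = re.compile(r'transcript_id "([^"]+)"')


def transcripts_with_tag(gtf_lines: Iterable[str], tag: str) -> Iterator[str]:
    """Yield transcript_id values of GTF transcript records carrying `tag`."""
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "transcript":
            continue  # only look at well-formed transcript records
        attributes = fields[8]
        if tag in TAG_PATTERN.findall(attributes):
            match = TRANSCRIPT_ID_PATTERN.search(attributes)
            if match:
                yield match.group(1)


# Made-up records for illustration; the tag spelling is an assumption.
EXAMPLE = [
    "##description: hypothetical example",
    'chr1\tHAVANA\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tag "basic"; tag "GENCODE_Primary";',
    'chr1\tHAVANA\ttranscript\t100\t700\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; tag "basic";',
    'chr1\tHAVANA\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tag "GENCODE_Primary";',
]

print(list(transcripts_with_tag(EXAMPLE, "GENCODE_Primary")))  # ['T1']
```

Passing the tag name as a parameter keeps the same function usable for other tags, such as `basic`.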