Ensembl 115 and Ensembl Genomes 62 are expected in September 2025. Check out what we’re up to, although we can’t guarantee everything listed here will make it into the final release.
Regulation
Retire least used microarray species in 115
We plan to retire the following funcgen databases in 115:
Vertebrates:
- Anas platyrhynchos platyrhynchos
- Aotus nancymaae
- Callithrix jacchus
- Carlito syrichta
- Cavia porcellus
- Cercocebus atys
- Colobus angolensis palliatus
- Cricetulus griseus chok1gshd
- Cricetulus griseus crigri
- Cyprinodon variegatus
- Fundulus heteroclitus
- Ictalurus punctatus
- Mandrillus leucophaeus
- Mesocricetus auratus
- Microcebus murinus
- Mus spretus
- Nannospalax galili
- Nomascus leucogenys
- Ornithorhynchus anatinus
- Papio anubis
- Piliocolobus tephrosceles
- Prolemur simus
- Propithecus coquereli
- Rhinopithecus bieti
- Rhinopithecus roxellana
- Saimiri boliviensis boliviensis
- Theropithecus gelada
Plants:
- Aegilops tauschii
- Arabidopsis halleri
- Arabidopsis thaliana
- Brassica napus
- Brassica oleracea
- Brassica rapa
- Glycine max
- Hordeum vulgare
- Nicotiana attenuata
- Oryza barthii
- Oryza glaberrima
- Oryza glumipatula
- Oryza indica
- Oryza longistaminata
- Oryza meridionalis
- Oryza nivara
- Oryza punctata
- Oryza rufipogon
- Phaseolus vulgaris
- Solanum lycopersicum
- Triticum aestivum
- Triticum dicoccoides
- Vigna angularis
- Vigna radiata
- Zea mays
Human
Around 121,000 new protein-coding transcripts will be added to the GRCh38 human reference gene set based on long-read RNA-seq data using the TAGENE pipeline.
New Assemblies and/or Annotation
Livestock and Companion Animals:
We will add 2 new breeds of cattle:
- UOA_Tuli_1 (GCA_040285425.1)
- UOA_Wagyu_1 (GCA_040286185.1)
Update to sheep reference assembly/annotation:
The sheep reference ARS-UI_Ramb_v2.0 (GCA_016772045.1) will be updated to ARS-UI_Ramb_v3.0 (GCA_016772045.2)
Plants:
Additional Rice 3K variation data will be added for Oryza sativa (Rice; GCA_001433935.1). The 3000 Rice Genome Project is an international effort to sequence the genomes of 3,024 rice varieties from 89 countries.
Triticum aestivum Next Generation (TaNG) variation data will be added for Triticum aestivum (Wheat; GCA_900519105.1). The TaNG array is derived from 204 elite wheat lines and 111 wheat landraces from the Watkins ‘Core Collection’.
New Genomes
New plant species for 115
- Avena atlantica (Oat; GCA_910589765.1)
- Avena eriantha (Oat; GCA_910589775.1)
- Avena insularis (Oat; GCA_910574615.1)
- Avena longiglumis (Oat; GCA_910589755.1)
- Lablab purpureus Highworth (Lablab bean; GCA_030347555.1)
- Pisum sativum JI2822 (Garden pea; GCA_964186695.1)
- Pisum sativum Zhongwan6 (Garden pea; GCA_024323335.2)
New Non-Cores plant species data for 115
- Oryza sativa 3k variation data (Rice; GCA_001433935.1)
- TaNG variation data (Wheat; GCA_900519105.1)
Compara
- We will deprecate Compara Perl API methods related to selective pressure statistics (e.g. dN/dS). Deprecated methods have not been scheduled for deletion.
- We will introduce two Newick export modes, which may be helpful when accessing gene trees with clashing stable IDs: “Genome and gene ID”, in which leaf names are composed of the genome name and gene stable ID of a gene, and “Genome and product ID”, where each leaf is the genome name and protein/ncRNA product stable ID.
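To illustrate why a combined leaf name helps with clashing stable IDs, here is a minimal sketch. The genome names, stable IDs and the `|` separator are hypothetical; the actual leaf format produced by the new export modes may differ.

```python
def leaf_label(genome: str, stable_id: str, sep: str = "|") -> str:
    """Compose a leaf name from a genome name and a stable ID.

    The separator is an assumption made for this sketch only.
    """
    return f"{genome}{sep}{stable_id}"


# Hypothetical leaves: two genomes that happen to share a gene stable ID.
leaves = [
    ("genome_a", "GENE0001"),
    ("genome_b", "GENE0001"),
]

# Leaf names built from the stable ID alone collide.
bare = [stable_id for _, stable_id in leaves]
has_clash_bare = len(set(bare)) < len(bare)

# Leaf names built from genome name plus stable ID stay unique.
combined = [leaf_label(genome, stable_id) for genome, stable_id in leaves]
has_clash_combined = len(set(combined)) < len(combined)

# A toy two-leaf Newick string using the combined names.
newick = "(" + ",".join(combined) + ");"
print(newick)
```

With stable IDs alone the two leaves are indistinguishable to a tree parser; prefixing the genome name keeps them apart.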
Vertebrates:
- With the update to the sheep reference assembly/annotation, we will update the Pig breeds gene-tree, pig-breed LastZ alignments and mammals EPO
- Murinae EPO will be updated to add Mus musculus molossinus
Plants:
- Three new plant species will be added to the default Protein trees – Lablab purpureus Highworth (Lablab bean; GCA_030347555.1), Pisum sativum JI2822 (Garden pea; GCA_964186695.1) and Pisum sativum Zhongwan6 (Garden pea; GCA_024323335.2)
- Protein trees were computed for the Hordeum vulgare pangenome, including 75 barley cultivars and relatives. The barley, rye and wheat reference genomes are also present in the Wheat cultivar protein trees. Barley cultivar gene trees may be accessed through barley genes, while wheat cultivar gene trees may be accessed via wheat and rye genes.
Metazoa:
- Insects Protein trees were updated
- We will update the 46 Pangenome Drosophila Cactus and Pangenome Drosophila protein trees with 6 new genomes
Metazoa:
New assembly on existing species (assembly and annotation)
- Amyelois transitella (Moths, GCA_032362555.1)
- Bactrocera dorsalis (Oriental fruit fly, GCA_023373825.1)
- Bicyclus anynana (Squinting bush brown, GCA_947172395.1)
- Branchiostoma lanceolatum (Amphioxus, GCA_035083965.1)
- Caenorhabditis remanei (Nematode, GCA_010183535.1)
- Danaus plexippus (monarch butterfly, GCA_018135715.1)
- Dendroctonus ponderosae (Mountain pine beetle, GCA_020466585.2)
- Drosophila bipectinata (Pomace flies, GCA_030179905.2)
- Drosophila elegans (Pomace flies, GCA_018152505.1)
- Drosophila kikkawai (Pomace flies, GCA_030179895.2)
- Drosophila suzukii (Pomace flies, GCA_037355615.1)
- Drosophila takahashii (Pomace flies, GCA_030179915.2)
- Drosophila virilis (Pomace flies, GCA_030788295.1)
- Helicoverpa armigera (Cotton bollworm, GCA_030705265.1)
- Hydra vulgaris (Swiftwater hydra, GCA_038396675.1)
- Linepithema humile (Argentine ant, GCA_040581485.1)
- Lytechinus pictus (Painted urchin, GCA_037042905.1)
- Mercenaria mercenaria (Northern quahog, GCA_021730395.1)
- Musca domestica (House fly, GCA_030504385.2)
- Necator americanus (New World hookworm, GCA_031761385.1)
- Nematostella vectensis (Starlet sea anemone, GCA_932526225.1)
- Ostrea edulis (Mud oyster, GCA_947568905.1)
- Sarcoptes scabiei (Itch mite, GCA_020844145.1)
- Stomoxys calcitrans (Stable fly, GCA_963082655.1)
- Tribolium castaneum (Red flour beetle, GCA_031307605.1)
Updated assemblies
- Eufriesea mexicana (Mexican orchid bee, GCA_001483705.1 -> GCA_001483705.2)
- Myopa tessellatipennis (Flies, GCA_943737955.1 -> GCA_943737955.2)
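The difference between an updated assembly (same accession, higher version) and a replacement assembly (different accession) can be checked from the accession strings alone. The sketch below assumes the usual `GCA_`/`GCF_` accession layout of nine digits followed by a version suffix; the helper names are made up for this example.

```python
import re

# Assumed layout: GCA_ or GCF_, nine digits, a dot, then the version number.
ACCESSION = re.compile(r"^(GC[AF])_(\d{9})\.(\d+)$")


def parse_accession(accession: str) -> tuple[str, str, int]:
    """Split an assembly accession into (prefix, number, version)."""
    match = ACCESSION.match(accession)
    if match is None:
        raise ValueError(f"Unrecognised accession: {accession!r}")
    prefix, number, version = match.groups()
    return prefix, number, int(version)


def is_version_bump(old: str, new: str) -> bool:
    """True if `new` is a later version of the same accession as `old`."""
    old_prefix, old_number, old_version = parse_accession(old)
    new_prefix, new_number, new_version = parse_accession(new)
    same_assembly = (old_prefix, old_number) == (new_prefix, new_number)
    return same_assembly and new_version > old_version


# Eufriesea mexicana: same accession, new version.
print(is_version_bump("GCA_001483705.1", "GCA_001483705.2"))
# Amyelois transitella: a different accession replaces the old one.
print(is_version_bump("GCA_001186105.1", "GCA_032362555.1"))
```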
Updated annotations
- Stylophora pistillata (GCF_002571385.2)
Entirely new species (assembly and annotation)
- Amblyomma americanum (Lone Star tick, GCA_030143305.2)
- Bactrocera oleae (Olive fruit fly, GCA_001188975.4)
- Bradysia coprophila (Black fungus gnats, GCA_014529535.1)
- Contarinia nasturtii (Swede midge, GCA_009176525.2)
- Drosophila montana (Pomace flies, GCA_035044405.1)
- Drosophila nasuta (Pomace flies, GCA_023558535.2)
- Drosophila novamexicana (Pomace flies, GCA_003285875.3)
- Drosophila serrata (Pomace flies, GCA_002093755.2)
- Drosophila sulfurigaster albostrigata (Flies, GCA_023558435.2)
- Drosophila tropicalis (Pomace flies, GCA_018151085.1)
- Lucilia sericata (Common green bottle fly, GCA_015586225.1)
- Ornithodoros turicata (Softbacked ticks, GCA_037126465.1)
- Photinus pyralis (Common eastern firefly, GCA_008802855.1)
- Schmidtea mediterranea (Freshwater planarian, GCA_045838255.1)
- Schmidtea mediterranea (Freshwater planarian, GCA_045838265.1)
- Schmidtea nova (Freshwater planarian, GCA_044892505.1)
- Schmidtea polychroa (Freshwater planarian, GCA_044892525.1)
- Steinernema hermaphroditum (Nematode, GCA_030435675.2)
- Tenebrio molitor (Darkling ground beetles, GCA_907166875.3)
- Vespa mandarinia (Asian giant hornet, GCA_014083535.1)
Compara reference updates
- Amyelois transitella – Updated to GCA_032362555.1, replaces GCA_001186105.1
- Bactrocera dorsalis – Updated to GCA_023373825.1, replaces GCA_000789215.2
- Bicyclus anynana – Updated to GCA_947172395.1, replaces GCA_900239965.1
- Dendroctonus ponderosae – Updated to GCA_020466585.2, replaces GCA_000355655.1
- Drosophila bipectinata – Updated to GCA_030179905.2, replaces GCA_000236285.2
- Drosophila elegans – Updated to GCA_018152505.1, replaces GCA_000224195.2
- Drosophila kikkawai – Updated to GCA_030179895.2, replaces GCA_018152535.1
- Drosophila suzukii – Updated to GCA_037355615.1, replaces GCA_013340165.1
- Drosophila takahashii – Updated to GCA_030179915.2, replaces GCA_018152695.1
- Drosophila virilis – Updated to GCA_030788295.1, replaces GCA_003285735.2
- Helicoverpa armigera – Updated to GCA_030705265.1, replaces GCA_023701775.1
- Linepithema humile – Updated to GCA_040581485.1, replaces GCA_000217595.1
- Musca domestica – Updated to GCA_030504385.2, replaces GCA_000371365.1
- Stomoxys calcitrans – Updated to GCA_963082655.1, replaces GCA_001015335.1
- Tribolium castaneum – Updated to GCA_031307605.1, replaces GCA_000002335.3
Variation updated
- Anopheles gambiae (GCA_000005575.1) – Fixes known bug reported in Release 61
The following species cores are outdated and will be dropped from 115 (EG62):
Dropped, but not included in Compara analysis:
- Galleria mellonella (GCA_003640425.2)
- Hyalomma asiaticum (GCA_013339685.1)
- Ixodes persulcatus (GCA_013358835.1)
Dropped, including from Compara analysis:
- Bombyx mori (GCA_014905235.2)
- Crassostrea gigas (GCA_902806645.1)
- Culex quinquefasciatus (GCA_000209185.1) – core, variation and other features
- Diabrotica virgifera (GCA_003013835.2)
- Drosophila ananassae (GCA_000005115.1)
- Drosophila erecta (Drosophila erecta)
- Drosophila grimshawi (GCA_000005155.1)
- Drosophila mojavensis (GCA_000005175.1)
- Drosophila persimilis (GCA_000005195.1)
- Drosophila sechellia (GCA_000005215.1)
- Drosophila simulans (GCA_000754195.3)
- Drosophila willistoni (GCA_000005925.1)
- Ixodes scapularis (GCA_000208615.1) – core and variation
- Biomphalaria glabrata (GCA_000457365.1) – core and variation
- Amyelois transitella (GCA_001186105.1) (clade – insects)
- Bactrocera dorsalis (GCA_000789215.2) (clade – insects)
- Bicyclus anynana (GCA_900239965.1) (clade – insects)
- Dendroctonus ponderosae (GCA_000355655.1) (clade – insects)
- Drosophila bipectinata (GCA_000236285.2) (Drosophila pangenome)
- Drosophila elegans (GCA_000224195.2) (Drosophila pangenome)
- Drosophila kikkawai (GCA_018152535.1) (Drosophila pangenome)
- Drosophila suzukii (GCA_013340165.1) (Drosophila pangenome)
- Drosophila takahashii (GCA_018152695.1) (Drosophila pangenome)
- Drosophila virilis (Drosophila pangenome)
- Drosophila virilis (GCA_003285735.2) (Drosophila pangenome)
- Helicoverpa armigera (GCA_023701775.1) (clade – insects)
- Linepithema humile (GCA_000217595.1) (clade – insects)
- Musca domestica (GCA_000371365.1) (clade – insects)
- Myopa tessellatipennis (GCA_943737955.1) (Drosophila pangenome)
- Stomoxys calcitrans (GCA_001015335.1) (clade – insects)
- Tribolium castaneum (GCA_000002335.3) (clade – insects)
The following databases are outdated and will be dropped in 115 (EG62):
Other features databases dropped in E115 (EG62):
- Culex quinquefasciatus otherfeatures 61_114_3 (GCA_000209185.1)
Variation databases for the following species will be dropped in 115 (EG62):
- Culex quinquefasciatus variation 61_114_3 (GCA_000209185.1)
- Ixodes scapularis variation 61_114_1 (GCA_000208615.1)
- Biomphalaria glabrata variation 61_114_1 (GCA_000457365.1)
Variation:
Updating ClinVar Import and Ensembl Variant Effect Predictor (VEP) handling
ClinVar have updated the way clinical significance is represented. Now, three types of variant classifications are available. The current clinical significance reported by Ensembl VEP remains the same, with the addition of one new type of data: somatic classifications from ClinVar.
As a consequence, ClinVar have updated their data schema. To accommodate the new data, the Ensembl Variation import script will be adapted.
Specific changes are:
Variation API updates
– We will have a new method to return the new somatic classifications from ClinVar.
Ensembl VEP updates
– We will update VEP with a new option to return the new ClinVar somatic classifications: --clinvar_somatic_classification
Ensembl VEP pipeline update
– Update the VEP dump pipeline to include the new somatic classification
Web update
– The variation phenotype page is going to include a new table to display the new somatic classifications.
GENCODE Promoter support
Ensembl VEP now supports GENCODE promoters through the --custom and gff_type=gencode_promoter options on the command line, or by selecting the “Report overlap with GENCODE Promoters:” option in web Ensembl VEP.
Supporting Structural Variant Allele Frequencies and Clinical Significance
Web Ensembl VEP will have two new options to enable reporting of structural variant allele frequency (from gnomAD) and clinical significance (from ClinVar). Both options have a range of selectable overlap percentages, up to requiring perfect match.
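As a rough illustration of what an overlap percentage threshold means, the sketch below computes reciprocal overlap between two intervals. How Ensembl VEP actually measures overlap is not stated here, so the reciprocal definition, the 1-based inclusive coordinates and the function names are all assumptions.

```python
def overlap_percent(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Percentage of interval A covered by interval B (1-based, inclusive)."""
    a_length = a_end - a_start + 1
    shared = min(a_end, b_end) - max(a_start, b_start) + 1
    return 100.0 * max(shared, 0) / a_length


def passes_threshold(
    input_sv: tuple[int, int],
    known_sv: tuple[int, int],
    min_percent: float,
) -> bool:
    """Assumed reciprocal test: each interval must cover the other enough."""
    forward = overlap_percent(*input_sv, *known_sv)
    reverse = overlap_percent(*known_sv, *input_sv)
    return forward >= min_percent and reverse >= min_percent


# Hypothetical coordinates on the same chromosome.
print(overlap_percent(100, 199, 150, 249))            # half of A is covered
print(passes_threshold((100, 199), (100, 199), 100))  # identical intervals
print(passes_threshold((100, 199), (100, 299), 100))  # known SV is larger
```

Under this definition a threshold of 100 only accepts identical coordinates, which corresponds to requiring a perfect match.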
New Ensembl VEP Plugin – available for CLI
MechPredict – This is a plugin for the Ensembl Variant Effect Predictor (VEP) that annotates missense variants with predicted dominant-negative (DN), gain-of-function (GOF), or loss-of-function (LOF) mechanisms derived from a Support Vector Classification (SVC) model (Badonyi et al., 2024).
Automation
New FTP Paths available
New FTP paths are available for data access:
Other updates and changes
- The Ensembl 99 archive (Jan 2020) and the Ensembl Genomes 45 archive (Sep 2019) are five years old and will be retired with the release of Ensembl 115 and Ensembl Genomes 62.
- The Ensembl Virtual Machine will no longer be available due to low demand.
