Ensembl 104 and Ensembl Genomes 51 are out! This release features updates to human and mouse genes, GRCh37 variation and regulation, new assemblies and variation for vertebrates, new plant species and a large update of the available metazoa data. We also said bye-bye to clone-based gene names and welcomed the new Ensembl Canonical transcripts.


The upcoming Ensembl release 104 will bring an update to the Ensembl Canonical transcripts. The new Ensembl Canonical definition will prioritise well-supported, biologically representative, highly expressed and highly conserved transcripts. MANE Select will be used as the canonical transcript for human protein-coding genes where available.
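The "MANE Select where available, otherwise the best-supported transcript" rule can be sketched in a few lines of Python. This is only an illustration: the `Transcript` class, the `is_mane_select` flag and the single numeric `score` are hypothetical stand-ins, and the real Ensembl Canonical pipeline weighs support, expression and conservation in ways not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transcript:
    transcript_id: str    # hypothetical identifier, for illustration only
    is_mane_select: bool  # True if this transcript is the gene's MANE Select
    score: float          # hypothetical combined support/expression/conservation score


def pick_canonical(transcripts: List[Transcript]) -> Transcript:
    """Return the MANE Select transcript if the gene has one, else the top-scoring one."""
    if not transcripts:
        raise ValueError("a gene needs at least one transcript")
    for transcript in transcripts:
        if transcript.is_mane_select:
            return transcript
    # Fallback for genes without a MANE Select transcript.
    return max(transcripts, key=lambda t: t.score)


# Toy gene with a MANE Select transcript: the flag wins over a higher score.
gene_a = [Transcript("tx1", False, 0.9), Transcript("tx2", True, 0.7)]
# Toy gene without one: the highest hypothetical score wins.
gene_b = [Transcript("tx3", False, 0.4), Transcript("tx4", False, 0.8)]

canonical_a = pick_canonical(gene_a)
canonical_b = pick_canonical(gene_b)
```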


NCBI and EBI have been hard at work on our joint MANE collaboration, which aims to provide a set of representative transcripts for human protein-coding genes that are identically annotated in the NCBI RefSeq and Ensembl/GENCODE annotation sets and exactly match the GRCh38 reference assembly. We released MANE v0.5 in December 2018, which included one well-supported MANE Select transcript for 53% of human protein-coding genes. The remainder has required a lot more analysis and curation than we expected, but we’re pleased to announce MANE v0.92, which now covers 16,865 genes, or ~88% of known protein-coding genes. We’ve been focussing on clinically relevant genes, and MANE Select now includes 99% of genes with high gene-disease validity. This release also includes 43 extra transcripts labeled “MANE Plus Clinical” that have been chosen to aid clinical reporting, for instance when there are additional pathogenic or likely pathogenic variants not covered by the MANE Select transcript. In genes where there are mutually exclusive exons and both exons have clinically relevant variants, a MANE Plus Clinical transcript is added alongside the MANE Select transcript so that both exons are represented in MANE.

An image illustrating the mutually exclusive exons in SCN5A.
The gene SCN5A, a sodium voltage-gated channel known to be involved in a number of disorders, illustrates the need for the MANE Plus Clinical set. This gene produces multiple alternatively spliced transcripts that contain mutually exclusive exons. Since clinically relevant variants have been mapped to both exons, it is not possible to report all known pathogenic variants associated with this gene using a single transcript.
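The coverage argument can be made concrete with a small Python sketch. Everything in it is an illustrative assumption: the transcript names, the exon labels and the greedy selection are invented for the example and are not the curation procedure the MANE team uses. It only shows why one transcript cannot cover two mutually exclusive exons, and how an additional transcript closes the gap.

```python
from typing import Dict, List, Set


def uncovered_exons(transcript_exons: Set[str], clinical_exons: Set[str]) -> Set[str]:
    """Clinically relevant exons that the given transcript does not contain."""
    return clinical_exons - transcript_exons


def pick_plus_clinical(
    select_id: str,
    transcripts: Dict[str, Set[str]],
    clinical_exons: Set[str],
) -> List[str]:
    """Greedily add transcripts until every clinically relevant exon is represented."""
    remaining = uncovered_exons(transcripts[select_id], clinical_exons)
    candidates = {tid: exons for tid, exons in transcripts.items() if tid != select_id}
    chosen: List[str] = []
    while remaining and candidates:
        # sorted() makes tie-breaking deterministic (alphabetical by id).
        best = max(sorted(candidates), key=lambda tid: len(candidates[tid] & remaining))
        if not candidates[best] & remaining:
            break  # no candidate covers any of the remaining exons
        chosen.append(best)
        remaining -= candidates.pop(best)
    return chosen


# Toy gene: exons "e6A" and "e6B" are mutually exclusive (hypothetical labels).
toy_transcripts = {
    "tx_select": {"e5", "e6B", "e7"},
    "tx_alt": {"e5", "e6A", "e7"},
}
toy_clinical = {"e6A", "e6B"}

missing = uncovered_exons(toy_transcripts["tx_select"], toy_clinical)
extra = pick_plus_clinical("tx_select", toy_transcripts, toy_clinical)
```

Here `missing` contains the one exon that the toy Select transcript cannot report on, and `extra` names the single additional transcript needed to represent it.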

While it’s critical to consider other alternatively spliced transcripts for variant interpretation or functional analyses, the MANE Select and MANE Plus Clinical transcripts provide a common foundation for clinical reporting and for other analyses that benefit from using just one well-supported transcript or protein per gene.

MANE Select is now shown in the Genome Aggregation Database (gnomAD) v3, is displayed and used as the preferred transcript for variant reporting in ClinVar, and is displayed in DECIPHER. We have released this data as a trackhub for display in the Ensembl, NCBI and UCSC genome browsers. MANE Select v0.92 transcripts will be available in Ensembl release 103, due in spring 2021, and will be included in BioMart and VEP.

Partnership with the community is really important to us. We value your feedback. If you are interested in working with us, please contact us at MANE-help@ebi.ac.uk.

The RefSeq column on our gene pages has changed.

We’re moving towards a more unified gene set with RefSeq, with biologically important transcripts being highlighted as MANE. This means that displays you’re used to seeing will be updated to reflect these changes, which may affect the way you have been working with Ensembl.


Joannella Morales, Jane Loveland and Adam Frankish contributed to this post.

Back in October, we introduced you to our new joint initiative with the NCBI — the Matched Annotation from the NCBI and EMBL-EBI (MANE) transcript set. We are now pleased to update you on our progress so far.

The goal of this project is to share annotation and converge on a high-confidence, genome-wide transcript set, with a matched transcript in both RefSeq and Ensembl/GENCODE. We are doing this in two phases. During phase 1, we will release the “MANE Select” transcript set to include one well-supported transcript for every protein-coding locus. We envision the adoption of the MANE Select set as a default set across genomics resources. In phase 2, we intend to release an expanded set (“MANE Plus”) to include additional transcripts per locus that are well-supported or of particular user interest.
