NCBI and EBI have been hard at work on our joint MANE collaboration, which aims to provide a set of representative transcripts for human protein-coding genes that are identically annotated in the NCBI RefSeq and Ensembl/GENCODE annotation sets and exactly match the GRCh38 reference assembly. We released MANE v0.5 in December 2018, which included one well-supported MANE Select transcript for 53% of human protein-coding genes. The remaining genes have required a lot more analysis and curation than we expected, but we’re pleased to announce MANE v0.92, which now covers 16,865 genes, or ~88% of known protein-coding genes. We’ve been focussing on clinically relevant genes, and MANE Select now includes 99% of genes with high gene-disease validity.

This release also includes 43 extra transcripts labeled “MANE Plus Clinical” that have been chosen to aid clinical reporting where there are pathogenic or likely pathogenic variants not covered by the MANE Select transcript. For example, in genes with mutually exclusive exons where both exons have clinically relevant variants, a MANE Plus Clinical transcript is added alongside the MANE Select transcript so that both exons are represented in MANE.

The gene SCN5A, a sodium voltage-gated channel known to be involved in a number of disorders, illustrates the need for the MANE Plus Clinical set. This gene produces multiple alternatively spliced transcripts that contain mutually exclusive exons. Since clinically relevant variants have been mapped to both exons, it is not possible to report all known pathogenic variants associated with this gene using a single transcript.
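The point about mutually exclusive exons can be made concrete with a small sketch. The coordinates below are made-up toy values, not real SCN5A annotation, and `missed_variants` is a hypothetical helper written for this illustration. It lists the variant positions that fall outside the exons of whichever transcripts are used for reporting.

```python
from typing import Iterable, List, Sequence, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive; toy coordinates only


def missed_variants(
    variants: Iterable[int], transcripts: Sequence[Sequence[Exon]]
) -> List[int]:
    """Return variant positions not covered by any exon of the given transcripts."""

    def is_covered(pos: int) -> bool:
        return any(
            start <= pos <= end for exons in transcripts for start, end in exons
        )

    return sorted(pos for pos in variants if not is_covered(pos))


# Two toy transcripts that share the first and last exon but use
# mutually exclusive middle exons (300-400 versus 500-600).
TX_WITH_EXON_A = [(100, 200), (300, 400), (700, 800)]
TX_WITH_EXON_B = [(100, 200), (500, 600), (700, 800)]

# Toy variant positions: one in the shared exon, one in each exclusive exon.
VARIANTS = [150, 350, 550]

if __name__ == "__main__":
    print(missed_variants(VARIANTS, [TX_WITH_EXON_A]))  # misses the exon B variant
    print(missed_variants(VARIANTS, [TX_WITH_EXON_B]))  # misses the exon A variant
    print(missed_variants(VARIANTS, [TX_WITH_EXON_A, TX_WITH_EXON_B]))  # none missed
```

In this toy setup, each transcript on its own leaves one variant unreported, while the pair covers all three. That is the situation in which a MANE Plus Clinical transcript is added alongside the MANE Select transcript.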

While it’s critical to consider other alternatively spliced transcripts for variant interpretation or functional analyses, the MANE Select and MANE Plus Clinical transcripts provide a common foundation for clinical reporting and for other analyses that benefit from using just one well-supported transcript or protein per gene.

MANE Select is now shown in the Genome Aggregation Database (gnomAD) v3, is displayed and used as the preferred transcript for variant reporting in ClinVar, and is displayed in DECIPHER. We have released this data as a track hub for display in the Ensembl, NCBI and UCSC genome browsers. MANE Select v0.92 transcripts will be available in Ensembl release 103, due in spring 2021, and will be included in BioMart and VEP.

The RefSeq column on our gene pages has changed.

We’re moving towards a more unified gene set with RefSeq, with biologically important transcripts highlighted as MANE. This means that displays you’re used to seeing will be updated to reflect these changes, which may affect the way you work with Ensembl.

Back in October, we introduced you to our new joint initiative with the NCBI — the Matched Annotation from the NCBI and EMBL-EBI (MANE) transcript set. We are now pleased to update you on our progress so far.

The goal of this project is to share annotation and converge on a high-confidence, genome-wide transcript set, with a matched transcript in both RefSeq and Ensembl/GENCODE. We are doing this in two phases. During phase 1, we will release the “MANE Select” transcript set to include one well-supported transcript for every protein-coding locus. We envision the adoption of the MANE Select set as a default set across genomics resources. In phase 2, we intend to release an expanded set (“MANE Plus”) to include additional transcripts per locus that are well-supported or of particular user interest.

We are pleased to introduce the Matched Annotation from the NCBI and EMBL-EBI (MANE) project. This new joint initiative between EMBL-EBI’s Ensembl project and NCBI’s RefSeq project aims to release a genome-wide transcript set that contains one well-supported transcript per protein-coding locus. All transcripts in the MANE set will align perfectly to GRCh38, and each RefSeq (NM) transcript will be 100% identical to its corresponding Ensembl (ENST) transcript across the 5’ UTR, coding sequence and 3’ UTR.
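As a rough illustration of what this matching criterion means in practice, here is a minimal sketch. It assumes each transcript is described only by its genomic structure on GRCh38. If both models align perfectly to the same assembly, identical exon and CDS coordinates imply identical 5’ UTR, coding and 3’ UTR sequence. The `TranscriptModel` class, the placeholder accessions and the coordinates are all hypothetical, and this is not the pipeline the MANE project itself uses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TranscriptModel:
    """Hypothetical, simplified transcript model on one reference assembly."""

    accession: str                      # placeholder ID, not a real accession
    chrom: str
    strand: str                         # "+" or "-"
    exons: Tuple[Tuple[int, int], ...]  # (start, end) pairs, inclusive
    cds_start: int                      # genomic position of first coding base
    cds_end: int                        # genomic position of last coding base


def is_matched_pair(a: TranscriptModel, b: TranscriptModel) -> bool:
    """True if both models have the same exons and CDS boundaries.

    Identical exons fix the transcript ends and splice sites; identical CDS
    boundaries then fix where the 5' UTR, CDS and 3' UTR begin and end.
    """
    return (a.chrom, a.strand, a.exons, a.cds_start, a.cds_end) == (
        b.chrom, b.strand, b.exons, b.cds_start, b.cds_end
    )


# Toy models with made-up coordinates.
refseq_model = TranscriptModel(
    "NM_EXAMPLE", "chr1", "+", ((1000, 1200), (1500, 1700), (2000, 2400)), 1100, 2100
)
ensembl_model = TranscriptModel(
    "ENST_EXAMPLE", "chr1", "+", ((1000, 1200), (1500, 1700), (2000, 2400)), 1100, 2100
)
# Same CDS, but the 3' UTR ends 50 bases later: not a match.
longer_utr_model = TranscriptModel(
    "ENST_EXAMPLE_2", "chr1", "+", ((1000, 1200), (1500, 1700), (2000, 2450)), 1100, 2100
)

if __name__ == "__main__":
    print(is_matched_pair(refseq_model, ensembl_model))
    print(is_matched_pair(refseq_model, longer_utr_model))
```

The second comparison shows why UTRs matter here: two models can encode the same protein and still fail the criterion if a UTR boundary differs by even one base.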
