Coming soon! MANE Select v0.5

Joannella Morales, Jane Loveland and Adam Frankish contributed to this post.

Back in October, we introduced you to our new joint initiative with the NCBI — the Matched Annotation from the NCBI and EMBL-EBI (MANE) transcript set. We are now pleased to update you on our progress so far.

The goal of this project is to share annotation and converge on a high-confidence, genome-wide transcript set, with a matched transcript in both RefSeq and Ensembl/GENCODE. We are doing this in two phases. During phase 1, we will release the “MANE Select” transcript set to include one well-supported transcript for every protein-coding locus. We envision the adoption of the MANE Select set as a default set across genomics resources. In phase 2, we intend to release an expanded set (“MANE Plus”) to include additional transcripts per locus that are well-supported or of particular user interest.

What have we accomplished?

Currently, we have defined MANE Select transcripts for 53% of human protein-coding genes (v0.5). We have released this data on the NCBI’s FTP site and as a trackhub for display in the Ensembl, NCBI and UCSC genome browsers. Every transcript in the MANE Select set aligns perfectly to the GRCh38 reference assembly, and the Ensembl (ENST) transcript and its corresponding RefSeq (NM) transcript are 100% identical across the transcription start site (TSS), 5’UTR, CDS, 3’UTR and 3’ end.
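To make the matching criterion concrete, here is a minimal Python sketch of what "100% identity" between an ENST and an NM transcript could look like when both are described by genomic coordinates on GRCh38. This is an illustration only, not the pipeline used by either group: the `Transcript` class, its fields, the `is_perfect_match` function and the placeholder accessions are all hypothetical. It assumes that two transcripts on the same assembly with identical exon boundaries and identical CDS boundaries necessarily share the same TSS, UTRs, CDS and 3’ end.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical, simplified transcript model (not an Ensembl or RefSeq API)."""
    accession: str                       # placeholder for an ENST or NM identifier
    chrom: str
    strand: str                          # "+" or "-"
    exons: Tuple[Tuple[int, int], ...]   # genomic (start, end) of each exon
    cds_start: int                       # lowest genomic coordinate of the CDS
    cds_end: int                         # highest genomic coordinate of the CDS


def structure(t: Transcript) -> tuple:
    """Return everything that must agree, ignoring the accession and exon order."""
    return (t.chrom, t.strand, tuple(sorted(t.exons)), t.cds_start, t.cds_end)


def is_perfect_match(a: Transcript, b: Transcript) -> bool:
    """True when both transcripts have identical exon and CDS boundaries.

    Identical outer exon boundaries imply the same TSS and 3' end; identical
    CDS boundaries within identical exons imply the same 5'UTR, CDS and 3'UTR.
    """
    return structure(a) == structure(b)


# Made-up coordinates, for illustration only.
ensembl = Transcript("ENST_example", "chr1", "+", ((100, 200), (300, 400)), 150, 350)
refseq = Transcript("NM_example", "chr1", "+", ((100, 200), (300, 400)), 150, 350)
shifted = Transcript("NM_other", "chr1", "+", ((90, 200), (300, 400)), 150, 350)

print(is_perfect_match(ensembl, refseq))   # same structure throughout
print(is_perfect_match(ensembl, shifted))  # first exon starts 10 bp earlier
```

In this sketch, a difference of even a single base at the start of the first exon is enough to reject the pair, which mirrors the strictness of the criterion described above.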

Coming soon! MANE Select v0.5 transcripts will be available in Ensembl release 96, due in the Spring.

MANE Select v0.5 trackhub displayed in Ensembl browser

What’s next?

We are now working towards genome-wide coverage of the MANE Select set.

Transcripts in the set are identified using independent computational methods, complemented by manual review and discussion. We use evidence of functional potential such as expression levels, evolutionary conservation and clinical significance. We are currently improving our respective methods to increase convergence, and expect to release a significantly extended set of transcripts by the Autumn of 2019. We are also working on defining criteria for inclusion of transcripts in the MANE Plus set.

This figure displays the relationship between MANE Select and MANE Plus using a hypothetical example. The MANE Select transcript initiates from the strongest promoter (Promoter 1), and expression data show that it is the most representative transcript at the locus. However, the additional transcript included in the MANE Plus set is also well-supported, has an additional, conserved protein-coding exon and covers pathogenic variants not covered by the MANE Select transcript.

Want to learn more?

Now available! Take a look at our recorded webinar on YouTube to find out more about our aims and methodology. EMBL-EBI and the NCBI jointly hosted the webinar, and members of both institutes presented our work. Partnership with the community is really important to us, and we value your feedback. If you are interested in working with us, please contact us at