Ensembl 88 is now live. Read on to find out what’s new, and join us for a tour during our release webinar at 16.00 BST on Tuesday 4 April.
Ensembl 88 updates our gene annotations, variation databases, comparative genomics analyses and web interface. Highlights include:
Updated assemblies, gene sets and annotations:
e!88 includes the most recent human reference genome assembly version, GRCh38.p10. In addition, the Ensembl-Havana GENCODE gene set (release 26) has been updated to include all CCDS genes, and cDNA alignments and RefSeq mappings have been updated.
The mouse Ensembl-Havana GENCODE gene set now includes all CCDS genes, and mouse cDNA alignments and RefSeq mappings have been updated. The rat gene set has also been updated to include the latest manual annotations from Havana. The pika assembly has been renamed OchPri2.0-Ens to reflect differences between the Ensembl-annotated version and OchPri2.0.
New and updated REST endpoints:
A few new endpoints have been added to our REST API, including a phenotype endpoint that will return genes, variants and QTLs with phenotype associations in a defined genomic region; and comparative genomics endpoints to return family and phylogenetic tree data.
Existing endpoints have also been updated: the LD/ID and LD/region endpoints now require a population name, and the eQTL REST endpoints now access data from GTEx v6.
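As a rough illustration of how the new phenotype endpoint and the updated LD/ID endpoint might be called from Python, here is a minimal sketch. The server address, the URL path layouts, and the example region, variant ID and population name are all assumptions made for illustration; they are not taken from this post, so please check the REST API documentation for the exact paths and accepted values.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed address of the public REST server.
SERVER = "https://rest.ensembl.org"


def phenotype_region_url(species, region):
    """Build a URL for the phenotype-by-region endpoint (assumed path layout)."""
    return f"{SERVER}/phenotype/region/{quote(species)}/{quote(region, safe=':')}"


def ld_variant_url(species, variant_id, population):
    """Build a URL for the LD/ID endpoint; the population name is now required."""
    return f"{SERVER}/ld/{quote(species)}/{quote(variant_id)}/{quote(population, safe=':')}"


def fetch_json(url):
    """Request a URL from the REST server and decode the JSON response."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical example values: a region on human chromosome 9, and a
    # variant with a 1000 Genomes phase 3 population.
    print(phenotype_region_url("homo_sapiens", "9:22125500-22136000"))
    print(ld_variant_url("homo_sapiens", "rs1042779", "1000GENOMES:phase_3:KHV"))
```

Passing either URL to `fetch_json` would perform the actual request. Keeping URL construction separate from fetching makes it easy to see that the LD call cannot be built without a population name.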
New variation and phenotype data:
Our variant and phenotype databases have been brought up to date with new data from several sources:
- dbSNP: dbSNP data have been updated to version 149 for human, rat, platypus and opossum.
- Structural variation: new data have been imported from DGVa, and existing data have been updated, for human, pig and sheep.
- COSMIC: COSMIC data have been updated to version 79 for human.
- PolyPhen: new pathogenicity predictions are available for human, from PolyPhen 2.2.2r405c.
New comparative genomics analyses:
- Protein families: HMM-based protein family predictions have been updated for all species. For human, transcript isoforms encoded by non-reference haplotype sequence have been included for the first time.
- Phylogenetic trees: Protein and ncRNA trees have been updated for all species, followed by updates to homology predictions.
- lincRNA models have been included for mouse lemur.
- BioMart Genes, Variation, Regulation, Vega and Mouse Genes databases have been updated; BioMart Genes, Variation, Regulation and Vega databases have been updated on the Ensembl GRCh37 site.
Our release webinar is scheduled to take place on 4 April 2017, at 16.00 BST. Join us, and ask your own questions, by registering at this link.