What’s New in e87:

Updated assemblies, gene sets and annotations

In Ensembl 87, there are a number of updates to the assemblies and gene sets for several species:

  • Human: updated cDNA alignments and RefSeq import
  • Mouse: updated gene set and assembly, see below
  • Zebrafish: updated gene set
  • Chicken: updated gene set

Updated gene models for mouse olfactory receptors

e87 includes an updated Ensembl-Havana mouse gene set, a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

This latest Havana gene annotation includes improved gene models for the mouse olfactory receptors. Over 2 Mbp of additional sequence has been added to the mouse olfactory genes to create several hundred multi-exonic models. These new models are based on RNA-seq data from Ibarra-Soria X et al.

The mouse assembly has been updated to GRCm38.p5. The patches for GRCm38.p5 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence.

New lincRNA data

New regulation summary activity table

Due to the high number of epigenomes now available in the Human Regulatory Build, we can no longer show them all by default on the Regulation Summary image, in the Regulation Tab. We have therefore added a table listing the cell types by their regulatory feature activity.

Other News

  • DGVa structural variant study updates for Human, Cow and Macaque
  • dbSNP updates for Sheep
  • Cosmic version 78 imported for human
  • Phenotype data updates for several species

A complete list of the changes can be found on the Ensembl website

Find out more about the new release and ask the team questions, in our free webinar: Wednesday 14th December, 4pm GMT. Register here.

Due to essential maintenance, the Ensembl helpdesk email will be shut down for approximately 48 hours, beginning at 9 am (GMT) on 25th November. Any emails sent during this time will be held in a queue, and we will respond to them when the system is up and running again, although there may be some delay. You will not receive any confirmation of your email, as this is automatically generated by the system.

You can also post queries to the Ensembl dev list and BioStars (please add “Ensembl” as a tag).

We will update you when the system is back.

We apologise for any inconvenience this may cause.

What’s new?

Ensembl Plants takes centre stage in the release of Ensembl Genomes 33, with a variety of new data available for a number of different species:

  • Incorporation of the Araport 11 gene model annotation for Arabidopsis thaliana
  • Addition of mitochondrial and plastid genome sequences to the current maize (Zea mays) chromosomal assembly (AGPv4)
  • Alignment between the A, B and D genomes of bread wheat (Triticum aestivum) updated to use TGACv1 genome assemblies
  • Whole genome alignment between bread wheat and Brachypodium distachyon

Other News

You can find more details in the release notes.

UPDATE: Postponed until further notice

We are postponing this maintenance until the beginning of November for technical reasons. Specific dates will be released as they are finalised.

 

Due to essential maintenance, the Ensembl helpdesk email will be shut down for approximately 48 hours, beginning at 9 am (BST) on 24th October. Any emails sent during this time will be held in a queue, and we will respond to them when the system is up and running again, although there may be some delay. You will not receive any confirmation of your email, as this is automatically generated by the system.

You can also post queries to the Ensembl dev list and BioStars (please add “Ensembl” as a tag).

We will update you when the system is back.

We apologise for any inconvenience this may cause.

What’s New in e86:

Mouse strain genomes

In Ensembl 86, you can now view the annotated genome assemblies, variation data and comparative analyses of 16 different mouse strains, produced by the Mouse Genomes Project. While the GRCm38 assembly (produced from Mus musculus strain C57BL/6J) remains the reference assembly, variants and comparative analyses for the other strains can be viewed through the Gene tab and the Location tab.

You can find the gene trees and orthologue/paralogue predictions for the mouse strains through the Strains option in the menu in the Gene tab. The mouse strain gene tree depicts the evolutionary history of genes (left) and protein alignment (right) for the individual mouse strains and rat.

You can find the variants between these mouse strains through the Strain table option in the menu in the Location tab. The strain table displays the alleles identified at variant positions across the 16 mouse strains.

Updated assemblies, gene sets and annotations

In Ensembl 86, there are also updates to the assemblies and gene sets for several species:

  • Human: updated cDNA alignments and RefSeq import
  • Mouse: updated cDNA alignments and RefSeq import
  • Zebrafish: updated gene set and RefSeq import
  • Chicken: updated to the Galgal_5.0 assembly
  • Mouse lemur: updated to the Mmur_2.0 assembly
  • Macaque: updated to the Mmul_8.0.1 assembly

New lincRNA data

New Mobile Site Views

As of release 86, you can view transcripts on the mobile version of Ensembl. You can also view exon sequence, cDNA sequence and protein sequence by clicking on the left-hand arrow.

The gene sequence is also now available to view on mobile devices. Go to any gene page, click on the left-hand arrow and then choose sequence.

Other News

  • Variation and phenotype databases updated
  • You can now select ‘Manhattan plot’ as an option when configuring bigWig files

A complete list of the changes can be found on the Ensembl website

Find out more about the new release and ask the team questions, in our free webinar: Tuesday 11th October, 4pm BST. Register here.

Ensembl 86 is scheduled for September 2016. Highlights include:

New mouse strains

  • Annotated genome assemblies, variation data and comparative analyses of 16 different mouse strains will be included in Ensembl 86.

Updated assemblies, gene sets and annotations

  • Human: updated cDNA alignments and RefSeq import
  • Mouse: updated cDNA alignments and RefSeq import
  • Zebrafish: updated gene set and RefSeq import
  • Chicken: updating to the Galgal_5.0 assembly
  • Mouse lemur: updating to the Mmur_2.0 assembly
  • Macaque: updating to the Mmul_8.0.1 assembly

New lincRNA data

New GRCh37 tools converted from 1000 Genomes Project

A number of tools previously developed for use in the 1000 Genomes Project browser have now been converted for use with the GRCh37 assembly in Ensembl:

  • Dataslicer tool: allows you to get a subset of data from a BAM or VCF file.
  • Variation pattern finder tool: allows you to identify variation patterns in a chromosomal region of interest for different individuals.
  • Forge analysis tool: takes a list of variants and analyses their enrichment in functional regions from the ENCODE or Roadmap Epigenomics project on a tissue-specific basis.
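
To give a feel for the kind of subsetting the Dataslicer tool offers, here is a minimal Python sketch that keeps only the records of an uncompressed VCF that fall inside a chromosomal region. It is not the tool's implementation: the function name `slice_vcf_lines` and the example records are hypothetical, and real VCF files are usually compressed and indexed rather than read line by line.

```python
from typing import Iterable, Iterator


def slice_vcf_lines(lines: Iterable[str], chrom: str, start: int, end: int) -> Iterator[str]:
    """Yield header lines plus records on `chrom` with start <= POS <= end (1-based, inclusive)."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Keep meta-information and the column header so the output is still a VCF.
            yield line
            continue
        fields = line.split("\t")
        # Column 1 is CHROM and column 2 is POS in the VCF format.
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            yield line


# Hypothetical records, for illustration only.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trsA\tA\tG\t.\tPASS\t.",
    "1\t5000\trsB\tC\tT\t.\tPASS\t.",
    "2\t1500\trsC\tG\tA\t.\tPASS\t.",
]

subset = list(slice_vcf_lines(EXAMPLE_VCF, "1", 500, 2000))
```

Here `subset` holds the two header lines and the single record on chromosome 1 between positions 500 and 2000.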

Other updates and highlights

  • Variation and phenotype database updates

For more details on the declared intentions, please visit our Ensembl admin site. Please note that these are intentions and are not guaranteed to make it into the release.

As part of Ensembl 85, we are excited to introduce expression quantitative trait loci (eQTL) data, through our partnership with the Genotype-Tissue Expression (GTEx) project.

The GTEx project has the goal of identifying the influence of genetics on tissue-specific gene expression, i.e. to map correlations between genotype (SNPs) and gene expression levels (RNA-seq). eQTLs are variants which are found to be significantly correlated with differences in gene expression. Though this type of data is still in its infancy, we hope that in time it will allow us to conclusively determine the link between regulatory features and their gene targets.

Thanks to our use of HDF5 technology, we offer the only rapid look-up service across all GTEx SNP-gene association tests. We have included all of the correlated variants, including those that fall short of the significance threshold. The GTEx V6 dataset represents 7051 tissue samples from 44 tissues of 449 donors, and a total of 6 billion data points.

 

GTEx eQTLs in the Ensembl Browser

To view GTEx eQTL data for any gene, navigate to the gene tab and select ‘regulation’ in the left panel. The display will show one example track of GTEx data for a single tissue. Configuring the page allows you to add more GTEx tracks for each tissue type, by selecting ‘other regulatory regions’ and choosing the tissues you are interested in.

The SNPs are displayed in a Manhattan plot on these tracks, and are coloured according to their consequences on the transcript, as determined by the VEP. Clicking on any of the variants will display correlation statistics and a link to the variant tab. Where the SNPs are clustered, clicking will bring up a list of all variants nearby.
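
A Manhattan plot conventionally places each variant at its genomic position on the x-axis and the negative base-10 logarithm of its p-value on the y-axis, so the most significant associations appear as the tallest points. The Python sketch below shows that transformation only. The function name `manhattan_points` and the example values are hypothetical and are not taken from GTEx or from the browser's own code.

```python
import math
from typing import List, Tuple


def manhattan_points(associations: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Map (position, p_value) pairs to (position, -log10(p_value)), sorted by position."""
    points = []
    for position, p_value in associations:
        if not 0.0 < p_value <= 1.0:
            raise ValueError(f"p-value out of range: {p_value}")
        points.append((position, -math.log10(p_value)))
    return sorted(points)


# Hypothetical (position, p-value) pairs, for illustration only.
example = [(2000, 1e-8), (1000, 0.05), (1500, 1.0)]
points = manhattan_points(example)
```

A p-value of 1 maps to a height of 0, while very small p-values map to large heights, which is why variants that fall short of the significance threshold still appear near the bottom of the track.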

 

GTEx eQTLs via REST API

We have also provided Ensembl REST API endpoints to access these data. Currently, these methods allow you to quickly find the beta correlations and their p-values filtered by gene, SNP and/or tissue. You can also list all the tissue types that are currently available on our server.
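
As a rough illustration of how such a query might be assembled, the Python sketch below builds a request URL for the eQTL data of one gene and defines a small helper to fetch JSON. The endpoint path (`/eqtl/id/:species/:stable_id`), the parameter names (`tissue`, `variant_name`, `statistic`), the server address and the example gene ID and tissue name are all assumptions made for this sketch; please check the REST API documentation for the endpoints and parameters that are actually available.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Optional

SERVER = "https://rest.ensembl.org"  # assumed address of the public REST server


def eqtl_gene_url(species: str, gene_id: str, tissue: Optional[str] = None,
                  variant_name: Optional[str] = None,
                  statistic: Optional[str] = None) -> str:
    """Build a URL for eQTL statistics of one gene (path and parameters are assumed)."""
    path = f"/eqtl/id/{urllib.parse.quote(species)}/{urllib.parse.quote(gene_id)}"
    optional = {"tissue": tissue, "variant_name": variant_name, "statistic": statistic}
    # Only send the filters that were actually supplied.
    query = urllib.parse.urlencode({k: v for k, v in optional.items() if v is not None})
    return SERVER + path + ("?" + query if query else "")


def fetch_json(url: str, timeout: float = 30.0) -> Any:
    """Send a GET request asking for JSON and return the decoded response."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example gene ID and tissue name are placeholders for illustration.
url = eqtl_gene_url("homo_sapiens", "ENSG00000227232",
                    tissue="Whole_Blood", statistic="p-value")
```

Calling `fetch_json(url)` would then send the request. Building the URL separately from fetching keeps the filtering logic easy to inspect without a network connection.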

 

What’s next?

Currently we are displaying the variants around a gene and their correlation with its expression level. In our next release (e86), on the Variant view, we will display all the genes whose expression levels are correlated with that variant. We will also display the beta effect sizes on the Manhattan plots.
If you have any feedback or questions relating to eQTLs in Ensembl, please contact the helpdesk.

What’s new?

Ensembl Plants now has an archive site, where we will keep selected previous releases of Ensembl Plants publicly available. The first release available on the archive site is release 31, and includes the previous assemblies for wheat and maize.

New assemblies in Ensembl Plants include:

  • A new assembly of the bread wheat genome (TGACv1). The assembly has a scaffold N50 of 88 Kbp and a total length of 13.4 Gbp in contigs greater than 500 bp. Approximately 99,000 genes (99% of the total) annotated on the previous IWGSC Chromosome Survey Sequence Assembly have been mapped to the new assembly.
  • An updated assembly of the Zea mays genome (AGPv4)
  • Genome assemblies for 5 new species, including Beta vulgaris (sugar beet), Brassica napus (rapeseed) and Trifolium pratense (red clover)
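
For readers unfamiliar with the N50 statistic quoted for the wheat assembly: it is the length such that scaffolds of that length or longer contain at least half of the assembled bases. The Python sketch below computes it for a toy set of scaffold lengths. The function name `n50` and the example lengths are hypothetical and have no connection to the TGACv1 data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest scaffold down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: the running total always reaches half")


# Toy lengths: total 290, so half is 145; 80 + 70 = 150 crosses it at length 70.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

A higher N50 generally indicates a more contiguous assembly, although it says nothing about correctness on its own.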

 

Ensembl Metazoa: Rfam covariance models have been applied to all metazoan genomes, and are shown in the ‘Rfam models’ track in the genome browser. Click on a model to see the description and the secondary structure.

Ensembl Bacteria now includes the latest versions of 41,610 genomes (41,198 bacteria and 412 archaea) from the INSDC archives. In this release we added 2,269 new genomes, 15 genomes with updated assemblies, 212 genomes with updated annotation, 906 genomes where the assigned name has changed, and 243 genomes removed since the last release.

Ensembl Fungi has been updated with 47 newly available genomes and now includes 634 genomes from 388 species. PHI-base references have been added where available, as have non-coding RNA matches to Rfam.

Ensembl Protists has gained 25 new genomes and now includes 178 genomes from 114 species.

You can find more details in the release notes.