What’s new in Ensembl Genomes 31?

There are legs and tentacles everywhere in this release of Ensembl Metazoa, as ten new species scuttle, swim and slither into our databases. From the Antarctic midge to the California two-spot octopus, the new species illustrate the diversity of metazoa. Our new Metazoan species also include dog and rat parasites (the itch mite and a nematode), as well as species that pose significant problems for agriculture (Australian sheep blowfly) and aquaculture (the salmon louse and a myxosporean). The common bumblebee is an important pollinator, a brachiopod represents a new phylum in Ensembl Metazoa, while the African social velvet spider is a fascinating model of sociality and is the first spider in Ensembl Genomes.

Belgica antarctica, Bombus impatiens, Lingula anatina, Lucilia cuprina, Octopus bimaculoides, Sarcoptes scabiei, Stegodyphus mimosarum, Strongyloides ratti, Lepeophtheirus salmonis, Thelohanellus kitauei

Not to be outdone, Ensembl Protists is now updated to 158 genomes from 104 species and Ensembl Bacteria has been updated to include the latest versions of 39,584 genomes (39,183 bacteria and 401 archaea) from the INSDC archives.

Other news

Fungi: Updated annotations based on PHI-base 4.0 have been included. New variation data for Schizosaccharomyces pombe.

Protists: Addition of 4 protist species for pan-taxonomic comparative analysis (Monosiga brevicollis, Thecamonas trahens, Cryptomonas paramecium and Chondrus crispus), meaning that Ensembl Compara now includes protists from all the major Eukaryotic clades.

Plants: There are now 350,000 new rice variations across 3,000 rice accessions from 89 different countries as well as track hubs for more than 900 public RNA-Seq studies, totalling more than 16,000 tracks across 35 different plant species.

Metazoa: Updated gene sets for the leaf cutter ant, red fire ant and the two-spotted spider mite, as well as updated gene sets from VectorBase and WormBase.

Check out all the changes on our Ensembl Genomes website.

Any questions or comments? Email us.

What’s new in e84:

  • Human: Incorporation of BLUEPRINT Epigenome data and methylation data
  • Pairwise Linkage Disequilibrium (LD) calculation on LD variant page
  • Track hub registry interface
  • Transcript haplotype view

Incorporation of BLUEPRINT Epigenome data

BLUEPRINT is a large scale research project aimed at deciphering the epigenome of blood cells. ChIP-seq and DNase hypersensitivity data from the BLUEPRINT project has now been incorporated into Ensembl. All of the cell types analysed in the BLUEPRINT project are listed here. In Ensembl 84, we are including BLUEPRINT data for the following 20 independent cell types, divided based on cell lineage and tissue source:

CD14+ CD16- monocyte from Venous Blood
CD14+ CD16- monocyte from Cord Blood
CD4+ ab T cell from Venous Blood
CD8+ ab T cell from Cord Blood
CM CD4+ ab T cell from Venous Blood
eosinophil from Venous Blood
EPC from Venous Blood
erythroblast from Cord Blood
HUVEC prol from Cord Blood
M0 macrophage from Cord Blood
M0 macrophage from Venous Blood
M1 macrophage from Cord Blood
M1 macrophage from Venous Blood
M2 macrophage from Cord Blood
M2 macrophage from Venous Blood
MSC from Venous Blood
naive B cell from Venous Blood
neutro myelocyte from Bone Marrow
neutrophil from Cord Blood
neutrophil from Venous Blood

This data can be viewed alongside other tracks in Ensembl by using the ‘Configure this Page’ option and selecting your cells of interest.

Pairwise LD calculation

You can now calculate linkage disequilibrium (LD) between any two variants in Ensembl. To calculate the r2 and D’ values between two specific variants, go to the page of the reference variant and enter the ID of the second variant into the LD calculation text box. This feature can be found by clicking on ‘Linkage Disequilibrium’ in the menu on any variant page.
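The same pairwise calculation can also be requested programmatically. Below is a minimal sketch, assuming the public Ensembl REST server at rest.ensembl.org and its `ld/:species/pairwise/:id1/:id2` endpoint; the variant IDs are placeholders, and the field names read from the response (`population_name`, `r2`, `d_prime`) are an assumption about the JSON layout that should be checked against the REST documentation.

```python
from urllib.parse import quote, urlencode

# Assumption: the public Ensembl REST server; swap in another server if needed.
REST_SERVER = "https://rest.ensembl.org"


def pairwise_ld_url(variant1, variant2, species="homo_sapiens", population=None):
    """Build the URL for a pairwise LD request between two variant IDs."""
    path = "/ld/{}/pairwise/{}/{}".format(
        quote(species), quote(variant1), quote(variant2)
    )
    params = {"content-type": "application/json"}
    if population:
        # Optional: restrict the calculation to a single population.
        params["population_name"] = population
    return REST_SERVER + path + "?" + urlencode(params)


def summarise_ld(records):
    """Reduce a decoded JSON response to (population, r2, D') tuples.

    Assumes each record is a dict with 'population_name', 'r2' and 'd_prime'
    keys, with the two statistics encoded as strings.
    """
    return [
        (r["population_name"], float(r["r2"]), float(r["d_prime"]))
        for r in records
    ]


if __name__ == "__main__":
    # rs123 and rs456 are placeholder IDs, not real variants of interest.
    print(pairwise_ld_url("rs123", "rs456"))
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the decoded JSON to `summarise_ld` would give one row per population for which LD could be computed.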

Track Hub registry interface

With the arrival of the new Track Hub Registry, we have added a feature that allows you to search for track hubs of interest and attach them directly to Ensembl. Just click on the ‘Add your data/Manage your data’ button on any Ensembl page, and select ‘Track Hub Registry Search’ from the left-hand menu.

The interface will only search for hubs that have assemblies available for the site you are on; to see the full range of species and assemblies, visit the Track Hub Registry site directly.

Transcript haplotype view

The transcript haplotype view is a new data view that allows you to explore observed transcript sequences that result from variants identified in resequencing data from the 1000 Genomes Project. By clicking on the ‘Haplotypes’ link on any transcript page, you can view protein consequences, population frequencies and protein alignments of all the haplotypes for that particular transcript.

Other news

  • Mouse: update to GENCODE M9 annotation
  • Zebrafish: updated gene set, including manually annotated HAVANA annotation
  • Baboon: lincRNA model update
  • Latest sequence variants from dbSNP build 146 for human, cow and dog
  • Import of COSMIC 75 cancer data
  • New and updated studies from DGVa for several species such as human, mouse, zebrafish, macaque, cow and dog
  • Gene trees: new option to prune by target species/taxon in the REST API
  • Ensembl Families now defined by an HMM library, based upon the Panther database.
  • Alignments in CRAM format
  • DAS support ended
  • Regulatory segments retired from the Ensembl regulation BioMart, but now available in bigbed format through the ftp site
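As an illustration of the gene tree pruning option listed above, here is a minimal sketch that builds a request URL for the REST `genetree/id/:id` endpoint. It assumes the `prune_species` and `prune_taxon` query parameters and the public server at rest.ensembl.org; the tree ID, species names and taxon ID are illustrative values only.

```python
from urllib.parse import urlencode

# Assumption: the public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def pruned_genetree_url(tree_id, prune_species=(), prune_taxon=()):
    """Build a gene tree URL that keeps only the requested species/taxa.

    Each species or taxon is sent as a repeated query parameter.
    """
    params = [("content-type", "application/json")]
    params += [("prune_species", name) for name in prune_species]
    params += [("prune_taxon", str(taxon)) for taxon in prune_taxon]
    return "{}/genetree/id/{}?{}".format(REST_SERVER, tree_id, urlencode(params))


if __name__ == "__main__":
    # Illustrative values: a gene tree stable ID, two species and one taxon ID.
    print(pruned_genetree_url(
        "ENSGT00390000003602",
        prune_species=["human", "cow"],
        prune_taxon=[9526],
    ))
```

Pruning on the server side keeps the returned tree small when only a handful of species are of interest.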

A complete list of the changes can be found on the Ensembl website.

Find out more about the new release, and ask the team questions, in our free webinar. Wednesday 16th March, 4pm GMT. Register here.

Ensembl 84 is scheduled for March 2016 and includes:

Updated gene sets and annotations

  • Human: Incorporation of Blueprint epigenome data and methylation data
  • Mouse: update to GENCODE M9 annotation
  • Zebrafish: updated gene set, including manually annotated HAVANA annotation and RNAseq data update
  • Cow: ncRNA data update and transcriptomic data update
  • Baboon: lincRNA model update

Variation data imports and updates

  • Phenotype data updated for several species including human, mouse, rat, zebrafish and pig
  • Latest sequence variants from dbSNP build 146 for human, cow and dog
  • HGMD data update
  • Import of COSMIC 75 cancer data
  • New and updated studies from DGVa for several species such as human, mouse, zebrafish, macaque, cow and dog

Other highlights and data sets

  • Pairwise LD calculation on LD variant page
  • Alignments in CRAM format
  • Track hub registry interface
  • Gene trees: new option to prune by target species/taxon
  • DAS support ended
  • Regulatory segments retired from the Ensembl regulation BioMart, but now available in bigbed format through the ftp site

For more details on the declared intentions, please visit our Ensembl admin site. Please note that these are intentions and are not guaranteed to make it into the release.

What is new?

  • Molecular and biological information from PHI-base (version 4.0) for thousands of genes in Protists, Bacteria and Fungi that are involved in pathogen-host interactions
  • Small non-coding RNA genes in the diatom Phaeodactylum tricornutum (Rogato et al. 2014)

Small non-coding RNA genes described by Rogato et al. (2014) in the diatom P. tricornutum are now available in Ensembl Protists. Hover over the ncRNA genes track (genes coloured in light purple) for more information.

Other news

  • Improved image export functionality from our websites

Check our ‘New image export option in Ensembl’ post for more details.

The images in the Ensembl Genomes browser websites can be exported in different formats and resolutions. Choose the one that suits you best and click on ‘Download’.

  • Protein domains for Protists, Metazoa, Fungi and Plants recalculated with InterProScan (version 54.0)
  • Updated BioMart: Protists, Metazoa, Fungi and Plants
  • Updated gene trees in Plants

Check out all the changes on our Ensembl Genomes website.

Any questions or comments? Email us.

Would you like to include images from Ensembl during a presentation or in your paper or poster?

We are happy to announce that a new image export option is available in Ensembl 83, which optimises colour and contrast settings for presentation on a projector or in print. You can download images from Ensembl using the ‘Export this Image’ icon, at the top-left of every image. Below is the image download form, showing the new export options.

Image export page

Presentation options
Our new export feature for presentations alters the image to be clearly visible on projectors by:

  • saturating colours to improve contrast in brightly lit environments
  • increasing line breadth for viewing from a distance.

You can see the difference below. On the left is a ‘Standard Web’ exported image. On the right is the same exported image with the ‘Presentation’ feature.

Human chromosome 13 (32315474-32400266): ‘Standard Web’ export (left) and ‘Presentation’ export (right).

Print options
If you’re looking for an image for your paper or poster, try our new print options, labelled ‘Journal/report’ and ‘Poster’. Images exported for print have a higher resolution, producing 2x and 5x enlargements respectively.

Other export options
You can also export the standard web image in PNG or PDF format for use on the web, or SVG format by clicking on the ‘Custom image’ export option.

You can find more information about exporting images by clicking on the information icons in the export menu.

We would love to hear from you if you have used the new image export options for your own work. Image parameters can be tweaked, so we welcome feedback on whether these features suit your needs. Leave your comments below or contact the Ensembl helpdesk.

What’s new in e83:

  • Human: gene set updated to GENCODE 24, and new assembly patches (GRCh38.p5)
  • Manhattan plot track for LD
  • Advanced Filtering and Counts on Variant table
  • Minor allele frequency (MAF) filter on sequence mark-up views

Human gene set update and new assembly patches

The human gene set now corresponds to GENCODE 24 while the assembly has been updated to include new assembly patches for GRCh38.p5.

Manhattan plot track for linkage disequilibrium

This new linkage disequilibrium (LD) track is focused on a variant and displays the linked variants surrounding the focus variant. The track displays a Manhattan plot, using the r2 and D’ values (from 0 to 1) on the Y axis. The new track is accessible on the Variation ‘Linkage disequilibrium’ page, through the links in the new column “LD Manhattan plot”.
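For readers less familiar with the two statistics plotted on the Y axis, here is a minimal sketch of their textbook definitions for two biallelic loci, computed from haplotype and allele frequencies. This is not Ensembl's implementation (which works from genotype data); the function name and inputs are hypothetical and only illustrate why both values fall between 0 and 1.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")

    # D: how far the observed haplotype frequency departs from independence.
    d = p_ab - p_a * p_b

    # D' rescales |D| by the largest value it could take given the allele
    # frequencies, so it always lies between 0 and 1.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    # r2 is the squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    print(ld_statistics(0.4, 0.5, 0.6))
```

Two variants that are always inherited together give D’ = 1 and r2 = 1, while independent variants give 0 for both.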

Advanced Filtering and Counts on Variant table

The functionality of the variant table has been further expanded to allow a wider range of filtering options. Filtering can now be applied by Minor Allele Frequency, SIFT and PolyPhen scores, Clinical Significance, Consequence Type and many other columns, using buttons along the top of the variant table. For many of these filters, useful preset combinations of options are available within the popup, allowing rapid configuration of more complex combinations. In addition, row counts for each consequence type have been re-added to the existing Consequence Type filter. These are displayed in the popup which appears once the filter button has been pressed.

Filtering variants in RYR1 gene (ENSG00000196218).

MAF filter on sequence mark-up views

The variants displayed on all sequence mark-up views can be filtered by minor allele frequency (MAF), allowing you to either show or hide variants according to a range of frequencies (between 0.01% and 10%). This filtering is not on by default; to enable it, go to ‘Configure this page’ on any sequence view page and then choose the value you want from the ‘Hide variants by frequency (MAF)’ drop-down menu.

Consequence filter for MAF.

Improving the image export and Ensembl mobile website

There are many updates to these features, which will be described in detail in separate blog posts. Look out for these blogs! Below is what the new image export wizard looks like.

Image export window

Other news

  • Mouse: updated to GENCODE M8 annotation
  • Rat: updated gene set, including manually annotated HAVANA annotation
  • Annotations now available in RDF format for all species on our FTP site
  • New human phenotype association data from Cancer Gene Census
  • RefSeq genomic to mRNA comparison attributes will be updated for human
  • New dbSNP145 variation data for chicken and pig

A complete list of the changes can be found on the Ensembl website.

Find out more about the new release, and ask the team questions, in our free webinar. Wednesday 16th December, 4pm GMT. Register here.

Ensembl 83 is scheduled for December 2015 and includes:

Updated gene sets and annotations

  • Human: gene set updated to GENCODE 24, and new assembly patches (GRCh38.p5)
  • Mouse: updated to GENCODE M8 annotation
  • Rat: updated gene set, including manually annotated HAVANA annotation

Variation data imports and updates

  • Phenotype data updated for several species including human, mouse, rat, horse and turkey
  • New human phenotype association data from Cancer Gene Census
  • New and updated studies from DGVa for several species such as human, mouse, macaque, cow and dog
  • Latest sequence variants from dbSNP build 145 for chicken and pig
  • New “ExAC” evidence, variation set and track
  • dbSNP rsIDs now available for gibbon

Other highlights and data sets

  • RefSeq genomic to mRNA comparison attributes will be updated for human
  • Improved image export
  • GFF3 support for userdata
  • Annotations in RDF format for all species

 

For more details on the declared intentions, please visit our Ensembl admin site. Please note that these are intentions and are not guaranteed to make it into the release.

What is new?

  • Pairwise alignments for more than 50 plant genomes including potato and tomato, cacao and grape, and several Oryza species
  • Newly available genomes from INSDC imported into our ever-expanding Protists and Fungi divisions
  • Cross-references of genes in Fungi and Protists to the Pathogen – Host Interaction Database (PHI-base)
  • Additional 6,806 bacterial genomes imported from ENA
  • Protein domain information from InterProScan (version 5.14-53.0)

Genomic alignments

New pairwise alignments have been extended to additional plant genomes and can be viewed in our browser website. The alignment text can be downloaded in different file formats.

Comparing the genomic region of the Patatin gene in two plant species, potato and tomato.

In addition to the graphical view, whole genome alignments can be retrieved via our FTP or programmatically using our APIs.
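As a rough illustration of programmatic access, the sketch below builds a request URL for the REST `alignment/region/:species/:region` endpoint. The server address, the `compara=plants` value and the region coordinates are assumptions chosen for illustration (the region is a placeholder, not the Patatin locus), so check them against the REST documentation for the release you are using.

```python
from urllib.parse import quote, urlencode


def alignment_region_url(species, region, other_species,
                         method="LASTZ_NET", compara="plants",
                         server="https://rest.ensembl.org"):
    """Build a URL requesting a pairwise alignment for one genomic region.

    species:       reference species, e.g. 'solanum_tuberosum'
    region:        'chromosome:start-end' on the reference species
    other_species: the species aligned against the reference
    server:        assumed REST server address; adjust as needed
    """
    params = [
        ("content-type", "application/json"),
        ("method", method),            # alignment method to retrieve
        ("compara", compara),          # which comparative database to query
        ("species_set", species),      # the pair of species in the alignment,
        ("species_set", other_species),  # sent as a repeated parameter
    ]
    return "{}/alignment/region/{}/{}?{}".format(
        server, quote(species), quote(region, safe=":"), urlencode(params)
    )


if __name__ == "__main__":
    # Placeholder coordinates on potato chromosome 8, aligned against tomato.
    print(alignment_region_url(
        "solanum_tuberosum", "8:1000-2000", "solanum_lycopersicum"
    ))
```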

View the complete list of genomic alignments available in this new release.

Marking your favourite region, gene, exon, or variant

You can now mark a selected region when browsing the Ensembl Genomes websites. Drag and select your favourite gene and use the pop-up window to mark it. You can also click on the gene itself to mark its location.

Highlighting a genomic region in Ensembl Genomes is now available in many of our views.

Other news

  • Updated BioMart: Protists, Metazoa, Fungi and Plants
  • Updated peptide comparative genomics
  • New assemblies and annotations for existing fungal species
  • Gene families available in Protists

A complete list of the changes can be found on the Ensembl Genomes website.

Any questions or comments? Get in touch.

The second update of the GRCh37 archive site has now been released. Some of the data imports and updates for this release include:

  • dbSNP144 human data including data from the Exome Aggregation Consortium (ExAC)
  • Public HGMD data (version 2015.2)
  • Phenotype data from NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet and Decipher
  • Exome Sequencing Project data (v.0.0.30, Nov. 3, 2014)
  • HumanCoreExome-12 chip
  • Gene annotation dumps in GFF3 format

We have re-built the GRCh37 dedicated Ensembl, Regulation and Variation BioMarts to integrate the updated data sets.

You will find a complete list of the changes on the Ensembl GRCh37 website.

What’s new in e82:

Ensembl mobile website

We are very happy to announce the release of the mobile version of the Ensembl website, available at http://m.ensembl.org. This new website allows you to quickly search for genes, variants or phenotypes on your mobile device.

Support of VEP Plugins through the web interface, script and REST

The VEP can now be extended beyond its core functionality using a system of plugins. Plugins are a powerful way to extend, filter and manipulate the output of the VEP. More information regarding the VEP plugins can be found on the following documentation page.
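For the script route, plugins are requested with the `--plugin` flag, followed by the plugin name and any comma-separated arguments. Below is a minimal sketch that assembles such a command line; the script name reflects the VEP script as distributed around this release, while `MyPlugin` and its arguments are hypothetical placeholders rather than a real plugin.

```python
def vep_command(input_file, output_file, plugins=()):
    """Assemble an argument list for running the VEP script with plugins.

    plugins: iterable of (plugin_name, [arg1, arg2, ...]) pairs; each pair
             becomes one '--plugin Name,arg1,arg2' option.
    """
    cmd = [
        "perl", "variant_effect_predictor.pl",
        "-i", input_file,
        "-o", output_file,
        "--cache",
    ]
    for name, args in plugins:
        cmd += ["--plugin", ",".join([name, *args])]
    return cmd


if __name__ == "__main__":
    # 'MyPlugin' is a made-up name used only to show the argument format.
    print(" ".join(vep_command(
        "variants.vcf", "annotated.txt",
        plugins=[("MyPlugin", ["option_a", "option_b"])],
    )))
```

The resulting list can be handed to `subprocess.run` on a machine where the VEP and the chosen plugins are installed.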

Improved Variation tables

Variation tables for genes and transcripts have been reimplemented to effectively handle the large number of variants now known for many genes. At the same time, the ability to filter, sort, and select this data has been improved. Filtering by variant type is now achieved by selecting the “Type:” filter at the top of the main table. Further features and refinements are expected to be added in forthcoming releases.

Zebrafish development stage RNASeq data set

We’ve added sample-specific BAM files, splice junctions (introns) and gene models based on a range of zebrafish developmental stages and tissue samples.

Marking a region on images

A new feature to mark a selected region has been added to the location, gene and other views. Marking can be applied by drag-selecting a region and then using the zmenu to mark it, or by clicking on a feature on an image and then using the zmenu to mark the location of the feature.

LastZ replaced TBlat for pairwise alignments

We have replaced TBlat with LastZ and recomputed 9 pairwise alignments using LastZ. TBlat was used for distantly related species because it yielded higher genome coverage, but over time we have optimised the LastZ parameters so that it now gives a 50-100% increase in genome coverage.

Other news

  • Human variation data updates to dbSNP (144) including variants from the Exome Aggregation Consortium (ExAC)
  • Mouse: updated to GENCODE M7 including HAVANA annotation.
  • Improved data upload form
  • Improvements to PDF export
  • Export mode for projectors and print
  • Phenotype data updated for several species, including human, mouse, rat and horse

A complete list of the changes can be found on the Ensembl website.

Find out more about the new release, and ask the team questions, in our free webinar. Wednesday 7th October, 4pm BST. Register for free.