May was a very busy month for Ensembl Outreach. Besides our customary user support activities, in 14 days, I crisscrossed the east coast of the US, attended the Biology of Genomes conference at CSHL and gave six Ensembl browser workshops at NIH-NHGRI, NIH-NIEHS and UNC-Chapel Hill.


The waltz of polypeptides by Mara G. Haseltine
(image credit:

One attendee answered our standard feedback question “Will you use Ensembl more often after this workshop?” with “Absolutely. This is the place!”

So you may wonder: the place? For what? 

Well, where should we start?

Maybe with one of the best known aspects of Ensembl: our in-house gene annotation pipeline available for more than 65 vertebrates, including our popular human, mouse and zebrafish genomes. For these three species, the Havana group also contributes manual gene annotation and, in human, the combination of the Ensembl automatic pipeline and the Havana manual annotation gives rise to the merged gene set GENCODE.


Our host Thomas Randall is about to do the honours
(image credit: Bill Quattlebaum).

But there is much more.

Ensembl is also the place for a wealth of variation data, with the effects of each variant on genes, transcripts and proteins (e.g. amino acid changes, premature stop codons, etc.) already calculated. You can also calculate the effect of your own variants on a transcript using VEP.
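To make "effect on a protein" concrete, here is a minimal Python sketch that classifies a single-base change within one codon using the standard genetic code. It is only an illustration of the idea and is not how VEP itself is implemented; the function name `classify_substitution` and the consequence labels are assumptions made for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base change in one codon (hypothetical helper).

    position is the 0-based index (0, 1 or 2) of the changed base.
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop gained"  # premature stop codon
    if ref_aa == "*":
        return "stop lost"
    return f"missense ({ref_aa}>{alt_aa})"  # amino acid change


if __name__ == "__main__":
    print(classify_substitution("TGG", 2, "A"))
    print(classify_substitution("GAA", 1, "T"))
```

A real consequence prediction also has to account for strand, splicing, indels and multiple transcripts per gene, which is where a dedicated tool such as VEP comes in.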

Ensembl is the place for protein-coding and ncRNA gene trees, whole genomic alignments and synteny.

It is where you can get regulatory data based on the ENCODE and Roadmap projects.


Custom variant calls in the MAGEB gene family region.

You can also visualise your BAM files from sequencing reads or display your variant calls from your variation studies.

Ensembl is for wet lab biologists, bioinformaticians, clinicians and developers, and its data can be accessed through different channels, such as the web browser and the APIs (Perl or otherwise).

Find out more about what Ensembl offers: explore our tutorials page, watch the videos on our YouTube channel and follow us on Twitter and Facebook.

Frequent Ensembl API users can subscribe to our developers list.

All Ensembl users can ask us any question at the helpdesk.

Let us know if Ensembl is the place for you too!



The Ensembl Genomes project celebrated its fourth anniversary at the beginning of this week, and now we are pleased to announce another milestone: Ensembl Genomes release 18.

In our latest release, we provide several new species across the divisions of Ensembl Genomes, including three new genomes in Ensembl Fungi (Yarrowia lipolytica, Cryptococcus neoformans and Trichoderma reesei); one new genome in Ensembl Protists (Guillardia theta); four new genomes in Ensembl Metazoa (Brugia malayi, Loa loa, Megaselia scalaris and Strigamia maritima); and one new species in Ensembl Plants (Medicago truncatula). We have also incorporated the latest versions of prokaryotic genomes from INSDC in Ensembl Bacteria.

NOTE: The public MySQL database will be available on Monday, April 29th 2013.

The detailed features of this new release are:

* Bacteria

  • Addition of cross-references to Rhea in Ensembl Bacteria;
  • New and updated genomes: 6,305 bacterial genomes in total, all deposited in ENA.

* Fungi

* Protists

  • Extended taxonomic coverage with the newly added Guillardia theta genome;
  • Updated DNA alignments and synteny for stramenopile genomes.

* Metazoa

* Plants

  • Barrel clover (Medicago truncatula) is now available. This is the second legume genome to be included in Ensembl Plants;
  • New visualization of the barley (Hordeum vulgare) physical map anchoring the gene space assembly;
  • Whole genome alignments between barley, rice and brachypodium;
  • Updated maize (Zea mays) assembly to version 3;
  • New variation datasets for barley and Brachypodium distachyon;
  • Alignments of Triticum aestivum (bread wheat) WGS, TSA and EST assemblies against the barley genome are now available. The bread wheat data can also still be visualised in the context of Brachypodium distachyon;
  • Protein domain predictions and cross references have been updated for most plant genomes;
  • New pairwise alignments between several genome sequences.

* Software migration to Ensembl 71

Have fun!

The Ensembl Genomes Team

The Ensembl Genomes project is pleased to announce release 17 of Ensembl Genomes.

Our latest release brings over 6,000 bacterial genomes into Ensembl for the first time ever! Users can browse these data interactively on the web with our graphical interface. Programmatic access is available through the Perl and RESTful Ensembl APIs, and through publicly accessible MySQL databases. Full data dumps such as DNA sequence and protein sequence in FASTA format, annotations in GTF format, and MySQL dump files are also provided on our FTP sites.

Due to the sheer number of new genomes in Ensembl Bacteria in release 17, BioMart access is no longer possible, and we are now working on alternative and more powerful data retrieval tools.
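For readers who download the GTF annotation dumps, the following is a minimal Python sketch of how one line of a GTF file can be read into a dictionary. It assumes the usual nine tab-separated columns with `key "value";` pairs in the last column; the function name `parse_gtf_line` and the identifiers in the example line are hypothetical and are not taken from a real Ensembl Genomes file.

```python
def parse_gtf_line(line: str) -> dict:
    """Parse one GTF feature line into a dict (illustrative helper only)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attr_text = fields

    # The ninth column holds 'key "value";' pairs separated by semicolons.
    # Simplification: values containing ';' are not handled here.
    attributes = {}
    for part in attr_text.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attributes[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),  # GTF coordinates are 1-based and inclusive
        "end": int(end),
        "score": score,
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


# Made-up example line; the IDs below are placeholders.
EXAMPLE = (
    "I\tprotein_coding\texon\t335\t649\t.\t+\t.\t"
    'gene_id "GENE0001"; transcript_id "TRANSCRIPT0001"; exon_number "1";'
)

if __name__ == "__main__":
    record = parse_gtf_line(EXAMPLE)
    print(record["feature"], record["start"], record["end"])
    print(record["attributes"]["gene_id"])
```

For large files, the same function can be applied line by line while skipping comment lines that begin with `#`.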

An increased number of bacteria (over 120 genomes) has now been used for comparative genomics analyses across a wider range of non-bacterial genomes (from both Ensembl and Ensembl Genomes projects). This is known as Pan-taxonomic Compara and can be visualised in the gene views on the browser or accessed through the Compara database.

In addition to the thousands of new bacterial genomes, we also have six new genomes in Ensembl Fungi and one new genome in Ensembl Protists. We have also improved existing genomes and added new variation, comparative genomics and transcriptomics data for several species.

The significant milestones of this release are:

* Protein families are now classified based on PANTHER and HAMAP matches. Gene trees are provided for several but not for all bacterial genomes

* Pan-taxonomic Compara now contains 123 key bacterial genomes

* Gene families are now populated by dividing all proteins on all genomes by HAMAP and PANTHER classification provided by InterPro

* Six new genomes: Komagataella pastoris, Sporisorium reilianum, Pyrenophora teres, Pyrenophora tritici-repentis, Glomerella graminicola and Melampsora larici-populina

* New EST alignments for Melampsora larici-populina, Glomerella graminicola and Leptosphaeria maculans

* Mycosphaerella graminicola has been renamed to Zymoseptoria tritici

* Updated gene sets for Anopheles gambiae

* Updated gene sets for Drosophila pseudoobscura (FlyBase version 2.30) and Drosophila simulans (FlyBase version 1.4)

* Updated cross-references for all 12 drosophilid species

* A missing chromosome in beetle (Tribolium castaneum) was also reinstated, and its gene models updated

* New and extensive variation dataset for Hordeum vulgare (barley)

* Triticum aestivum (wheat) EST and RNA-seq alignments in the syntenic context of Brachypodium are now available

* New wheat sequence search facility based on extensive genomic and transcriptomic data aligned to Brachypodium

* Solanum tuberosum (potato) EST and RNA-seq alignments are also now available

* New pairwise alignments between several plant genomes.

* A new genome of a flagellated protozoan parasite, Giardia lamblia, is now available

* Software migration to Ensembl 70

Have fun!

The Ensembl Genomes Team

The Ensembl Outreach team is considering the possibility of going to the East Coast of the US in May to deliver Ensembl browser workshops soon after the Biology of Genomes conference in Cold Spring Harbor (May 7th-11th, 2013).

Our browser workshops are highly interactive, consisting of presentations intermingled with live demos and hands-on exercises. They can be customised to the needs of the audience, which ideally numbers 20-30 attendees.

Ensembl workshops are delivered free of charge but we ask our hosts to pay for our instructor’s expenses which typically include flights, accommodation and subsistence.

For more details, please refer to the page below:

If you would like Ensembl to come to your academic institution or want additional details, please contact the Outreach Project Leader Giulietta Spudich ( or me (

Kind regards,


Ensembl Outreach

The Ensembl Genomes Project is pleased to announce release 16 of Ensembl Genomes.

In this release, new species include Hordeum vulgare (barley), Solanum tuberosum (potato) and Musa acuminata (banana) in Ensembl Plants; and Nasonia vitripennis (the jewel wasp) and Anopheles darlingi (Central and South American malaria mosquito) in Ensembl Metazoa. Improvements to the wheat assembly and the ability to highlight annotations in gene trees have also been released.

Barley is among the world’s earliest domesticated crop species, is the fourth most abundant cereal worldwide and is a traditional model for plant genetic research. It has a large diploid genome of about 5.1 Gb. A draft genome sequence containing the majority of barley genes and integrated with the physical map was recently published in Nature. The genome is now available in Ensembl Plants.

We have improved the alignment of a draft wheat assembly to Brachypodium distachyon and updated our presentation of wheat homoeologous SNPs using the Brachypodium reference.

Significant milestones of this release are:

* Ensembl Genomes displays the inferred evolutionary history of gene families in gene trees. Branches sharing specific annotation terms can now be highlighted, indicating the evolution of gene function (view example).

* Ensembl Plants updates:

* Ensembl Metazoa updates:

* Ensembl Fungi updates:

* Ensembl Protists updates:

  • repeat features added for Albugo laibachii,
  • protein features updated for all species.

* Update of pan homology comparative genomics database to include latest versions of genomes from Ensembl and Ensembl Genomes

* Updated BioMarts

* Software migration to Ensembl 69

The Ensembl Genomes Team

Dear all,

The Ensembl Genomes Project is pleased to announce release 15 of Ensembl Genomes.

This release contains two new genomes, bringing the total genomes supported to 354. Main highlights are:

* Two new species in Ensembl Fungi: Leptosphaeria maculans and Trichoderma virens. A new variation dataset is available for Puccinia graminis.

* Cross-references to other databases have been updated for Ensembl Metazoa.

* A set of homoeologous SNPs between the wheat A, B and D genomes, called using wheat contigs aligned to Brachypodium distachyon as a reference framework, is now available. A structural variation dataset for Sorghum bicolor has been imported from dGVA, and a variety of small improvements to our assembly, annotation and variation datasets have been incorporated. See the Ensembl Plants homepage for details.

* RNA-Seq data from the ENA Sequence Read Archive have been aligned to the genomes of Pythium ultimum, Phytophthora sojae and Albugo laibachii. A variation database for Phytophthora infestans, built from ENA Sequence Read Archive data for three different strains, is now available.

* Software migration to Ensembl 68

* New Frequently Asked Questions (FAQs) are now available for all domains of Ensembl Genomes. Have a question? Check if it’s been asked before! If there is a FAQ missing you would like to see, contact us.

The Ensembl Genomes Team


The Ensembl Genomes Project is pleased to announce release 14 of Ensembl Genomes.

This release contains 11 new genomes, bringing the total genomes supported to 352. Main highlights are:


* Four new genomes in Ensembl Protists: the ciliate protozoan Tetrahymena thermophila; two oomycete pathogens of the model plant Arabidopsis thaliana, Hyaloperonospora arabidopsidis and Albugo laibachii; and Entamoeba histolytica, a parasitic protozoan that infects primates, including humans.

* The addition of tomato (Solanum lycopersicum), foxtail millet (Setaria italica), and a disease resistant rice (Oryza brachyantha) to Ensembl Plants. Tomato has now been added to the pan-taxonomic comparative analysis, and synteny data is available for the Oryza genomes. Two new variation datasets covering the Maize HapMap 2 project and African rice (Oryza glaberrima) are now available.

* Two new species, the Monarch butterfly (Danaus plexippus) and the Postman butterfly (Heliconius melpomene), have been added to Ensembl Metazoa. Updated gene sets are now available for Aedes aegypti, Culex quinquefasciatus, Ixodes scapularis and Caenorhabditis elegans. The underlying genome assembly for Caenorhabditis briggsae has been updated, and a new gene set is provided. Whole genome DNA alignments (LASTZ and translated BLAT) have been prepared between C. elegans and the six other nematodes.

* Two new species in Ensembl Fungi: Sclerotinia sclerotiorum and Botryotinia fuckeliana. Projection of manual GO (Gene Ontology) annotation from S. pombe and S. cerevisiae to other species based on protein homology is now available.

* Software migration to Ensembl 67

The Ensembl Genomes Team

The Ensembl Genomes Project is pleased to announce release 13 of Ensembl Genomes.

This release contains 6 new genomes, bringing the total genomes supported to 341. Main highlights are:

* Software migration to Ensembl 66

* The addition of Brassica rapa to Ensembl Plants, as well as synteny data for species in the Arabidopsis and Oryza genomes and new oligo probe mapping from the GeneChip Maize Genome Array for Zea mays

* A new species (Toxoplasma gondii ME49) added to Ensembl Protists and the genome of Trichinella spiralis is now available in Ensembl Metazoa.

* Three new genomes in Ensembl Fungi: Magnaporthe oryzae, Magnaporthe poae and Gaeumannomyces graminis. There is also synteny data for species in the Saccharomycetales and Hypocreales taxonomy groups. The annotation for S. pombe has been updated to reflect the most recent content from PomBase.

EMBL-EBI and Rothamsted Research have recently announced the release of the PhytoPath portal, a joint project bringing together Ensembl Genomes with PHI-base, a community-curated resource describing the role of genes in pathogenic infection.  Genomes of pathogenic organisms from this project including the causative agents of diseases such as potato blight, rice blast and wheat rust can be found in this release of Ensembl Genomes.

The Ensembl Genomes Team