What is new?

  • Molecular and biological information from PHI-base (version 4.0) for thousands of genes in Protists, Bacteria and Fungi that are involved in pathogen-host interactions
  • Small non-coding RNA genes in the diatom Phaeodactylum tricornutum (Rogato et al. 2014)

Small non-coding RNA genes described by Rogato et al. (2014) in the diatom P. tricornutum are now available in Ensembl Protists. Hover over the ncRNA genes track (genes coloured in light purple) for more information.

  • Software update: version 83 of the Ensembl project for all our 30,586 genomes from 5,319 species

Other news

  • Improved image export functionality from our websites

Check our ‘New image export option in Ensembl’ post for more details.


The images in the Ensembl Genomes browser websites can be exported in different formats and resolutions. Choose the one that suits you best and click on ‘Download’.

  • Protein domains for Protists, Metazoa, Fungi and Plants recalculated with InterProScan (version 54.0)
  • Updated BioMart: Protists, Metazoa, Fungi and Plants
  • Updated gene trees in Plants

Check out all the changes on our Ensembl Genomes website.

Any questions or comments? Email us.

What is new?

  • Pairwise alignments for more than 50 plant genomes including potato and tomato, cacao and grape, and several Oryza species
  • Newly available genomes from INSDC into our ever expanding Protists and Fungi divisions
  • Cross-references of genes in Fungi and Protists to the Pathogen – Host Interaction Database (PHI-base)
  • Additional 6,806 bacterial genomes imported from ENA
  • Protein domain information from InterProScan (version 5.14-53.0)

Genomic alignments

New pairwise alignments have been extended to additional plant genomes and can be viewed in our browser website. The alignment text can be downloaded in different file formats.

Comparing the genomic region of the Patatin gene in two plant species, potato and tomato.


In addition to the graphical view, whole genome alignments can be retrieved via our FTP or programmatically using our APIs.

View the complete list of genomic alignments available in this new release.

Marking your favourite region, gene, exon, or variant

You can now mark a selected region when browsing the Ensembl Genomes websites. Drag to select your favourite gene and use the pop-up window to mark it. You can also click on the gene itself to mark its location.


Highlighting a genomic region in Ensembl Genomes is now available in many of our views.

Other news

  • Updated BioMart: Protists, Metazoa, Fungi and Plants
  • Updated peptide comparative genomics
  • New assemblies and annotations for existing fungal species
  • Gene families available in Protists

A complete list of the changes can be found on the Ensembl Genomes website.

Any questions or comments? Get in touch.

What is new?


Gene Ontology (GO) annotations are among the updates for S. pombe genes in Ensembl Fungi release 28.


Protein domains updated from InterProScan for Fungi, Protists, Metazoa and Plants in this new release

  • Updated BioMart: Protists, Metazoa, Fungi and Plants.

Any questions or comments? Get in touch.

What is new?

  • Expansion of Protists and Fungi with hundreds of annotated genomes
  • Variation data for bread wheat, rice, Aedes aegypti, and Ixodes scapularis
  • Whole genome alignments for O. longistaminata and T. cacao
  • Non-coding RNA gene models in Bacteria
  • New assembly of tomato (version 2.50)
  • Full support for UCSC Track Hub format for hosting your own data in Ensembl

Expansion of Fungi and Protists

All protist and fungal genomes whose sequence and annotation are complete and have been submitted to the International Nucleotide Sequence Database Collaboration (INSDC) are now included in Ensembl Genomes. Future releases will continue to be updated with all newly submitted sequences.


Mushrooms, mould, parasites of animals and plants: just to name a few genomes in the comprehensive and varied collection of new species in Fungi and Protists.

Ensembl Fungi now contains 408 genomes from 271 species, with 355 new genomes from 236 species included in this release.

Ensembl Protists now contains 133 genomes from 91 species, with 101 new genomes from 66 species.

The new genomes are available on our websites, MySQL databases, and APIs (REST and Perl). We are currently working on making them available in BioMart and expect this to be available in release 28.

A representative selection of 57 protist and 191 fungal genomes has been added to our comparative genomics analyses, making gene trees and homology calls available.


Gene tree of the RBP gene in Leishmania mexicana, one of the many new genomes in Ensembl Protists, and its 12 orthologues.

New variation data for bread wheat

Variation data provided by the HapMap consortium is now available in Ensembl Plants for bread wheat. The data was generated by re-sequencing 62 diverse wheat lines. In total 1.57 million SNPs and 162 thousand small indels were identified across the 21 chromosomes of bread wheat. Moreover, the genotypes of 475 individuals have been added to the Axiom 820K SNP Array from CerealsDB.

SNPs and short indels from the wheat HapMap and CerealsDB, annotated with VEP, can be viewed in our Ensembl Plants browser.

Other news

  • Updated gene models in Metazoa, Protists and Fungi
  • Updated comparative genomics across all divisions
  • New probe data for barley
  • Updated BioMarts

A complete list of both new and updated data can be found on our website.

Any questions or comments? Get in touch.

Highlights


Explore the new variation data in the plant pathogen Z. tritici. This variation database was constructed from study SRP017760, downloaded from ENA.

Genome comparisons for Triticeae and related species

Whole genome alignments between seven pairs of Triticeae genomes, including the bread wheat A, B and D component genomes, Triticum urartu (the A genome progenitor), Aegilops tauschii (the D genome progenitor), and barley, are now available. The alignments were obtained using ATAC, and statistics on genome and coding exon coverage can be found on our website. See the ATAC results for the comparison between the T. urartu and A. tauschii genomes.

MAF files on our FTP sites

In response to several requests from our users, we now provide the pairwise alignments as MAF files. These can be found on the FTP download site of all Ensembl Genomes divisions. See an example of this data in Ensembl Metazoa.
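For readers who have not handled MAF files before, the following is a minimal sketch of how the alignment blocks in such a file could be read in Python. It assumes the standard MAF layout, in which each block starts with an "a" line and each aligned sequence is an "s" line with seven whitespace-separated fields. The example records, including the source names and coordinates, are invented for illustration and are not taken from our FTP site.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class MafSequence:
    src: str        # source name, conventionally "genome.sequence"
    start: int      # zero-based start of the aligned region
    size: int       # number of non-gap bases in the aligned region
    strand: str     # "+" or "-"
    src_size: int   # total length of the source sequence
    text: str       # aligned bases, with "-" marking gaps


def parse_maf(lines: Iterable[str]) -> Iterator[List[MafSequence]]:
    """Yield one list of MafSequence records per alignment block."""
    block: List[MafSequence] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current block.
            if block:
                yield block
                block = []
            continue
        if line.startswith("#"):
            continue  # header and comment lines
        fields = line.split()
        if fields[0] == "a":
            # An "a" line opens a new block.
            if block:
                yield block
            block = []
        elif fields[0] == "s":
            _, src, start, size, strand, src_size, text = fields
            block.append(
                MafSequence(src, int(start), int(size), strand, int(src_size), text)
            )
        # Other line types (for example "i", "e", "q") are ignored in this sketch.
    if block:
        yield block


# Invented example data, for illustration only.
EXAMPLE_MAF = """##maf version=1
a score=100.0
s potato.chr08 10 12 + 5000 ACGT-ACGTTAGC
s tomato.chr08 25 13 + 6000 ACGTAACGTTAGC

a score=50.0
s potato.chr08 40 5 + 5000 GGCTA
s tomato.chr08 90 5 - 6000 GGATA
"""

if __name__ == "__main__":
    for number, block in enumerate(parse_maf(EXAMPLE_MAF.splitlines()), start=1):
        for seq in block:
            print(number, seq.src, seq.start, seq.size, seq.strand)
```

To read a downloaded file instead of the example string, pass an open file handle to `parse_maf`; compressed files would need to be opened with `gzip.open` in text mode first.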

Other news

A complete list of both new and updated data can be found on our website.

Get in touch if you have any questions or comments.

The Ensembl Genomes team

Are you looking for whole genomes, protein sequences, alignments or other genome-wide data from Ensembl?

Look no further; our FTP site is the place for you:

  • Download our data from the current release only (i.e. Ensembl 78)
  • Download our data from current and previous releases (including GRCh37)

These are some of our data that can be downloaded in bulk and for free; file types are described in brackets:

  • DNA, cDNA, CDS, ncRNA sequences (FASTA)
  • Annotations of our coding and non-coding genes (GTF)
  • Annotation of regulatory elements for the human and mouse genomes (GFF)
  • Variation data (VCF) for more than 20 Ensembl species
  • RNASeq reads (BAM) aligned against 25 genomes
  • GERP scores to identify constrained elements (BED)
  • Alignments of resequencing data for several species (EMF)
  • Multiple and pairwise genome alignments (MAF)
  • Ensembl databases for local installation (MySQL)
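As a small illustration of working with one of these file types, here is a sketch of reading FASTA records into a Python dictionary. It assumes the usual FASTA convention: a header line starting with ">" whose first word is the sequence identifier, followed by one or more sequence lines. The two records in the example are invented and do not come from our FTP site.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each sequence identifier to its full sequence."""
    sequences: Dict[str, str] = {}
    name: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                sequences[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


# Invented example records, for illustration only.
EXAMPLE_FASTA = """>pep1 hypothetical description
MKT
LLV
>pep2
MA
"""

if __name__ == "__main__":
    for identifier, sequence in read_fasta(EXAMPLE_FASTA.splitlines()).items():
        print(identifier, len(sequence))
```

This keeps every sequence in memory, which is fine for a single proteome but may not suit whole-genome DNA files; for those, yielding one record at a time would be a better design.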

How can the Ensembl FTP foster research?

Let’s look at coiled-coils, simple dimers in protein sequences found in many species and believed to enable protein-protein interaction in a variety of biological processes.


Structure of coiled-coil domain from PDBe. Homohexameric assembly by Li et al. (2014)

Coiled-coil domains differ immensely from their globular counterparts, and distinct evolutionary constraints on them are expected. How conserved are coiled-coils? What has driven their evolution?

Intrigued by these questions, Surkont and Pereira-Leal (2015) set out on a journey to compare protein sequences across several vertebrates and yeast. They show that substitution patterns do differ between coiled-coil and globular regions, and they developed an evolutionary model to improve both the detection of coiled-coils by homology and the inference of their phylogeny.

Where did Surkont and Pereira-Leal find these proteomes for their investigation? On our FTP site.

Why not explore the Ensembl FTP site to see what we’ve got in store for you?

Any comments or questions, just get in touch.

 

Highlights

  • New fruitfly assembly BDGP6
  • New bread wheat assembly, including the BAC assembly of chromosome 3B
  • New variation datasets for barley
  • New assemblies and gene annotation for two plant pathogens
  • Sequences from RNAcentral against selected genomes

Variation data in barley


We’ve added three new variation datasets for Hordeum vulgare in this release:

  • Five million variant calls from POPSEQ of cv. Morex x cv. Barke progeny
  • Over six million variant calls from POPSEQ of Oregon Wolfe barley
  • Over 7k SNP assays from the iSelect chip

Non-coding RNA sequences from RNAcentral


Following the launch of the new ncRNA sequence resource RNAcentral, we now show alignments of non-coding sequences against selected genomes. Look for the “RNAcentral sequences” track in the ‘Region in detail’ view and turn the data track on.

New genomes

  • Additional 1,939 completely sequenced and annotated bacterial genomes from INSDC

Other news

Get in touch if you have any questions or comments.

The Ensembl Genomes team

We have just released the latest update of Ensembl Genomes.

The highlights of our new release are:

New Genomes

Ensembl Metazoa

Ensembl Bacteria

  • 4,310 new genomes: 20,254 genomes in total, including bacteria and archaea

New Data

Ensembl Plants


Variation data from the 150 Tomato Genome project now available in Ensembl Plants.

Ensembl Fungi

Updated Data

Ensembl Fungi

Ensembl Fungi, Protists, Metazoa, and Plants

We have also updated all BioMarts, gene trees, protein domain classification using InterPro version 48.0 and InterProScan version 5, and comparative genomics.

Get in touch if you have any questions or comments.

The Ensembl Genomes team

Are you a rat person, i.e. do you work on rat?
Are you joining the 9th Rat Genomics and Models conference in December?
Could you spare another day after the meeting before heading back home?

If so, this post is for you!

Ensembl is extremely pleased to announce that for the first time ever we will be running a workshop specifically targeted at the rat community! The timing could not be more perfect as we have just released the first set of golden genes in rat, i.e. the merge between the Ensembl automatic and the Havana manual annotation.


The rat genome and golden genes in Ensembl.

The ‘Ensembl workshop: browser and tools for accessing the Rat genome’ will consist of talks by different members of the Ensembl team, live demos and hands-on exercises.

Registration is free on a first come, first served basis by filling out this form.

The only pre-requisites are a general knowledge of molecular biology and genomics, in addition to familiarity with web-based genome browsers.

The detailed program is depicted below:

  • Day I 04/12/14 (14:00-18:00)

Ensembl Project: Introduction
Ensembl Browser: Live demo
Ensembl Tools: BLAST/BLAT, BioMart

  • Day II 05/12/14 (09:30-13:30)

Ensembl Genebuild: Annotating rat genes
Ensembl Variation: Sequence variants in the rat genome
Ensembl Tools: VEP, REST
Workshop wrap up and feedback

Please note that the attendees of the 9th Rat genomics and models conference will be prioritised for this workshop. If there are still spaces available we will open attendance to a wider audience. The maximum number of participants is 30.

The workshop will take place in the beautiful grounds of Wellcome Trust Genome Campus in Hinxton.


The Wellcome Trust Genome Campus on a snowy day in winter.

 

If you are working on large sets of genomic data or carrying out detailed and complex bioinformatic analyses, keep on reading.

Do any of the following thoughts ring a bell for you?

  1. I’d love to fetch protein coding genes from my species of interest.
  2. It’d be great to be able to get orthologues of the genes I’m working on.
  3. I want to find out if my sequence variants fall in regulatory regions and I want to know it now!

If so, the Ensembl Perl APIs are the way to go!

We can teach API workshops at your institution

We offer Perl API workshops on a regular basis. Our last off-site course was at the Roslin Institute in Edinburgh. We had a whopping 26 attendees. Four members of our Ensembl team, namely Magali Ruffier, Laurent Gil, Thomas Juettemann, and Stephen Fitzgerald, delivered the modules on the Core, Variation, Regulation and Comparative Genomics aspects of the Ensembl database. Have a look at some of the feedback we had:

  • ‘Skills from the workshops have opened up my options for accessing Ensembl data which will allow me to more efficiently cross compare information’
  • ‘I will be retrieving specific data more efficiently now’
  • ‘It is quite easy to retrieve the whole set of exons from the genome with several lines of Perl script’
  • ‘The regulatory features can be easily fetched by chromosomal location and that helps me looking at over-expressed regions in my RNA-Seq experiments’

Thomas Juettemann from the Ensembl Regulation team and his happy crowd!

How can you host an API workshop at your institution? Just get in touch. We request that travel, accommodation and subsistence costs of the instructor(s) are reimbursed by our hosts.

API workshop in Cambridge, UK

If you are in or around the UK at the end of this year, you may want to sign up for our next API course at the University of Cambridge. It’ll take place on December 2nd-5th and places are still available. For more information and registration please have a look at the course description.

If these dates are no good, don’t despair. We have got a couple of API courses already lined up for 2015. Check our calendar to see where we are going next.

More information on our APIs

The Ensembl project provides a comprehensive set of APIs (Application Programming Interfaces) that allow our users to access genome-wide information efficiently and quickly. Our APIs are of two types: Perl and REST.
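To give a flavour of the REST side, here is a hedged Python sketch that builds a request URL for fetching the protein-coding genes overlapping a region. It assumes the `overlap/region/:species/:region` endpoint and its `feature` and `biotype` parameters as described in the REST documentation, and it assumes the server address `https://rest.ensembl.org`; Ensembl Genomes divisions may be served from a different host. The species name and coordinates are placeholders, so please check the current REST documentation before relying on any of these details.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumption: public Ensembl REST server; the host for a given division may differ.
REST_SERVER = "https://rest.ensembl.org"


def overlap_genes_url(species: str, region: str,
                      biotype: str = "protein_coding",
                      server: str = REST_SERVER) -> str:
    """Build the URL for genes of one biotype overlapping a region."""
    # Keep ":" and "-" readable in regions written as "chromosome:start-end".
    path = f"/overlap/region/{quote(species)}/{quote(region, safe=':-')}"
    query = urlencode({
        "feature": "gene",
        "biotype": biotype,
        "content-type": "application/json",
    })
    return f"{server}{path}?{query}"


def fetch_json(url: str):
    """Send the request and decode the JSON response (needs network access)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder species and coordinates, chosen only for illustration.
    url = overlap_genes_url("solanum_lycopersicum", "8:1000-2000")
    print(url)
    # genes = fetch_json(url)  # uncomment to query the server
```

Separating URL construction from the network call makes the request easy to inspect and test before anything is sent to the server.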

Find out more about the Ensembl Perl APIs on our help and documentation page and watch our filmed course. For tips on how to install the API via Git and FTP, have a look at our YouTube video.