We have just updated the Ensembl genome browser and underlying databases to version 62.  We would like to share some new features with our user community.

The new Ensembl release hosts a new species: the white-cheeked gibbon (Nomascus leucogenys).  A new genebuild has been performed using the Ensembl gene annotation pipeline, incorporating both gibbon and human sequences to determine the gibbon gene set.  Compara has also incorporated the gibbon genes and genome into comparative genomics analyses, as is usual for new genebuilds.  Gibbon can now be found in gene trees, a pairwise whole genome alignment with human, and a 35-way multi-species alignment.  View these alignments in views like this one.

BigWig files are now supported by attaching a URL to Ensembl.  Click on ‘Manage your data’ at the left of a location page, and select ‘Attach Remote File’ from the new menu.

Not sure what Ensembl has to offer, or how to use our resources?  Now whenever you search for a term, hits to Help and Documentation will come up.  These may be to page-specific help, FAQs or the glossary, depending on the term.  As always, we welcome requests from our users: if you can’t find what you’re looking for, let our helpdesk know.

Finally, SIFT and PolyPhen predictions are available in human variation pages, and in the popular Variant Effect Predictor (for human).  A more detailed post on these variation analyses will be coming soon, so keep your eye on the blog.

More features like the new comparative genomics navigation menu have been released, so explore and let us know what you think.  More news is available on our website.

I’m going to be blogging a bit more about the recent Ensembl 61 release and the Ensembl Genomes 8 release – lots and lots of goodies in both these releases – web site tweaks (some of them totally critical for generating good displays), the new “favourite tracks” feature, and impressive content changes.

I’ll start today on content changes, and in Ensembl Genomes 8 there are some important genome additions. Some come from Paul Kersey’s new collaboration with PhytoPathDB – more on that in a later post – but top of my excitement has been the diversity in metazoa. The Ensembl Metazoa team has added Sea Urchin, Sea Anemone, the rather weird primitive animal Trichoplax adhaerens (also called the “carpet” organism) and the blood fluke, Schistosoma mansoni. The motivation for bringing these organisms in is to broaden our phylogenetic tree and the comparisons we can provide across all of life. So, for example, for the Drosophila twist gene one can now see the deep tree across metazoa. There is a deep ortholog in Trichoplax which seems to predate the split of some of these helix-loop-helix proteins, whereas other members of the family have a paralog in Trichoplax, meaning that there seems to be a fundamental split in this developmentally key transcription factor. This is just one of many interesting gene trees that one can look at using this resource…

Happy browsing/data mining!

The Ensembl Genomes Project is pleased to announce release 8 of Ensembl Genomes (http://www.ensemblgenomes.org/).

The main highlights of this release are:

  • Software migration to Ensembl 61
  • New Pan Compara database consisting of a selection of vertebrate genomes from Ensembl 61 and genomes from Ensembl Genomes 8 (incorporating 8 new species), giving a species total of 313.
  • 3 oomycete genomes added to Ensembl Protists, including Phytophthora infestans and Phytophthora ramorum, responsible for potato blight and Sudden Oak Death disease respectively.
  • 5 genomes added to Ensembl Metazoa, including Strongylocentrotus purpuratus (Echinodermata) (sea urchin), Apis mellifera (Arthropoda) (honey bee) and Nematostella vectensis (Cnidaria) (sea anemone).

For further details please visit the individual homepages:
http://bacteria.ensembl.org
http://protists.ensembl.org
http://fungi.ensembl.org
http://plants.ensembl.org
http://metazoa.ensembl.org

Ensembl 61 has gone live. In displays like gene summary and region in detail, favourite tracks can be turned on. To do this, open the configuration panel (click on “Configure this page” in the left-hand menu). Activating a star by clicking on it will place that track in the favourites menu (shown by an arrow in the diagram).

Hover over any track name in these views to view information about the data, change the display, or turn the track off. Turning on tracks must still be done with the configuration panel.

We hope this helps ease of navigation. Read about other updates in 61, such as our new species, Turkey, in the news.

The Ensembl Team

We are pleased to announce new documentation describing the gene annotation methodology and results for particular species.

Ensembl gene annotation is a multi-step process which usually takes several months to complete for one species, and is termed the genebuild. In order to provide our users with more information on the data resources used and decisions made during the genebuilding process, we are introducing a new genebuild summary PDF document for each new genebuild, starting from early February 2011 with Ensembl release 61. Each document includes details on not only the alignment programs and data filtering parameters used, but also statistics on the number of protein/cDNA/EST sequences used at different stages of the genebuild. For example, users will be able to find out how many protein sequences were retrieved from public repositories (RefSeq and UniProt) at the beginning of the genebuilding process, how many of these proteins aligned to the genome by various algorithms at different stages of the build, and how many remain in the final gene set as supporting evidence for genes. For human, mouse and zebrafish, the process of merging Ensembl and Havana annotations is also explained.

The genebuild summary will be available for six species: the Anole lizard, Marmoset, Mouse, Panda, Turkey and Zebrafish. More genebuild summaries will become available in the future as genebuilds of existing species are updated, or as new species are annotated. You can download the document via a link found near the bottom of the “Description” page for each species. Just click on the species of interest from the home page to open its description page.

The Ensembl Genomes Project is pleased to announce release 7 of Ensembl
Genomes (http://www.ensemblgenomes.org/).

The main highlights of this release are:

* Software migration to Ensembl 60

* 66 new genomes added for Ensembl Bacteria and updated functional
genomics databases for Escherichia/Shigella and Staphylococcus

* Puccinia graminis f. sp. tritici genome added to Ensembl Fungi

* Zea mays and Physcomitrella patens genomes added to Ensembl Plants,
new variation datasets for Oryza sativa and updates to the
functional genomics databases for Oryza sativa indica, Oryza sativa
japonica and Arabidopsis thaliana

* Acyrthosiphon pisum genome added to Ensembl Metazoa

Please see the individual homepages for more detailed information:
http://bacteria.ensembl.org
http://protists.ensembl.org
http://fungi.ensembl.org
http://plants.ensembl.org
http://metazoa.ensembl.org

The Ensembl project is pleased to announce release 60 of Ensembl. Highlights of this release are:

* New species – Giant Panda
* New assemblies and genebuilds for zebrafish and rabbit
* Improved design of the Variation Table
* New display for GO terms
* Improved navigation on Region in Detail, including autocompletion of gene display names (e.g. HGNC)

For more information visit:
http://e60.ensembl.org/info/website/news/index.html

The Ensembl Team

The Ensembl Genomes Project is pleased to announce release 6 of Ensembl Genomes.

The highlights of this release are:

  • Software migration to Ensembl 59.
  • Two new rodent malarial genomes, Plasmodium berghei and Plasmodium chabaudi, and an update to the Plasmodium falciparum gene-set in Ensembl Protists.
  • Variation BioMarts added for Plasmodium falciparum, Saccharomyces cerevisiae (Ensembl Fungi), Arabidopsis thaliana, Oryza sativa indica, Oryza sativa japonica, Vitis vinifera (Ensembl Plants), Anopheles gambiae and Drosophila melanogaster (Ensembl Metazoa).

See the individual homepages for Bacteria, Protists, Fungi, Plants and Metazoa for more information.

With the recent release of version 58, we are pleased to announce a few features designed to make genome browsing simpler. Have you noticed the search function in the individual tables? For example, in the Gene tab: “Variation Table”, search for a variation ID.

Have a look at our sortable tables. In this example, we can sort by ID, Type, location, allele, source, or validation status. Use the arrows next to the column title to choose a new column to sort by. Searchable, and sortable, tables can also be found for orthologues, variations across individuals or strains, protein motifs and domains, and more.

If you are browsing a genomic region, you may have noticed that the Location tab: “Region in detail” view has a sliding zoom bar. Zoom out to view neighbouring genes and features, or zoom in to your favourite exon.

We have been developing a pipeline to build gene models using only RNA-seq data. For release 58 we have added a preliminary set of Zebrafish RNA-seq gene models with an intention to integrate this new source of evidence into a full genebuild soon.

Zebrafish transcriptome data from 9 tissues were used to build a set of genes and splice variants. For each locus we chose the variant with the highest read support to display; further details on the process are available here.
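
As a rough illustration of the "highest read support per locus" idea, here is a minimal Python sketch. It is not the pipeline's actual code: the function name, the tuple layout, the locus and variant identifiers and the read counts are all hypothetical, and it assumes read support is a single number per variant, with ties resolved in favour of the variant seen first.

```python
def pick_displayed_variants(variants):
    """Return {locus: variant_id} keeping the best-supported variant per locus.

    `variants` is an iterable of (locus, variant_id, supporting_reads) tuples.
    This layout is an assumption made for illustration only.
    """
    best = {}  # locus -> (variant_id, supporting_reads)
    for locus, variant_id, reads in variants:
        # Keep the first variant seen, and replace it only on strictly higher support.
        if locus not in best or reads > best[locus][1]:
            best[locus] = (variant_id, reads)
    return {locus: chosen[0] for locus, chosen in best.items()}


# Hypothetical input: two loci, one of which has two splice variants.
models = [
    ("locus_1", "variant_a", 10),
    ("locus_1", "variant_b", 40),
    ("locus_2", "variant_c", 5),
]
print(pick_displayed_variants(models))
```

With the made-up input above, variant_b would be displayed for locus_1 because it has more supporting reads than variant_a.
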
To display the genes, go to Region in Detail or Region Overview. Use the “Configure this page” button and select “RNASeq Genes” from the “Genes” menu. The “Supporting DNA Alignments” menu contains supporting exon and intron features from each of the nine tissues. Clicking on these features in Ensembl location pages shows a simple read count for the intron features, and RPKM values for transcripts and exons (reads per kilobase of model per million mapped reads; Mortazavi et al., Nature Methods 2008).
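
For readers unfamiliar with RPKM, the measure as defined by Mortazavi et al. scales a raw read count by the length of the model in kilobases and by the number of mapped reads in millions. The short Python sketch below works through that arithmetic. The function name and the example numbers are invented for illustration, and it assumes the values shown in the browser follow this standard definition.

```python
def rpkm(read_count, model_length_bp, total_mapped_reads):
    """Reads per kilobase of model per million mapped reads.

    Equivalent to read_count / (model_length_bp / 1e3) / (total_mapped_reads / 1e6).
    """
    if model_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("model length and total mapped reads must be positive")
    return read_count * 1e9 / (model_length_bp * total_mapped_reads)


# Hypothetical numbers: 500 reads on a 2,000 bp exon, 10 million mapped reads in the tissue.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

Because the value is normalised for both model length and sequencing depth, it can be compared between exons of different sizes and between tissues, which a raw read count cannot.
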

This is a first attempt at visualising tissue specific read depth and alternative splicing, which we hope to develop further in the future.