The new Ensembl release includes a new view for SNPs and other genomic variations. It shows the alignment of the polymorphic position together with 10 base pairs of sequence up- and downstream. The user can choose among all available multiple alignments. Polymorphic positions in the other species are also shown.
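As a rough illustration of what the display is centred on, the sketch below pulls out a polymorphic position together with 10 base pairs on either side. It is not Ensembl code: the function names, the 0-based indexing and the bracketed output format are all assumptions made for this example.

```python
def flanking_window(sequence, snp_index, flank=10):
    """Return (upstream, allele, downstream) around a 0-based SNP index.

    Hypothetical helper: the flanks are truncated at the sequence ends.
    """
    if not 0 <= snp_index < len(sequence):
        raise IndexError("snp_index is outside the sequence")
    start = max(0, snp_index - flank)
    end = min(len(sequence), snp_index + flank + 1)
    upstream = sequence[start:snp_index]
    downstream = sequence[snp_index + 1:end]
    return upstream, sequence[snp_index], downstream


def format_window(sequence, snp_index, flank=10):
    """Render the window with the polymorphic base in brackets (assumed format)."""
    upstream, allele, downstream = flanking_window(sequence, snp_index, flank)
    return f"{upstream.lower()}[{allele.upper()}]{downstream.lower()}"


if __name__ == "__main__":
    toy_sequence = "ACGT" * 7  # invented sequence, not real data
    print(format_window(toy_sequence, 12))
```

Applying the same window to each aligned species would give one row per species, which is roughly what the alignment view presents.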

This is very useful for looking at ancestral alleles, especially in combination with our EPO alignments, as they include the inferred ancestral sequence. Although dbSNP provides predicted ancestral alleles for human SNPs, these are based on the chimp sequence only. In several cases, the ancestral sequence inferred from the multiple alignment disagrees with the chimp sequence, as in this example. Using multiple alignments gives better results and more confidence in the calls.

The Ensembl project is pleased to announce release 54 of Ensembl. Highlights of this release are:

  • New Zv8 zebrafish assembly;
  • Comparative alignment text displays for variations and regions;
  • Ability to add personal notes to any Gene or Transcript.
For more information visit:

 

Alongside this release, we are also releasing a new version of the pre site. This now includes:

  • The GRCh37 human assembly released in February 2009, with preliminary analyses included;
  • The calJac3 marmoset assembly.

 

The Ensembl Functional Genomics (eFG) environment has been expanded to incorporate array mapping functionality. Historically, arrays from different vendors have been processed in similar but non-identical ways because of differing array designs, with the output being stored in the core database. The ‘arrays’ environment unifies this process within the eFG database to provide a new standardised array mapping procedure for all array formats. This is a two-step process: probe sequences are first aligned to both genomic and transcript sequences, and transcripts are then annotated with xrefs (DBEntries) depending on the quality of the probe alignments around a given transcript locus.
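The second step can be pictured with a toy sketch. This is not the eFG pipeline: the class names, the mismatch cut-off, the flank and the "fraction of probes in a probeset" rule are all hypothetical, chosen only to show how probe alignment quality around a transcript locus could decide which transcripts receive an xref. The alignments from step one are taken as given input.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeHit:
    """One probe alignment from step one (fields are assumptions)."""
    probe: str
    probeset: str
    chrom: str
    start: int
    end: int
    mismatches: int


@dataclass(frozen=True)
class Transcript:
    stable_id: str
    chrom: str
    start: int
    end: int


def annotate_transcripts(hits, transcripts, probeset_sizes,
                         max_mismatches=1, flank=0, min_fraction=0.5):
    """Map transcript IDs to probesets whose probes align well at the locus.

    All thresholds are illustrative defaults, not the pipeline's real values.
    """
    xrefs = defaultdict(list)
    for tr in transcripts:
        matched = defaultdict(set)
        for hit in hits:
            good = hit.mismatches <= max_mismatches
            overlaps = (hit.chrom == tr.chrom
                        and hit.start <= tr.end + flank
                        and hit.end >= tr.start - flank)
            if good and overlaps:
                matched[hit.probeset].add(hit.probe)
        # Keep a probeset only if enough of its probes matched this locus.
        for probeset, probes in sorted(matched.items()):
            if len(probes) / probeset_sizes[probeset] >= min_fraction:
                xrefs[tr.stable_id].append(probeset)
    return dict(xrefs)


if __name__ == "__main__":
    hits = [
        ProbeHit("p1", "PS1", "1", 100, 124, 0),
        ProbeHit("p2", "PS1", "1", 200, 224, 1),
        ProbeHit("p3", "PS1", "1", 300, 324, 3),  # too many mismatches
    ]
    transcripts = [Transcript("T1", "1", 50, 400), Transcript("T2", "2", 50, 400)]
    print(annotate_transcripts(hits, transcripts, {"PS1": 4}))
```

In this invented data set, two of the four PS1 probes align well within T1, so T1 is annotated with PS1 while T2 receives nothing.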

The ‘arrays’ environment provides easily accessible and interactive command line functions to help run and administer the array mapping pipeline. Recent developments include broader array format support and multi-species capability, along with capture of much more detailed mapping information. This data has yet to be seen in the Ensembl browser, but from release 55 we will start redirecting the web displays to use the eFG data, with a view to developing a more detailed ‘Probe’ panel at some point later in the year.

We will endeavour to provide alignments and mappings of all popular arrays; for all others, we invite you to try out the eFG ‘arrays’ environment. For more information, check out (literally):

ensembl-functgenomics/docs/array_mapping.txt

Or see it online here.

If you have any questions, please mail ensembl-dev@ebi.ac.uk

The Ensembl Genome Browser project is pleased to announce a workshop on 22 May as a satellite meeting of the European Human Genetics Conference in Vienna, Austria. This full-day workshop is aimed at geneticists and life scientists, and will explore genes, variations, and comparative information using the browser’s new interface, released in November 2008. An introduction to large-scale data retrieval with BioMart will be included. We will also feature brief introductions to the European Genotype Archive (EGA) and the 1000 Genomes Project. The format of our browser workshops is described on our outreach page.

The course on 22 May will be held at a central location: the Vienna University Computer Service.

The workshop is free, but places are limited. Please register if you will be attending.

For May we have the following Ensembl events:

29 April – 1 May: Ensembl Developers workshop at the University of Cambridge, Cambridge, UK
8 May: Browser workshop at Imperial College, London, UK
11-13 May: Ensembl module in the Wellcome Trust Open Door Workshop – Working with the Human Genome Sequence, Hinxton, UK
11-15 May: Ensembl module in the EBI hands-on training A walkthrough EBI Bioinformatics Resources, Hinxton, UK
12-13 May: Ensembl module in the EBI roadshow at the Université Victor Segalen Bordeaux 2, Bordeaux, France
19-21 May: Ensembl module in the EBI roadshow at the Universidade de Santiago de Compostela, Santiago de Compostela, Spain
22 May: Browser workshop at the European Human Genetics Conference, Vienna, Austria
26 May: Ensembl Developers workshop at the VIB Flanders Interuniversity Institute of Biotechnology, Ghent, Belgium
27-28 May: Browser workshop at the Erasmus MC Molecular Medicine Postgraduate School, Rotterdam, The Netherlands

For details about these and other upcoming events, please have a look at the complete list of Ensembl training events.

Today the long-awaited Ensembl Genomes went live! This is a ‘sister project’ focusing on those species that aren’t part of Ensembl, i.e. non-vertebrates. Please have a look at what the Ensembl Genomes team have to say about it themselves:

“We are delighted to announce the forthcoming release of Ensembl Bacteria, Ensembl Protists and Ensembl Metazoa, the first sites to be launched as part of the EBI’s “Ensembl Genomes” project to extend the use of the Ensembl browser to non-vertebrate genomes.

The following sites are available:

http://bacteria.ensembl.org
http://protists.ensembl.org
http://metazoa.ensembl.org

Additional sites for fungi and plants are in development and will be launched during the summer of this year.

In the Ensembl Genomes project, we are aiming to do two things: firstly, to work with particular communities to support the bioinformatic analysis of genome-scale data; and secondly, to provide an integrative portal to data from species of scientific interest from across the taxonomic space. In pursuit of both these aims, we will re-use and extend the proven Ensembl software system that has been developed by EBI and the Wellcome Trust Sanger Institute in the context of vertebrate genomics.

As with Ensembl, Ensembl Genomes will provide access to DNA and protein sequence, positional and functional annotation of protein-coding and non-protein coding genes, repeat analysis and other features and statistics. An interesting feature made available with the release of Ensembl Genomes is the inclusion of a multi-way comparative genomic analysis performed using a selection of species from bacteria to humans, and the production of gene trees showing the inferred ancestral relationships within deeply conserved protein families. Comparative resources are also provided at a narrower level (for example, DNA and protein-based analyses of individual bacterial clades). In partnership with collaborators, we are working on capturing gene expression, and population-scale variation data, in a number of contexts. More generally, we anticipate the ongoing enrichment of these resources through the integration of increasing quantities of high throughput data now becoming routinely available for all species.

Ensembl Genomes will provide access to data through the usual routes supported for vertebrate data: web-based browser, FTP site, programmatic API, DAS, and BioMart-style data warehouse, as well as text and sequence-based search.

We look forward to working with you as future producers and consumers of data. More information about the project is available at http://www.ensemblgenomes.org. We will be happy to receive any feedback you might wish to offer us at helpdesk@ensemblgenomes.org.”

Though the overall response has been good, a few Ensembl users are finding it difficult to switch from the old interface to the new browser launched in November 2008. For those users: functionality has not been lost. You should still be able to do the same tasks as before, in a faster interface.

We will post a series of tips to show you how to make the switch from the old interface to the new. If you still have trouble, please watch our video tutorial: Browsing Ensembl.

TIP: I want to use ExonView. Where is this now?

To view the full genomic sequences, exons and introns, go to any transcript. (Exons are transcript information. To see the exons page, go to a transcript tab, not the gene tab.) Click on the ‘Exons’ link under ‘Sequence’ at the left of any transcript page.

To show the full introns, click on ‘Configure this page’ at the left. Select ‘Show full intronic sequence’. Click ‘Save and Close’ at the top right corner of the menu window.

Still can’t find what you’re looking for? Email us.

For April we have the following Ensembl events:

31 March – 1 April: Browser workshop at the IGC, Oeiras, Portugal
2-3 April: Ensembl Developers workshop at the IGC, Oeiras, Portugal
16 April: Demo for the National Genetics Reference Lab, Manchester, UK
17 April: Browser workshop at Imperial College, London, UK
22 April: Browser workshop at Imperial College, London, UK *postponed to 8 May*
27-29 April: Ensembl module in the EBI roadshow at the University of Iceland, Reykjavik, Iceland
27-29 April: BioMart module in the EBI hands-on training Programmatic Access to Biological Databases (Java), Hinxton, UK
29 April – 1 May: Ensembl Developers workshop at the University of Cambridge, Cambridge, UK

For details about these and other upcoming events, please have a look at the complete list of Ensembl training events.


Ensembl just updated the live site and underlying databases to version 53.

Some new features include ‘Active Tracks’ and a searchable ‘Configure this page’!

Go to any region of the chromosome.

Click ‘Configure this page’ at the left.

‘Active tracks’ allows you to see (and deselect) all tracks that are turned on.

‘Search display’ allows you to search for tracks in the menus. In this example, we searched for UniProt. Tracks from different menus appear.

For more updates, including new species, variations, and Amazon Web Services, see the news.

We are already working on our next release (out late in April 2009) which will come with the following:

Data

Zebrafish
We will be releasing a new genebuild for zebrafish (with updated repeat masking) based on the latest assembly Zv8. Thus, we’ll have a new gene set (with new probeset mappings).

Horse
A gene patch (fixing split genes) based on human/mouse 1:1 orthologues. Therefore we have a new gene set.

Human

  • cDNA update
  • New ensembl-vega merge delivering a “new gene set”.

Mouse

  • cDNA update
  • New ensembl-vega comparison, delivering a “new gene set”.

New gene sets (ncRNA genes) for several low coverage genomes:
Sloth (Choloepus hoffmanni), armadillo (Dasypus novemcinctus), kangaroo rat (Dipodomys ordii), elephant (Loxodonta africana), hyrax (Procavia capensis), megabat (Pteropus vampyrus), tarsier (Tarsius syrichta), dolphin (Tursiops truncatus) and alpaca (Vicugna pacos).

Mart

  • New functional genomics mart

Core
Minor schema changes

  • cDNA update
  • Update versions (patch_53_54_a.sql)
  • Increase size of oligo_probe.name (patch_53_54_b.sql)
  • Increase size of external_db.db_name (patch_53_54_c.sql)
  • Move analysis_id from identity_xref to object_xref (patch_53_54_d.sql)
  • Increase size of analysis.logic_name (patch_53_54_e.sql)


Variation and Functional Genomics

  • Schema change to source table to add description column for web display
  • Updated zebrafish database
  • Import Illumina data whenever available
  • Recalculate consequence type for mouse regulatory feature
  • eFG array mapping: Human, Mouse, Rat, Drosophila
  • Affymetrix (UTR/IVT + ST), Illumina (WG)

New mouse DNAse data to support the first Mouse RegulatoryBuild

Code Other

  • Amazon EC2 public datasets updated
  • New GO database (ensembl_ontology_54) and API
  • Changing default behaviour of TranscriptAdaptor
  • Translation attribs modified
  • Remove entries with spaces from species.classification
  • Gene name and xref projections


Pairwise alignments

Update the pairwise alignments for zebrafish (Danio rerio):

  • human-zebrafish translated BLAT-NET
  • mouse-zebrafish translated BLAT-NET
  • rat-zebrafish translated BLAT-NET
  • chicken-zebrafish translated BLAT-NET
  • frog-zebrafish translated BLAT-NET
  • tetraodon-zebrafish translated BLAT-NET
  • fugu-zebrafish translated BLAT-NET
  • medaka-zebrafish translated BLAT-NET
  • stickleback-zebrafish translated BLAT-NET
  • Ciona savignyi-zebrafish translated BLAT-NET
  • Ciona intestinalis-zebrafish translated BLAT-NET

Add new alignments for medaka:

  • human-medaka BLASTZ-NET (imported from UCSC)
  • mouse-medaka BLASTZ-NET (imported from UCSC)


The following files will be available for download:

  • EMF dumps for GeneTrees
  • EMF dumps for EPO and PECAN multiple alignments
  • BED files for 31 way GERP constrained elements
  • BED files for 12 way GERP constrained elements
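For readers planning to work with the constrained-element BED files, here is a minimal sketch of reading them. It assumes only the standard BED convention (whitespace-separated chrom, 0-based start, exclusive end in the first three columns); the exact columns of the Ensembl dumps are not described here, and the function names and sample lines are invented for illustration.

```python
def parse_bed(lines):
    """Yield (chrom, start, end) from BED lines, skipping headers and blanks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        # BED starts are 0-based and ends are exclusive.
        yield fields[0], int(fields[1]), int(fields[2])


def constrained_bases(lines):
    """Sum element lengths per chromosome (assumes non-overlapping elements)."""
    totals = {}
    for chrom, start, end in parse_bed(lines):
        totals[chrom] = totals.get(chrom, 0) + (end - start)
    return totals


if __name__ == "__main__":
    sample = [  # invented coordinates, not taken from a real dump
        "track name=example",
        "1\t1000\t1050\telement_1",
        "1\t2000\t2010\telement_2",
        "X\t500\t520\telement_3",
    ]
    print(constrained_bases(sample))
```

To read a downloaded file, pass an open file handle in place of the sample list.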

Homologies and families

  • 49-way GeneTrees and Homologies, with new/updated gene sets and assemblies.
  • Multiple Sequence Alignments with consistency-based MCoffee meta-aligner (mafftgins+muscle+kalign+probcons).
  • Pairwise gene-based dN/dS calculations for high coverage species pairs.
  • Updated MCL families including all Ensembl AS isoforms and latest UniProt Metazoa.
  • Multiple Sequence Alignments with MAFFT