This work has been postponed and the Ensembl site is now running as normal. We will update the blog informing you when this work is rescheduled.

The Systems team at the Sanger Institute need to carry out emergency maintenance on some of the database servers running the Ensembl websites from 9.30 am tomorrow morning. This means that we will be running the live Ensembl site with reduced functionality, and the Pre, Vega, NCBI36 and archive sites will temporarily be offline.

BLAST, BioMart, user logins and search will all be unavailable. The entire Ensembl website should be considered “at risk”. However, should the main site go down, the two mirror sites, uswest.ensembl.org and useast.ensembl.org, should still be available, although these too will have reduced functionality.

We apologise for any inconvenience this may cause and will work to keep the disruption as short as possible.

The Systems team at the Sanger Institute have resolved the network problems and http://www.ensembl.org is now back up and running as normal.

The Sanger Institute is currently experiencing network problems which have taken the main Ensembl site offline. Please use one of our two mirror sites, http://uswest.ensembl.org and http://useast.ensembl.org. We will update the blog when the main site is working again. We apologise for any inconvenience caused.

We are currently experiencing a power failure which has taken the main Ensembl site offline. Our mirror site uswest.ensembl.org is still running so please use this until we have the main site back online.

The Ensembl project is pleased to announce release 56 of Ensembl (http://e56.ensembl.org/). Highlights of this release are:

Reintroduction of our multi-species views. Alignments (image), formerly alignsliceview, shows pairwise or multiple alignments from the Ensembl Compara database, highlighting any gaps in the alignment.

Multi-species view, formerly known as multicontigview, displays pairwise alignments without gaps; multiple pairwise alignments can be configured to create a multiple alignment display. As well as genes, other types of features such as regulatory features can be displayed in this view, making this a very useful display for comparative genomic analysis.

A new tab has been added in release 56 based on a Regulatory Feature object. This will enable better display of some of the data underlying the Ensembl regulatory build. The new pages are accessed from the gene displays by clicking on the ‘Regulation’ link in the left-hand menu and then clicking on a regulatory stable ID in either the image popup menus or the table.

From release 56, users can upload wiggle plot data in WIG and bedGraph formats and view this data on various location-based views. At the moment, only a single style, “wiggle”, is available on Region in Detail, whereas a selection of density plots are available on whole chromosome and karyotype images. In addition, Region in Detail now supports greyscale rendering of BED scores via the useScore parameter in the file, and rendering of features in different colours via the itemRgb parameter and per-feature values.
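
As a rough illustration of the upload formats mentioned above, the following Python sketch writes a small bedGraph track and a BED track that uses useScore and itemRgb. The track-line syntax follows the UCSC format conventions; the track names, coordinates and values are made up for the example, and the helper functions are hypothetical rather than part of any Ensembl tool.

```python
def bedgraph_track(name, intervals):
    """Return bedGraph text: a track line, then 'chrom start end value' rows."""
    lines = ['track type=bedGraph name="%s"' % name]
    for chrom, start, end, value in intervals:
        lines.append("%s\t%d\t%d\t%g" % (chrom, start, end, value))
    return "\n".join(lines) + "\n"


def bed_track(name, features, use_score=False, item_rgb=False):
    """Return 9-column BED text with optional useScore / itemRgb track settings."""
    track = 'track name="%s"' % name
    if use_score:
        track += " useScore=1"      # ask for score-based greyscale shading
    if item_rgb:
        track += ' itemRgb="On"'    # ask for per-feature colours (column 9)
    lines = [track]
    for chrom, start, end, feat, score, strand, rgb in features:
        # thickStart/thickEnd simply repeat start/end in this sketch
        fields = [chrom, start, end, feat, score, strand, start, end,
                  "%d,%d,%d" % rgb]
        lines.append("\t".join(str(f) for f in fields))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(bedgraph_track("demo wiggle", [("chr1", 100, 200, 0.5)]))
    print(bed_track("demo features",
                    [("chr1", 100, 200, "f1", 900, "+", (255, 0, 0))],
                    use_score=True, item_rgb=True))
```

The resulting text can be saved to a file and uploaded; whether a given view honours both settings at once is not something this sketch tries to establish.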

New data in this release includes gene sets for two new species (Pig and Marmoset) and a new gene set on the existing Rat Rnor3.4 assembly. Also in this release is an updated human gene set which includes all the Havana manual annotation in the merge with the Ensembl automatic annotation set. This set represents the ENCODE project GENCODE 3b gene set. Also included is a new human variation database based on dbSNP 130 and mapped to assembly GRCh37.

For more information on these and other new features in this release visit:

http://e56.ensembl.org/info/website/news/index.html

We are currently working on our next release which is due at the end of June 2009 and will contain the following:

Data

Human GRCh37
We will be releasing a new genebuild for human based on the latest assembly GRCh37 from the Genome Reference Consortium. A preliminary version of this assembly is available now in Ensembl Pre! Due to the new assembly we will have:

  • Updated repeat masking
  • New probeset mappings
  • cDNA update
  • A new Ensembl-Vega merge delivering a new gene set

Wallaby
Ensembl 55 will include the 2X genome for the Tammar Wallaby (Macropus eugenii); this will be a projection build similar to our other 2X species.

C. elegans
We will also include an import of the WormBase release WS200 database for C. elegans.

Anole lizard – A gene patch incorporating the gene set provided by Chris Ponting at Oxford University means that we have a new gene set for the green anole lizard (Anolis carolinensis).

Mouse – The mouse cDNA alignments have been updated.

Zebrafinch – There will be an updated gene set for the 6X zebra finch genome.

Zebrafish – Non-coding RNAs will be added to the Zv8 zebrafish assembly and there will also be some changes to protein coding gene models and new repeats and expression patterns.

Core

Schema Changes

  • Patch to update versions (patch_54_55_a.sql).
  • Add the missing types to go_xref (patch_54_55_b.sql).
  • Add new table dependent_xref (this will hold the dependencies for the xrefs, i.e. if an EMBL entry comes from a UniProt entry, this relationship will be in the table) (patch_54_55_d.sql).
  • Add new tables for alternative splicing/transcript events (patch_54_55_c.sql).
  • Add new column ‘is_constitutive’ to the exon table (patch_54_55_e.sql)

Xrefs
Xrefs will be run for Human, Macaque, Opossum, Chimp, Chicken, Dog and Mouse (including FANTOM Xrefs).

Ontology database schema and tools
The ensembl_go_NN databases are no longer being built. Instead we are replacing these with the ensembl_ontology_NN database, which can be connected to using the core API.

Assembly mapping
Some of the databases will contain mapping coordinates between current and previous assemblies:

  • human: mapping from current GRCh37 to NCBI36, NCBI35 and NCBI34
  • mouse: mapping from current NCBIM37 to NCBIM36, NCBIM35 and NCBIM34

Other changes
  • API support for alternative transcripts/splicing events will be added
  • API support for constitutive exons will be added
  • Deprecated API modules will be removed
  • All slices will be created using the new_fast method from the SliceAdaptor to improve performance
  • seq_region seq edit support will be added. Seq_edits can already be stored and retrieved, but they are not yet used when retrieving the sequence data. This will be changed so that “_rna_edit” attributes in the seq_region_attrib table are applied and the returned sequence is changed accordingly.
  • MySQL and FASTA dumps will be copied to the Amazon Public Datasets project
  • Gene name and xref projections
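
To make the seq_region sequence-edit item in the list above more concrete, here is a minimal Python sketch of how an edit could be applied to a sequence. It assumes each “_rna_edit” value is a whitespace-separated “start end replacement” triple with 1-based inclusive coordinates, and that an insertion is written with end = start - 1. The function names are hypothetical, and this is not the API's own implementation.

```python
def parse_seq_edit(value):
    """Split an assumed 'start end replacement' string into its three parts."""
    parts = value.split()
    start, end = int(parts[0]), int(parts[1])
    # A missing third field is treated as an empty replacement (a deletion).
    alt = parts[2] if len(parts) > 2 else ""
    return start, end, alt


def apply_seq_edits(sequence, edit_values):
    """Apply every edit to the sequence and return the edited copy."""
    # Work from the rightmost edit to the leftmost so that earlier
    # coordinates are not shifted by length-changing edits.
    edits = sorted((parse_seq_edit(v) for v in edit_values), reverse=True)
    for start, end, alt in edits:
        sequence = sequence[:start - 1] + alt + sequence[end:]
    return sequence


if __name__ == "__main__":
    # Replace bases 3-4 with TT, and insert a G before base 1.
    print(apply_seq_edits("ACGTACGT", ["3 4 TT", "1 0 G"]))
```

Applying the edits from right to left is a design choice made here to keep the original coordinates valid throughout; overlapping edits are not handled.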

Mart
  • New functional genomics mart
  • A new Probe section added to Ensembl mart
  • New ontology mart
  • Constitutive exon information will be re-added to Ensembl mart

Variation
  • There will be a new human variation database generated by mapping NCBI36 coordinates to GRCh37 (using dbSNP 129)
  • Illumina array data for SNP/CNV is to be added
  • Transcript variations for Zebrafish and Zebrafinch will be recalculated to include information from the new gene sets
  • Schema change – added a call to get consequence_type

Functional genomics
  • Human Regulatory Build will be updated using the GRCh37 assembly
  • Probe alignment and transcript annotation for all species will migrate from the core databases to the functional genomics databases; this includes Affymetrix, Illumina, Codelink and Phalanx
  • Schema change – an is_current field is to be added to the coord_system table

Comparative genomics

Alignments – The new human assembly means that the following alignments will be regenerated:

  • 9 eutherian mammals EPO multiple alignments
  • 31 eutherian mammals EPO multiple alignments
  • 12 amniota vertebrates Pecan multiple alignments
  • 4 catarrhini primate EPO multiple alignments
  • Pairwise BLASTZ-NET alignments of human against each of the other 9 and 31 eutherian mammals
  • Additional pairwise BLASTZ-NET alignments will be run for human-opossum, human-platypus, human-chicken and human-wallaby
  • Translated BLAT-NET will be regenerated for human against fugu, X.tropicalis, C.intestinalis, C.savignyi, stickleback, medaka, chicken, zebrafish, tetraodon, zebrafinch and anole lizard

Synteny will be recalculated for: rat vs. human, chicken vs. human and human vs. macaque, dog, chimpanzee, platypus, opossum, mouse, orangutan, horse and cow

Homologies and families

  • 50 way GeneTrees and homologies with new/updated genebuilds and assemblies
  • Clustering using hcluster_sg
  • Multiple Sequence Alignments using consistency-based MCoffee meta-aligner (mafftgins + muscle + kalign + probcons) and new exon-skipping aware “skipper” algorithm.
  • New ‘putative gene split’ and ‘distant paralog’ homology types
  • Pairwise gene-based dN/dS calculations for high coverage species pairs
  • Updated MCL families including all Ensembl transcript isoforms and newest UniProt Metazoa
  • Multiple sequence alignments with MAFFT
  • Stable IDs for GeneTrees (ENSGT00550NNNNNNNNN) and MCL Families (ENSFM00550NNNNNNNNN).
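
The stable ID patterns in the last item can be checked with a short regular expression. The Python sketch below takes the patterns literally as written (a prefix of ENSGT or ENSFM, a five-digit block, then nine digits); the function name and the example IDs are made up for illustration, and no meaning is assumed for the five-digit block.

```python
import re

# ENSGT / ENSFM, a five-digit block (00550 in the patterns above), nine digits.
STABLE_ID = re.compile(r"^ENS(GT|FM)(\d{5})(\d{9})$")

KINDS = {"GT": "GeneTree", "FM": "Family"}


def parse_stable_id(stable_id):
    """Return (kind, five_digit_block, serial_number), or None if no match."""
    match = STABLE_ID.match(stable_id)
    if match is None:
        return None
    return KINDS[match.group(1)], match.group(2), int(match.group(3))


if __name__ == "__main__":
    for candidate in ("ENSGT00550000000001", "ENSFM00550000000042", "ENSG00000139618"):
        print(candidate, parse_stable_id(candidate))
```

An ID that does not fit either pattern, such as a gene stable ID, returns None, so the function can double as a simple filter.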