There are still a few more Ensembl training events before the end of the year.

Browser workshops:

UNAM, Mexico City, Mexico (1-2 Dec)
UNAM, Cuernavaca, Mexico (5 Dec) (+ departmental seminar 4 Dec)

Amsterdam, The Netherlands (19 Dec)

Developers workshop:

University of Cambridge, UK (1-3 Dec)

In addition, Ensembl will feature as part of the following courses:

Wellcome Trust Open Door Workshop ‘Working with the Human Genome Sequence’ (1-2 Dec, Hinxton, Cambridge, UK) and Genes en evolución, ecologia e conservación (8-9 Dec, La Paz, Baja California, Mexico)

Do you know a bit of Perl? Ensembl hosts an API (Application Programming Interface) which uses Object-Oriented Perl to extract data from Ensembl databases. This API is public, and anyone can use it to access the data in the Ensembl databases programmatically. We understand that not everyone is used to Object-Oriented code, although many people have basic Perl skills and are interested in using our datasets. For that kind of bioinformatician, I would recommend a recent short read in O’Reilly’s Broadcast:

Beginners Introduction to Object-Oriented Programming with Perl – O’Reilly Broadcast

And for the more advanced readers, the classic reference book on OO Perl is Damian Conway’s Object Oriented Perl, which, apart from being very informative, has a really cool cover 🙂

We are always trying to lower the barrier to entry for research communities interested in using the Ensembl database in programmatic ways that make use of all the complexity associated with the generation of our data. That’s why our API is public and well documented. You can learn about our API by attending one of our API workshops for free (e.g. 1-3 December – Univ. Cambridge, UK). We are currently trying to smooth things out even more, working on ways to make it easier to download everything needed to use the API and to have the example scripts running on your computer in the minimum number of steps. Stay tuned for news on this soon…

Ensembl is attending the 22nd International Mammalian Genome Conference taking place in Prague (Czech Republic) from 2-5 November 2008. This meeting starts with three bioinformatics workshops on Sunday (2nd November) at the Institute of Molecular Genetics (AS CR). One of these workshops will focus on Ensembl, discussing new developments and featuring a preview of our new interface. We will be starting at 9:00 (Seminar Room 3.102, on the third floor). You can download the workshop materials here (exercises and tutorials). As part of our commitment to the EURATools consortium, we’ll be focusing on rat genomics, but if you work with any other species annotated in Ensembl, you are welcome.
See you in Prague!

Only 3 continents to cover this time, but November will be even busier for the Ensembl trainers than October.

Ensembl will feature as part of the following courses:

‘Computational & Comparative Genomics’ (5-11 Nov, Cold Spring Harbor Laboratory, New York, US)
Wellcome Trust Open Door Workshop ‘Working with the Human Genome Sequence’ (10-12 Nov, Wellcome Trust Genome Campus, Hinxton, UK)
Hands-on training at EBI ‘Programmatic access in Java: webservices & work flows’ (24-27 Nov, Wellcome Trust Genome Campus, Hinxton, UK)

Browser workshops will be given at the following locations:

Prague, Czech Republic (2 Nov)
Madrid, Spain (5 Nov)
Newcastle, UK (13 Nov)
Cambridge, UK (13-14 Nov)
Naples, Italy (19 Nov)

North America:
Cambridge, Massachusetts, US (12 Nov & 14 Nov)
Boston, Massachusetts, US (13 Nov)

Kuala Lumpur, Malaysia (24-25 Nov)
Sabah, Malaysia (27-28 Nov)

Ensembl is currently down due to a power outage in the data centre at the Sanger Institute last night. Power has been restored, but it will take some time to restore all of the services.

We are working to get things up and running and expect that Ensembl will be back mid to late morning UK time.

New design
You will already have seen a number of emails about the upcoming Ensembl 51 release – the web team are working hard to tidy up the loose ends of the release! We have got most of the major views ready, and are just working on some of the views you may never have found before. As a taster I’m posting a few screenshots from our development site. The first shows the new page layout for the graphical display of genomic regions (the old contigview). You will see many of the new design decisions in this screenshot:

  • There are more views per object as we have broken up the large single pages into smaller components;
  • Tabs for the different focus objects – in this case Gene and Location. Transcript and Variation feature are the other tabs available;
  • A tree of all information available about the focus feature on the left hand side;
  • Left/right pagination buttons to allow you to navigate between all the information we have about the focus object.
  • “General” and “local” tools areas

Under the hood!

There have been a large number of changes under the hood of the web-site. Notable changes have been:

  • Use of a modified version of memcached to store and retrieve cached images, static and dynamic content, and user settings;
  • Re-writing the configuration code to automagically detect the contents of the databases and try and display the content appropriately;
  • Breaking up of the component code into separate modules;
  • Removing the need for a script per view – by using “routeing” style URL parsing to work out what objects are to be rendered and how… e.g. /Gene/Compara_Tree/Text displays the text version of a gene’s homology tree.
  • More and easier to configure renderers for drawing code.
  • Striving for standards compliance in both XHTML and CSS, which should allow us to support modern web browsers more easily. We will be actively supporting Firefox 3+, Internet Explorer 7+ and Safari 3+ (and other similar browsers), while trying to make sure that the site is still workable in other browsers (the site appears to work in Opera 9.25+).
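
As a rough illustration of the “routeing” style URL parsing mentioned in the list above, here is a minimal Python sketch. The real site is written in Perl; the function name `parse_route` and the names given to the path components (`object_type`, `action`, `function`) are assumptions made for this example only, not the actual Ensembl web code.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    object_type: str           # e.g. "Gene" or "Location"
    action: str                # e.g. "Compara_Tree"
    function: Optional[str]    # e.g. "Text"; None when the path has no third part


def parse_route(path: str) -> Route:
    """Split a URL path such as /Gene/Compara_Tree/Text into its components."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2 or len(parts) > 3:
        raise ValueError(f"cannot route path: {path!r}")
    function = parts[2] if len(parts) == 3 else None
    return Route(parts[0], parts[1], function)


route = parse_route("/Gene/Compara_Tree/Text")
print(route.object_type, route.action, route.function)
```

With a scheme like this, a single dispatcher can look up the right component for each (object, action) pair, rather than needing one script per view.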

New configuration panel

All configuration of the site and individual views has been moved to a common “Configuration dialog” box.

  • The old “yellow menus” are replaced by a more expansive and easier to navigate tree of features. This is important now that there are nearly 200 individual tracks in the Human Location view page.
  • There are more choices to display some tracks – rather than just turning them on and off, you can decide how you wish them to be displayed.
  • Configuration for other pages is loaded in a similar way.
  • The site has a common site-wide image width setting.
  • The configuration panel is also where you will manage your accounts, upload data, and attach DAS and URL-based data.

Different renderers

For different data types we now support different renderers – not just collapsed and expanded.
For example:

  • For genomic alignments we support ungrouped features (all on one line), normal grouped and bumped features at both full and half height, and now also “stacked” features – “2 pixel” high glyphs.
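
To make the difference between the renderers a little more concrete, here is a minimal Python sketch of “bumping”, assuming the term means that overlapping features are moved onto separate rows so that they do not collide (an ungrouped track would simply put everything on row 0). The function name `bump` and the greedy first-fit strategy are assumptions for illustration, not the actual Ensembl drawing code.

```python
from typing import List, Tuple

Feature = Tuple[int, int]  # (start, end), inclusive coordinates


def bump(features: List[Feature]) -> List[int]:
    """Return a row index per feature so that no two features in a row overlap."""
    # Process features from left to right, remembering original positions.
    order = sorted(range(len(features)), key=lambda i: features[i])
    row_ends: List[int] = []      # right-most occupied coordinate of each row
    rows = [0] * len(features)
    for i in order:
        start, end = features[i]
        for row, last_end in enumerate(row_ends):
            if start > last_end:  # fits to the right of everything on this row
                row_ends[row] = end
                rows[i] = row
                break
        else:
            # No existing row has space, so open a new one.
            row_ends.append(end)
            rows[i] = len(row_ends) - 1
    return rows


print(bump([(1, 10), (5, 15), (12, 20), (16, 30)]))  # prints [0, 1, 0, 1]
```

A half-height or “stacked” renderer could reuse the same row assignment and only change the glyph height used when drawing each row.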

We hope when you see the new interface that you will find it more intuitive, more discoverable and faster to use and most importantly more productive for the research work that you are doing.

The Ensembl team has been involved in several activities in Hyderabad (India) during the last few days, making the most of HUGO’s 13th Human Genome Meeting (HGM2008).

A satellite workshop has been organised within the Open Door Workshop framework at the Centre for Cellular and Molecular Biology (CCMB). Over 40 scientists from different countries had the opportunity to learn about different resources freely available on the Internet, providing us with useful feedback.

Our presence at HGM2008 in the EBI booth gave us the opportunity to make several contacts that should hopefully allow us to organise a series of workshops around India next year. If you are interested in knowing more about this, or would like to ask about the possibility of hosting one of our workshops, you can contact us.

Greetings from India भारत से नमस्ते

As usual October is a busy month for the Ensembl trainers with workshops on 4(!) different continents.

From 1-3 Oct Ensembl will feature in the Wellcome Trust Open Door Workshop “Working with the Human Genome Sequence” in Hyderabad, India, and from 6-8 Oct in the EBI hands-on workshop “A two-day dip into the EBI’s data resources: Understanding your data” in Hinxton, UK.

Upcoming browser workshops:
9-10 Oct: J. Craig Venter Institute, Rockville, MD, US
14 Oct: National Human Genome Research Institute (NHGRI), Bethesda, MD, US
15 Oct: National Human Genome Research Institute (NHGRI), Bethesda, MD, US
16-17 Oct: University of the Free State, Bloemfontein, South Africa
20-21 Oct: University of the Witwatersrand, Johannesburg, South Africa
22 Oct: University of Nottingham, Nottingham, UK
23-24 Oct: University of the Western Cape, Cape Town, South Africa
29-30 Oct: EBI Roadshow, Dublin, Ireland

Considering hosting an Ensembl workshop yourself? Please contact Xose Fernandez.

Steve posted the news that we’re delaying our new release for at least two more weeks. The message is pasted in here:

Hi all

In our Intentions Summary mail for release 51 we stated that the release was scheduled for early/mid September. The 51 release will include significant updates and improvements to the web interface. We are delaying release while we complete development on these. We are working to get the release out as soon as possible, and are now aiming for end September/early October. I apologise for this delay.



Dr Steve Searle
Ensembl Project Leader, Sanger

It is always so frustrating to delay, but of course, far more important to have a working site than something only partly working. Welcome to delivering high-end services.

We took on a lot of things to change in this web refresh. For most users the main thing they will notice is the entirely new web layout. This was driven by our surveys of users, who mainly complained about being buried in too many displays and too much data. We then took around 4 months working with user groups and trialling different layouts (many thanks to those who participated), which in some cases led to significant changes to our original designs (we now have a hybrid “tab and left-hand-side” approach, voted as best by ~60% of people, with the other three options splitting the rest of the vote). We’re very excited about this new layout going live as it just looks cleaner and less cluttered, and yet provides more information. The other thing people will notice is that it is just faster. As the saying goes, you can’t be too rich, too thin or have your websites go too fast.

Making a website go faster is harder than it might look. It involves all sorts of things – the bandwidth between your machines and us, the speed of the servers, the connectivity of servers to databases, the speed of the API, the speed of the database to disk, the management of the huge number of simultaneous users we have, the size of the HTML returned and finally the render speed in your browser. All of these contribute to the overall perception of “speed”. Under the hood we’ve been working on all these aspects. Internally, a big change is that we have moved away from needing a common file system for our web farm to work off. Previously, when your browser asked for a contigview page, our servers generated HTML with an image, and that image was written to the common disk. The browser parsed the image tag and asked for this image by sending a request which – and this is the critical bit – in all likelihood was served by a different server in our web farm. That server then went to the common file system to pick up the file and send it back. Many times a critical bottleneck has been read/write on this shared file system. In the new system this has all gone, and the images are stored in a memory-based common store. This means that we remove this bottleneck (which will be the first big effect) and that we will be able to cache a lot more – the hope is that many of the identical pictures for the common species will be served entirely from memory in the new system. Another important change has been aggressively slimming our HTML. Currently all sorts of files – often very small – are requested by each page, just to see if they have changed. We’ve consolidated a lot of these files – and compressed them – and then also optimised them for render speed.
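
The idea of a memory-based common store can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SharedStore` is a plain dictionary standing in for a shared store such as memcached, and `image_key`, `get_image` and `fake_render` are hypothetical names, not part of the Ensembl code base. The point is that any server can fetch an image by key, and that identical requests map to the same key and are rendered only once.

```python
import hashlib
from typing import Callable, Dict, List, Optional, Tuple


class SharedStore:
    """Stand-in for a memory-based store shared by all web servers (a dict here)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


def image_key(species: str, region: str, width: int) -> str:
    """Identical requests produce the same key, so the image can be reused."""
    raw = f"{species}|{region}|{width}".encode()
    return hashlib.sha1(raw).hexdigest()


def get_image(store: SharedStore, species: str, region: str, width: int,
              render: Callable[[str, str, int], bytes]) -> bytes:
    """Return the cached image if present; otherwise render it and cache it."""
    key = image_key(species, region, width)
    image = store.get(key)
    if image is None:
        image = render(species, region, width)
        store.set(key, image)
    return image


calls: List[Tuple[str, str, int]] = []


def fake_render(species: str, region: str, width: int) -> bytes:
    """Hypothetical renderer that records each call so cache hits are visible."""
    calls.append((species, region, width))
    return b"image for " + region.encode()


store = SharedStore()
first = get_image(store, "Homo_sapiens", "1:1000-2000", 800, fake_render)
second = get_image(store, "Homo_sapiens", "1:1000-2000", 800, fake_render)
print(first == second, len(calls))
```

The second request is answered from the store without calling the renderer again, which is the effect hoped for with popular views of the common species.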

A variety of further speed improvements are not in this release but are coming at the end of 2008 or in early 2009. Our API has a new concept, collections, which better handles the case of zoomed-out views, where we know the renderers will not be able to draw every object. Instead a collection – which may be rendered as a union or a density or something similar – will be provided. The other thing on the horizon is us setting up a US mirror on the west coast. For the last year we have been extensively monitoring the speed of Ensembl from different sites, and there is a large increase in retrieval time on the north-west coast of the US. We’ve been investigating quite why this is (and learning lots more about the backbone of the internet than we knew before), but it seems as if the simplest way of getting good speed on the west coast is to just run a mirror over there. Probably 2009 for that to go live.
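
To illustrate what a “density” summary for a zoomed-out view might look like, here is a minimal Python sketch. It is not the Ensembl collections API (which is Perl); the function name `density` and its parameters are assumptions for this example. Instead of returning every feature, it returns one count per equal-width window across the region.

```python
from typing import List, Tuple


def density(features: List[Tuple[int, int]], region_start: int,
            region_end: int, bins: int) -> List[int]:
    """Count how many features overlap each of `bins` equal-width windows."""
    length = region_end - region_start + 1
    counts = [0] * bins
    for start, end in features:
        # Clip the feature to the region and skip it if it lies outside.
        s = max(start, region_start)
        e = min(end, region_end)
        if s > e:
            continue
        first = (s - region_start) * bins // length
        last = (e - region_start) * bins // length
        for b in range(first, last + 1):
            counts[b] += 1
    return counts


print(density([(1, 10), (5, 30), (95, 100)], 1, 100, 4))  # prints [2, 1, 0, 1]
```

However many features fall in the region, the renderer then only has to draw `bins` values, which is what makes very zoomed-out views cheap.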

Back to the website. It looks so much better – and has much better hardware characteristics (our shared file system is … well … rather 2004 technology and needs pretty constant care at the moment) – that I can’t wait until it comes out. But there is absolutely no point in having a site crippled in functionality, even though we’ve got many of the user interface and technical issues right. The sticking point at the moment is the configuration panel. This comes up as a “modal” box on top of the page, offering a lot of options to choose from, but not a bewildering set of options on each page. To cope with the 200-odd different tracks to switch on and off, the box has to have tabs and friendly, browseable hierarchies. To get all this to work in a nice, friendly, slick way… that’s a lot of JavaScript.

And a lot of JavaScript means a lot of browser-compatibility headaches. Even using JS libraries – Prototype and script.aculo.us (I think – James Smith can tell you the details!) – there are all sorts of details that might not work in quite the same way on IE5 compared to IE6. Or Firefox. Or Safari. And it must degrade at least functionally without JS. And of course work, and render fast. This modal box is the last, complex thing to get sorted.

We’re close. I’ve seen the box come up on James’ screen. I hear Steve has seen it come up and tracks change, and has seen the link between tracks and changes. The API for the configuration system was gutted and is much better. But it’s got to work on all the main browsers. For all our genomes, in particular Human and Mouse. And this is just tricky, fiddly work.

We’re not quite there yet. We’re really close, and so much is working that it is just excruciating. But we need another couple of weeks. James is being shielded from other jobs by Steve and others; Eugene is torture-testing memcachedb to stress test the system before it goes live; Xose, Bert and Giulietta are writing help; Beth and Anne are writing the additional pagelets inside the new geneview and transcriptview. And it all looks really good.

So – apologies – we thought we’d be launching in July. We thought we’d be launching in September. We still might just do that, but then again, it might well be October. If it goes any later I will have no hair.

But it does look really good.

It is definitely worth the wait. Like Guinness.


After the summer break we are getting up to speed again with our training events:

14-16 Sep: Ensembl User Meeting, Hinxton, UK

17-19 Sep: Browser workshops and presentations, Erasmus MC Molecular Medicine Postgraduate School, Rotterdam, The Netherlands

22 Sep: Browser workshop, VIB Flanders Interuniversity Institute of Biotechnology, Antwerp, Belgium

We also have a complete list of all upcoming training events for the coming months available. Are we not coming to a location close to you? Then why not host an Ensembl workshop yourself? For more details, please contact Xose Fernandez.