Ensembl API on Windows

Following on from previous installation guides, here we walk the seldom-trodden path to a Windows development environment for Ensembl. Linux and Mac OS users are well served by our Installation and Mac guides, so…

How do I install the Ensembl API on Windows?

Caveat: These methods have been tested on Windows 7 64-bit Home Premium.

Method 1, “The easiest way” – Use the Ensembl VM

Ensembl builds a complete downloadable Virtual Machine image that provides everything you need to access Ensembl data. For this you need to install VirtualBox, following our guide. If you really struggle with Linux, you may find the virtual machine hard to use, but take a look at the next section before you give up.

[Screenshot: the Ensembl virtual machine desktop]

Method 2, “The native way” – Install many dependencies

By default, Windows lacks many of the development tools required to use the Ensembl API. It will take some time to get up and running.

Do What I Mean Perl

This bundle contains a full selection of libraries necessary for modern Perl development. There are other Perl distributions if you prefer, but this one is the most all-inclusive.

An editor

DWIMperl above comes with Padre, a pure Perl editor, but many prefer to write code in other editors.

An Archive/zip file extractor

Code and data are often shipped in compressed archive formats. You will want something to manage them, e.g. 7-Zip or the gzip and tar tools brought along with Cygwin.

Git (optional)

Ensembl is changing from CVS to Git and migrating to GitHub to host our development. After release 75, anyone using a version of Ensembl other than the most recent release will probably want to use Git to retrieve the API.

Additional Perl libraries

DWIMperl has installed the excellent tool cpanminus (cpanm), which will help you install the Perl libraries needed by either Ensembl or your own scripts. If Perl reports that a library is missing at any stage, cpanm can install it without fuss.

BioPerl is required for Ensembl, but sadly the CPAN release is not Windows compatible. Therefore it will have to be downloaded manually. Once downloaded it can be unpacked to a directory (for example c:\Users\Me\src) and used.

Ensembl Source

The API itself can be downloaded from our FTP server or retrieved via GitHub. Unpack or git clone into a handy source directory alongside BioPerl.

Set up the environment

Perl needs to know where to find BioPerl and Ensembl. There are CPAN modules to help you manage this, but you can also do it directly in Windows.

Windows system-wide settings can be found in the Control Panel (see the screenshot below). Add the following to your User environment variables, making sure to include the whole path to your downloaded copies of BioPerl and the Ensembl API. The value is a single semicolon-separated list; it is split across lines here only for readability.

PERL5LIB = src\bioperl-1.2.3;
src\ensembl\modules;
src\ensembl-compara\modules;
src\ensembl-variation\modules;
src\ensembl-funcgen\modules
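
A mistyped path in PERL5LIB is easy to miss, so it can help to check that every entry points at a real directory. The script below is a minimal sketch of such a check, not part of the Ensembl distribution; the subroutine name missing_lib_dirs is invented for illustration.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: split a PERL5LIB-style string and return the
# entries that are not existing directories.
sub missing_lib_dirs {
    my ($perl5lib, $sep) = @_;
    $perl5lib = '' unless defined $perl5lib;
    $sep = $Config{path_sep} unless defined $sep;    # ';' on Windows
    my @entries = grep { length } split /\Q$sep\E/, $perl5lib;
    return grep { !-d $_ } @entries;
}

if (!defined $ENV{PERL5LIB} || $ENV{PERL5LIB} eq '') {
    print "PERL5LIB is not set in this shell.\n";
}
else {
    my @missing = missing_lib_dirs($ENV{PERL5LIB});
    print "Not a directory: $_\n" for @missing;
    print "All PERL5LIB entries exist.\n" unless @missing;
}
```

Run it from a newly opened command shell, since shells that were already running keep the old environment.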

[Screenshot: Windows environment variable settings]

Getting stuff done

You can now launch Perl from within an editor, or through the command shell (press Windows+R and type cmd). A lesser-known tool is PowerShell (Windows+R, then powershell), which is a little more powerful than the standard command shell. Both support tab completion.

Change directory to src\ensembl\misc-scripts and test the installation with perl ping_ensembl.pl. The script tries to contact the Ensembl databases and will tell you if you have misconfigured any major components.
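
If the test fails and you want to narrow down which library Perl cannot find, a few lines of Perl can try each module in turn. This is only a sketch: the subroutine can_load is a made-up name, and the module list is an assumption about what a typical setup needs (DBI and DBD::mysql are not mentioned above), so adjust it to your own installation.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a module by name.
# Returns 1 on success, 0 on failure.
sub can_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::SeqIO -> Bio/SeqIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Assumed list of modules to check; edit to suit your installation.
for my $module (qw(Bio::EnsEMBL::Registry Bio::SeqIO DBI DBD::mysql)) {
    printf "%-28s %s\n", $module, can_load($module) ? 'ok' : 'NOT FOUND';
}
```

A module reported as NOT FOUND usually points to either a wrong PERL5LIB entry (for BioPerl and Ensembl) or a library that still needs installing with cpanm.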

You are now free to work with the Ensembl API as you wish. Good luck with your work!

Appendix: Working with Cygwin

Many developers on the Windows platform use Cygwin to add Unix-like capabilities to Windows. These include tar, gzip, and an entire Perl installation, among many others. You are free to achieve the same installation as outlined above using Cygwin to manage most of the required components, but you will need to install many more dependencies manually. You might still be better off using DWIMperl for Perl support alongside the Cygwin tools.