New features in the Variant Effect Predictor

The Variant Effect Predictor (VEP) software can predict the consequence of genomic variants using the genomic annotations provided by Ensembl. In release 63 of Ensembl we have added new features to both the script and web versions of the VEP.

Regulatory consequences have returned: the VEP now reports whether a variant falls within a regulatory region or a transcription factor binding motif and, for motifs, whether the variant falls at a high-information position within the motif.

The VEP now also has a dedicated area of the Ensembl website documentation.

Script version

To improve performance for users in the USA, we have deployed a mirror of the public database server; to use it, pass the “--host” flag with the mirror's host name when running the script.

We have also implemented a caching system in the VEP, such that it is possible to use almost all of the functionality of the script without the script querying the database at all. Simply download and unpack a pre-built cache, run the script with the flag “--cache”, and hey presto! No more network dependencies.
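
As a rough illustration of the two ways of running the script described above, this Python sketch assembles a command line for either a database-backed run against a chosen host or an offline run from the cache. Only the “--host” and “--cache” flags come from this post; the script name variant_effect_predictor.pl, the -i/-o options, the file names and the host name mirror.example.org are assumptions made for the example, so check the documentation for the real values.

```python
from typing import List, Optional


def build_vep_command(input_file: str,
                      output_file: str,
                      host: Optional[str] = None,
                      use_cache: bool = False) -> List[str]:
    """Assemble an argument list for running the VEP script.

    Assumptions for this sketch: the script name and the -i/-o options.
    Only --host and --cache are taken from the release notes.
    """
    cmd = ["perl", "variant_effect_predictor.pl",
           "-i", input_file, "-o", output_file]
    if use_cache:
        # Offline run: read annotation from a pre-built cache that has
        # already been downloaded and unpacked.
        cmd.append("--cache")
    elif host is not None:
        # Database run against a specific server, e.g. a nearby mirror.
        cmd.extend(["--host", host])
    return cmd


# Hypothetical file and host names, for illustration only.
cached_run = build_vep_command("variants.txt", "out.txt", use_cache=True)
mirror_run = build_vep_command("variants.txt", "out.txt",
                               host="mirror.example.org")
print(" ".join(cached_run))
print(" ".join(mirror_run))
```

The resulting list could be passed to subprocess.run to launch the script. In this sketch the cache takes precedence over a host when both are given, which is a design choice of the example rather than documented behaviour of the VEP.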

We have now made “whole genome mode” the default run mode of the script – this code has been rewritten and optimized such that it should be suitable for all use cases. We’ve also improved the status output of the script as it runs, so users with lots of data can easily track their progress.

See the new documentation for further details on all of these new features, or just download the script!

Web version

It is now possible to filter your input variants by their frequency as observed in the 1000 Genomes or HapMap populations. You can either include or exclude input variants that are co-located with existing variants, based on frequencies in any particular population or across a range of populations.
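
The include/exclude logic described above can be sketched in Python as follows. This is a minimal illustration and not the web tool's actual implementation: the InputVariant record, the function names, the threshold semantics and the frequency values are all hypothetical, and the population codes are used only as example keys.

```python
from typing import Dict, List, NamedTuple, Sequence


class InputVariant(NamedTuple):
    """Hypothetical record for one input variant."""
    name: str
    # Frequencies of a co-located existing variant, keyed by population.
    # Empty if no co-located variant is known.
    colocated_freqs: Dict[str, float]


def matches(variant: InputVariant, populations: Sequence[str],
            threshold: float, greater_than: bool = True) -> bool:
    """True if any selected population passes the frequency test."""
    for pop in populations:
        freq = variant.colocated_freqs.get(pop)
        if freq is None:
            continue  # no frequency recorded for this population
        if (freq > threshold) if greater_than else (freq < threshold):
            return True
    return False


def filter_variants(variants: Sequence[InputVariant],
                    populations: Sequence[str], threshold: float,
                    mode: str = "exclude",
                    greater_than: bool = True) -> List[InputVariant]:
    """Keep ("include") or drop ("exclude") the matching variants."""
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    keep_matching = mode == "include"
    return [v for v in variants
            if matches(v, populations, threshold, greater_than) == keep_matching]


# Made-up frequencies, for illustration only.
variants = [
    InputVariant("var1", {"CEU": 0.25, "YRI": 0.02}),
    InputVariant("var2", {"CEU": 0.005}),
    InputVariant("var3", {}),  # not co-located with any existing variant
]

# Drop variants whose co-located variant is common (> 1%) in CEU.
rare_or_novel = filter_variants(variants, ["CEU"], 0.01, mode="exclude")
# Keep only variants that are rare (< 1%) in CEU or YRI.
rare_only = filter_variants(variants, ["CEU", "YRI"], 0.01,
                            mode="include", greater_than=False)
print([v.name for v in rare_or_novel])
print([v.name for v in rare_only])
```

In this sketch a variant with no co-located existing variant never matches, so it survives an "exclude" filter and is dropped by an "include" filter; whether the web VEP treats such variants the same way is an assumption.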

As before, you can access the web VEP through the tools page, or via the “Manage your data” link on any species-specific page.