The quickest and most flexible way to run Ensembl VEP on large-scale variant sets is to install and run it locally. However, depending on the use case, the underlying architecture, and the available permissions, installing the VEP prerequisites can be prone to complications. To help with this, we have made VEP available in containers.
Containers have recently gained popularity in software development and bioinformatics. They package software together with its dependencies so that it runs consistently regardless of the underlying system. Docker is the most popular tool for containerising applications for software deployment. A Docker container for VEP is currently available; however, because the Docker daemon requires root privileges, this option is not always available to HPC users.
Singularity, an alternative containerisation tool, does not assume that you are the root user on your system. This flexibility around access rights has made it increasingly popular in HPC settings.
Singularity can run VEP directly from the VEP Docker image.
You can download and run VEP with Singularity 3.5+ using the following commands:
singularity pull --name vep.sif docker://ensemblorg/ensembl-vep
singularity exec vep.sif /opt/vep/src/ensembl-vep/vep
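Since the commands above need Singularity 3.5 or later, it can help to check the installed version first. The following is a minimal sketch, not part of the VEP tooling: the helper names version_ge and extract_version are hypothetical, it assumes that `singularity --version` prints a line containing a dotted version number, and it relies on `sort -V` and `grep -o`, which are available in GNU userland but not guaranteed everywhere. Apptainer builds use a different version numbering, so the comparison only makes sense for Singularity-branded installations.

```shell
#!/bin/sh
# Minimum Singularity version named in the text.
REQUIRED="3.5"

# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumption: `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# extract_version STRING: print the first dotted number found in STRING.
extract_version() {
    printf '%s\n' "$1" | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1
}

if command -v singularity >/dev/null 2>&1; then
    installed=$(extract_version "$(singularity --version)")
    if version_ge "$installed" "$REQUIRED"; then
        echo "Singularity $installed meets the $REQUIRED requirement"
    else
        echo "Singularity $installed is older than $REQUIRED" >&2
    fi
else
    echo "singularity was not found on PATH" >&2
fi
```

If the check reports an older version, ask your HPC administrators whether a newer module is available before pulling the image.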
To set up a directory on your host machine for VEP to use as a cache directory, run:
mkdir $DATA_DIR/vep_data
chmod a+rwx $DATA_DIR/vep_data
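The directory setup can also be written so that it is safe to rerun and tolerates an unset DATA_DIR. This is only a sketch under stated assumptions: the function name setup_vep_dir is hypothetical, and the fallback to $HOME when DATA_DIR is not set is an illustrative choice rather than anything VEP requires.

```shell
#!/bin/sh
# setup_vep_dir BASE: create BASE/vep_data, make it readable and writable
# by all users, and print the resulting path.
setup_vep_dir() {
    vep_dir="$1/vep_data"
    # -p creates missing parent directories and succeeds if the directory exists.
    mkdir -p "$vep_dir" && chmod a+rwx "$vep_dir" && printf '%s\n' "$vep_dir"
}

# Assumption: fall back to $HOME when DATA_DIR is not set.
VEP_DATA=$(setup_vep_dir "${DATA_DIR:-$HOME}") \
    || echo "could not create the VEP data directory" >&2
echo "VEP data directory: $VEP_DATA"
```

Quoting the path keeps the commands working when DATA_DIR contains spaces, and `mkdir -p` avoids an error if the directory was created on an earlier run.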
You can then run the VEP installer as normal. For example, you can install the human GRCh38 cache and FASTA files, alongside the dbNSFP, CADD and G2P plugins, with the following command:
singularity exec -B $DATA_DIR/vep_data:/opt/vep/.vep vep.sif perl /opt/vep/src/ensembl-vep/INSTALL.pl -a cfp -s homo_sapiens -y GRCh38 -g dbNSFP,CADD,G2P
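Once the cache is installed, an annotation run can reuse the same bind mount. The sketch below is illustrative rather than an official recipe: the file names input.vcf and output.txt are hypothetical, the input file is assumed to have been copied into the bound vep_data directory, and the cache is assumed to live in that directory. The `--dir` option is passed explicitly so that the run does not depend on VEP's default cache location inside the container. The run helper and the DRY_RUN variable are hypothetical conveniences that print the command instead of executing it by default.

```shell
#!/bin/sh
# Hypothetical file names; replace them with your own.
INPUT_VCF="input.vcf"
OUTPUT_TXT="output.txt"
# Assumption: same host directory as in the setup step, with $HOME as a fallback.
VEP_DATA="${DATA_DIR:-$HOME}/vep_data"

# Set DRY_RUN=0 in the environment to execute the command instead of printing it.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Annotate the input offline using the cache in the bound directory.
run singularity exec -B "$VEP_DATA:/opt/vep/.vep" vep.sif \
    /opt/vep/src/ensembl-vep/vep --cache --offline \
    --dir /opt/vep/.vep \
    --species homo_sapiens --assembly GRCh38 \
    -i "/opt/vep/.vep/$INPUT_VCF" -o "/opt/vep/.vep/$OUTPUT_TXT"
```

Because the input and output paths point inside the bound directory, the results appear in vep_data on the host after the run. Reviewing the printed command first is a cheap way to catch a wrong bind path before submitting a long job.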