The interpretation of non-coding variants is more challenging than that of coding variants because fewer prediction methods and reference data are available. On top of the annotation provided for human and mouse in the Ensembl Regulatory Build, the Ensembl Variant Effect Predictor (VEP) also integrates two other human-specific datasets providing information about how variants can affect gene expression. The plugins, satMutMPRA and FunMotifs, are available for use with command-line VEP. One provides detailed information on the impact on expression of variants in the regulatory regions of disease-associated genes; the other provides an alternative set of genome-wide transcription factor binding motifs.
By default, VEP will tell you the consequences for every transcript affected by a variant. You may wish to restrict your analysis to only the most important or best-supported transcripts for each gene, and VEP provides information to help you do that.
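As a rough illustration of this kind of prioritisation, the sketch below keeps only the rows flagged as canonical in VEP's default tab-delimited output. It assumes VEP was run with the `--canonical` option, so that the final `Extra` column carries `CANONICAL=YES` for canonical transcripts. The sample rows, gene names and transcript names are invented placeholders, not real VEP output.

```python
def parse_extra(extra):
    """Turn the semicolon-separated Extra column into a dict of key/value pairs."""
    if extra in ("", "-"):
        return {}
    return dict(item.split("=", 1) for item in extra.split(";") if "=" in item)


def canonical_only(lines):
    """Yield the fields of each data row whose Extra column says CANONICAL=YES."""
    for line in lines:
        if line.startswith("#"):  # skip header and metadata lines
            continue
        fields = line.rstrip("\n").split("\t")
        if parse_extra(fields[-1]).get("CANONICAL") == "YES":
            yield fields


def make_row(transcript, extra):
    """Build a placeholder row with the 14 columns of the default output (hypothetical values)."""
    return "\t".join(["var1", "1:1000", "T", "GENE_A", transcript, "Transcript",
                      "missense_variant", "-", "-", "-", "-", "-", "-", extra])


SAMPLE = [
    "## placeholder header line",
    make_row("TRANSCRIPT_A", "IMPACT=MODERATE;CANONICAL=YES"),
    make_row("TRANSCRIPT_B", "IMPACT=MODERATE"),
]

for fields in canonical_only(SAMPLE):
    print(fields[4], fields[6])
```

The same pattern could be adapted to other keys in the `Extra` column, depending on which options were enabled for the run.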
With all the fuss we make about our resources for human genomes, you might think the VEP was just for human; it’s not. We have really useful resources, like SIFT, phenotypes and caches for loads of other species in Ensembl.
Ensembl 97 and Ensembl Genomes 44 have been released! In this release you’ll find many new species, including some hybrid livestock, as well as important changes to gene sets for human and mouse and a new update to the human Regulatory Build.
Read on to explore the full details.
Interpreting a single variant can be a lot more involved than just finding out its consequence. Sometimes, to understand a variant, you need to know exactly where it falls: which exon, which amino acid, sometimes even which base in the codon. The VEP gives you all of this by default.
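To make the "which base in the codon" idea concrete, here is a small sketch of the arithmetic that relates a 1-based position in the coding sequence to a codon number and a position within that codon. The function name is made up for this example, and it is standard coordinate arithmetic rather than a description of how VEP computes its output.

```python
def codon_coordinates(cds_position):
    """Map a 1-based CDS position to (codon number, base within codon), both 1-based."""
    if cds_position < 1:
        raise ValueError("CDS positions are 1-based")
    codon_index, base_index = divmod(cds_position - 1, 3)
    return codon_index + 1, base_index + 1


# CDS position 803 falls on the second base of codon 268.
print(codon_coordinates(803))
```

For an intact coding sequence, the codon number is also the amino acid position in the protein.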
If you’re really delving into the role of a particular genetic variant, you might want to know about that base position in other species. VEP can get you ancestral alleles in human and conservation scores in many species for a variant position, allowing you to assess whether a position is evolutionarily important, or whether an allele matches our primate ancestors.
If you’re trying to work out which variants are associated with a phenotype or disease, a major thing you might want to know is if someone else has already spotted it. And if not the variant, maybe the gene that it hits. You can get that through the VEP.
We’re fortunate to be part of the EMBL European Bioinformatics Institute (EBI), which puts us alongside stellar bioinformaticians and resources in every discipline. From this, great collaborations can grow. We’ve already worked with our colleagues at Gene Expression Atlas and Reactome to embed widgets in Ensembl for viewing baseline gene expression and biochemical pathways respectively, but our latest collaboration is with the Protein Data Bank in Europe (PDBe) to show genetic variation on protein structures.
The number of genes and transcripts we have in Ensembl can make your VEP results very big. Filtering your results after running the VEP is the best way to make this more manageable, but you can also reduce the results in your run itself, to only get one result per variant or variant/gene combo.
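As a sketch of reducing results within the run itself, the helper below assembles a command line for the `vep` script. It relies on `--pick` (one result per variant) and `--per_gene` (one result per variant/gene combination) from the command-line VEP options. The helper name, the `one_per` parameter and the file names are assumptions made for this example.

```python
import shlex

# Map the kind of reduction wanted to the corresponding VEP option.
REDUCTION_FLAGS = {"variant": "--pick", "gene": "--per_gene"}


def vep_command(input_path, output_path, one_per=None):
    """Build an argument list for a cache-based VEP run, optionally reducing the output."""
    args = ["vep", "--cache", "-i", input_path, "-o", output_path]
    if one_per is not None:
        args.append(REDUCTION_FLAGS[one_per])  # raises KeyError for unknown values
    return args


# Print a command that could be pasted into a shell (file names are placeholders).
print(shlex.join(vep_command("variants.vcf", "results.txt", one_per="variant")))
```

The returned list could be passed to `subprocess.run` on a machine where VEP and a cache are installed.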
The VEP can work as an offline or a web tool and it’s also available as a REST service. Perfect for integrating into pipelines or displaying data on the web, the REST API VEP endpoints can take input as HGVS, genomic loci or variant identifiers and can interpret common forms of non-standard HGVS. They are all available using both the GET and POST HTTP methods, supporting queries on single or multiple variants respectively.
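A minimal sketch of the GET/POST split, using only the Python standard library, might look like the following. It assumes the public server at `https://rest.ensembl.org` and the `/vep/:species/hgvs` and `/vep/:species/id` endpoint paths, with POST bodies keyed by `hgvs_notations`, `ids` or `variants`. The helper names are made up, and region queries always use POST here to keep the example short.

```python
import json
import urllib.request
from urllib.parse import quote

SERVER = "https://rest.ensembl.org"  # assumed public Ensembl REST server

# JSON key expected in the POST body for each kind of input (per the REST documentation).
POST_KEYS = {"hgvs": "hgvs_notations", "id": "ids", "region": "variants"}


def build_vep_request(species, kind, variants):
    """Build a GET request for one HGVS/ID variant, or a POST request otherwise."""
    if kind not in POST_KEYS:
        raise ValueError(f"unknown input kind: {kind}")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if len(variants) == 1 and kind != "region":
        url = f"{SERVER}/vep/{species}/{kind}/{quote(variants[0], safe=':>')}"
        return urllib.request.Request(url, headers=headers, method="GET")
    body = json.dumps({POST_KEYS[kind]: list(variants)}).encode("utf-8")
    return urllib.request.Request(f"{SERVER}/vep/{species}/{kind}",
                                  data=body, headers=headers, method="POST")


def send(request):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


single = build_vep_request("human", "id", ["rs699"])
print(single.get_method(), single.full_url)
```

Calling `send(single)` would then return the decoded JSON response, provided the server is reachable.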