The Variant Recoder, available on the Ensembl REST API, can help you with data re-use. Multiple identifiers, coming from different databases, can refer to the same variant – the Variant Recoder can help.
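As a rough illustration of what a Variant Recoder lookup might look like, here is a minimal Python sketch using only the standard library. It assumes the public server at `https://rest.ensembl.org` and the `variant_recoder/:species/:id` GET endpoint; the helper names `build_recoder_url` and `recode` are made up for this example, and the exact shape of the JSON that comes back should be checked against the REST documentation rather than taken from here.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the public Ensembl REST server.
ENSEMBL_REST = "https://rest.ensembl.org"


def build_recoder_url(identifier, species="human"):
    """Build the Variant Recoder URL for one identifier (hypothetical helper).

    The identifier is percent-encoded because HGVS strings contain
    characters such as ':' and '>' that are not safe in a URL path.
    """
    encoded = urllib.parse.quote(identifier, safe="")
    return f"{ENSEMBL_REST}/variant_recoder/{species}/{encoded}"


def recode(identifier, species="human"):
    """Fetch the recoded representations of a variant as parsed JSON.

    Needs network access. The response is returned as-is, because this
    sketch makes no assumption about which fields the service returns.
    """
    request = urllib.request.Request(
        build_recoder_url(identifier, species),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `recode("rs56116432")` with a network connection should return the other known representations of that variant; printing the result with `json.dumps(result, indent=2)` is an easy way to see what is available.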
Variants can be represented in myriad different ways; indeed, Ensembl VEP currently supports input in many different formats, including VCF, HGVS and SPDI. However, even within these specifications, variants can be described ambiguously. Insertions and deletions within repeated regions can be described at multiple different locations. For example, VCF describes variants using their most 5’ representation, while HGVS format describes a variant at its most 3’ location.
Starting in Ensembl 100, VEP optionally normalises variants within repeated regions by shifting them as far as possible in the 3’ direction before consequence calculation. This standardises VEP output for equivalent variant alleles which are described using different conventions.
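To make the idea of 3’ shifting concrete, here is a toy Python sketch of how an insertion or deletion inside a repeat can be moved as far as possible in the 3’ direction without changing the resulting sequence. This is not VEP’s implementation: the function names are invented for illustration, coordinates are 0-based, and the reference is a plain forward-strand string.

```python
def shift_deletion_3prime(ref, start, end):
    """Shift the deletion of ref[start:end] as far 3' as possible.

    The deleted window can slide one base to the right whenever the base
    leaving the window on the left equals the base entering on the right,
    because the sequence left after the deletion is then unchanged.
    """
    while end < len(ref) and ref[start] == ref[end]:
        start += 1
        end += 1
    return start, end


def shift_insertion_3prime(ref, pos, inserted):
    """Shift an insertion placed before ref[pos] as far 3' as possible.

    If the next reference base matches the first inserted base, the
    insertion point moves one base right and the inserted sequence is
    rotated by one, which yields the same final sequence.
    """
    while pos < len(ref) and ref[pos] == inserted[0]:
        inserted = inserted[1:] + inserted[0]
        pos += 1
    return pos, inserted


def apply_deletion(ref, start, end):
    """Return the sequence after deleting ref[start:end]."""
    return ref[:start] + ref[end:]


def apply_insertion(ref, pos, inserted):
    """Return the sequence after inserting before ref[pos]."""
    return ref[:pos] + inserted + ref[pos:]
```

In the reference `GCACACAT`, deleting the first `CA` (positions 1 to 3) and deleting the last `CA` (positions 5 to 7) both give `GCACAT`. The sketch reports the second, most 3’ placement for either input, which is why two equivalent descriptions end up with the same output after shifting.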
NCBI has announced big changes to how dbSNP manages human variation data, which will be reflected in Ensembl. These changes include a new allele normalisation approach and the removal of some older population genetics data.
Ensembl 98 (and Ensembl Genomes 45) are due out next month, so it’s time to pig-out on the tasty morsels we have to offer. As with all releases, we cannot guarantee that anything listed here will make it into the final release.