PolyPhen-2 and SIFT scores are changing in release 109

We are updating the SIFT and PolyPhen-2 predictions of missense variant deleteriousness shown in the Ensembl browser and Ensembl VEP in release 109. We have recalculated all scores using newer software versions, updating PolyPhen-2 from version 2.2.2 to 2.2.3 and SIFT from version 5.2.2 to 6.2.1. Whenever we update software and reference data versions, we expect some predictions to change. This post is a guide to what you can expect.

We ran SIFT 6.2.1 for all possible missense variants in the Ensembl proteins for highly accessed vertebrate species using the UniRef90 FASTA (2022_01) protein database. When comparing to previous results, we identified the following differences in predictions for human missense variants (e.g., 4.99% of variants predicted as deleterious by SIFT 5.2.2 are now considered tolerated by SIFT 6.2.1):

We calculated PolyPhen-2 2.2.3 scores for all human missense variants with the updated bundled datasets and PDB/DSSP structural databases for version 2.2.3 and identified the following changes in predictions:

Note that ‘Unknown’ is assigned when PolyPhen-2 cannot determine whether a variant is benign or damaging, whereas ‘No prediction’ signifies that no result was returned.
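For readers who want to run a similar comparison on their own variant sets, the following is a minimal sketch of how such percentages can be derived: for each old prediction category, it reports the share of variants that ended up in each new category. The function name, variant identifiers and toy data are all made up for illustration, and this is not the pipeline we used to produce the figures above.

```python
from collections import Counter, defaultdict


def transition_percentages(old_calls, new_calls):
    """Return {old_label: {new_label: percent}} for variants present in both inputs.

    Percentages are normalised per old label, so each row sums to 100.
    """
    counts = defaultdict(Counter)
    for variant_id, old_label in old_calls.items():
        new_label = new_calls.get(variant_id)
        if new_label is not None:  # skip variants missing from the new set
            counts[old_label][new_label] += 1

    table = {}
    for old_label, row in counts.items():
        total = sum(row.values())
        table[old_label] = {new: 100.0 * n / total for new, n in row.items()}
    return table


# Toy data (hypothetical): prediction per variant under the old and new versions.
old = {"v1": "deleterious", "v2": "deleterious", "v3": "deleterious",
       "v4": "deleterious", "v5": "tolerated"}
new = {"v1": "deleterious", "v2": "deleterious", "v3": "deleterious",
       "v4": "tolerated", "v5": "tolerated"}

table = transition_percentages(old, new)
print(table["deleterious"]["tolerated"])  # 25.0
```

In this toy example, one of the four variants previously called deleterious is now tolerated, giving 25%. The same function works for PolyPhen-2 categories if ‘Unknown’ and ‘No prediction’ are passed in as labels of their own.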

As a result of changes to the protein structure database used by PolyPhen-2, we have fewer predictions than in previous Ensembl releases. Approximately 11,000 transcripts now lack PolyPhen-2 predictions, including nearly 1,500 MANE Select or MANE Plus Clinical transcripts.
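If you process Ensembl VEP results programmatically, you may want to check where a PolyPhen-2 prediction is now absent. The sketch below assumes a record parsed from VEP's JSON output, with a `transcript_consequences` list whose entries carry `transcript_id`, `consequence_terms` and, when available, `polyphen_prediction`. Please verify these field names against your own output; the helper name and the example record are hypothetical.

```python
def missense_without_polyphen(vep_record):
    """List transcript IDs with a missense consequence but no PolyPhen-2 prediction.

    Assumes `vep_record` is one variant parsed from VEP JSON output
    (field names are an assumption; check them against your own files).
    """
    missing = []
    for tc in vep_record.get("transcript_consequences", []):
        is_missense = "missense_variant" in tc.get("consequence_terms", [])
        if is_missense and "polyphen_prediction" not in tc:
            missing.append(tc["transcript_id"])
    return missing


# Hypothetical record with placeholder transcript IDs.
record = {
    "transcript_consequences": [
        {"transcript_id": "TX_A", "consequence_terms": ["missense_variant"],
         "sift_prediction": "tolerated", "polyphen_prediction": "benign"},
        {"transcript_id": "TX_B", "consequence_terms": ["missense_variant"],
         "sift_prediction": "deleterious"},
        {"transcript_id": "TX_C", "consequence_terms": ["synonymous_variant"]},
    ]
}

print(missense_without_polyphen(record))  # ['TX_B']
```

Only missense consequences are checked, since other consequence types are not expected to carry a PolyPhen-2 prediction in the first place.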

If you have any questions regarding these changes, please feel free to contact us. Happy VEPing!