Find predicted CRISPR sites using Ensembl

The CRISPR/Cas9 system has revolutionised scientific research over the last few years, offering an efficient method of genome editing. CRISPR/Cas9 utilises the cellular machinery that bacteria use to recognise and cut the DNA of invading viruses. It is formed of two key components: Cas9, an enzyme that can cut double-stranded DNA at a precise point; and a short CRISPR guide RNA that directs the Cas9 enzyme to recognise and cleave specific DNA sites.

Cas9 cuts DNA adjacent to specific Protospacer Adjacent Motifs (PAMs), whose sequence is species-dependent (for example, 5′ NGG 3′ for Streptococcus pyogenes Cas9). Therefore, by pairing Cas9 with a custom guide RNA (gRNA), you can target its cutting activity to specific locations in the genome that lie next to a PAM.
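To make the PAM requirement concrete, here is a minimal Python sketch that scans a DNA sequence for candidate sites on both strands. It is an illustration only and is not the method WGE uses for its predictions. The function names and the 20-nucleotide guide length are assumptions made for this example (the length is a commonly used value for S. pyogenes Cas9, not something stated above), and no off-target scoring is attempted.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_ngg_sites(seq, guide_len=20):
    """Find candidate sites: guide_len bases followed by a 5' NGG 3' PAM.

    Returns a list of (start, strand, protospacer, pam) tuples, where
    start is the 0-based position of the whole site (protospacer + PAM)
    on the forward strand. guide_len=20 is an assumption for illustration.
    """
    seq = seq.upper()
    n = len(seq)
    sites = []

    def scan(strand_seq, strand):
        # i is the index of the first PAM base on the scanned strand.
        for i in range(guide_len, n - 2):
            pam = strand_seq[i:i + 3]
            if pam[0] in "ACGT" and pam[1:] == "GG":
                protospacer = strand_seq[i - guide_len:i]
                if strand == "+":
                    start = i - guide_len
                else:
                    # Map the reverse-strand site back to forward coordinates.
                    start = n - (i + 3)
                sites.append((start, strand, protospacer, pam))

    scan(seq, "+")
    scan(reverse_complement(seq), "-")
    return sites


# Example: a made-up 23-base sequence with one forward-strand site.
example_sites = find_ngg_sites("A" * 20 + "TGG")
```

Each hit spans 23 bases in total: the 20 bases that the gRNA would bind, followed by the 3-base PAM.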

The latest release of Ensembl (Ensembl 85, July 2016) now includes annotated CRISPR/Cas9 sites predicted by the Wellcome Trust Sanger Institute Genome Editing (WGE) group for human and mouse genomes.

The WGE group have predicted CRISPR sites and developed an accompanying database to help you design genome editing experiments. You can view these WGE-predicted sites by adding the ‘WGE CRISPR sites’ track to any ‘Region in Detail’ view for human or mouse in Ensembl. Click on the ‘Configure this page’ option in the menu on the left-hand side of the page, then add the track, which can be found in the ‘Other regulatory regions’ category, by clicking the empty box and selecting the track style from the pop-up window.

[Figures: ‘Configure this page’ button; ‘Add CRISPR track’ option]

Below, you can see an example of the WGE-predicted CRISPR site track, added to both the forward and reverse strands, for the genomic region containing the human BRCA2 gene, shown in the ‘structure’ style. Each CRISPR site is drawn as a single green box, which appears as a single vertical line when viewing a large genomic region.

[Figure: CRISPR site track]

Zooming in on a specific region of interest from the example above, you can see the structure of each CRISPR site, with the filled green box matching up with the PAM motif and the unfilled box representing the potential gRNA binding sequence. Clicking on any of these individual CRISPR sites will open a pop-up window that provides more information about the specific genomic co-ordinates of the CRISPR site, as well as a link to the WGE database.

[Figure: CRISPR pop-up]

You can find more information about the CRISPR site prediction method in the published description of the WGE database.

3 thoughts on “Find predicted CRISPR sites using Ensembl”

  1. Hi Ben,

    This is a really cool new feature that I’m sure will be very popular. Just a small point – in the pop up window that comes up on clicking a CRISPR site, it lists the off-target sites, and every guide I’ve looked at has 1 ‘0_Mismatch_Off_Targets’. Looking in more detail, it seems like the intended target site is being picked up as an off-target site. Is there a way to fix this so it recognises that this ‘off-target’ is actually the target?

    Also, in configure this page, I’m not sure ‘regulation’ is a great place to put CRISPR, but looking at your other banners I don’t have any better suggestions.

    Great to see my most common use of ENSEMBL being integrated into the browser!

    Cheers,
    Mike

    • Hi Mike,

      Thanks for your feedback. Personally, I’m really excited to be able to include this information in Ensembl, as I think it is a fantastic resource for everyone.

      Since this is the first time we have incorporated this information into our web browser, we are definitely keen to improve this feature in any way possible, so I will pass your suggestions on to our developers for consideration.

      Best Wishes

      Ben

  2. Hi Ben

    Thank you for your helpful feedback. We are very pleased that these CRISPR tracks from WGE are now available in Ensembl.

    You are right – the WGE scoring string does contain one on-target that we classify as a zero mismatch off-target. There is more information in the WGE help pages (see below). This on-target match is presented as a member of ‘0_Mismatch_Off_Targets’ in the Ensembl popup. We could simply reduce the count by 1 but then this number would signify different things in Ensembl and WGE, a confusion we wish to avoid. The popup menu ‘mismatch’ items are basically a reworked presentation of the WGE scoring summary string.

    While developing WGE, we thought hard about the presentation and potential use of the scoring summary string. The use and meaning of this string is documented in our publication in Bioinformatics [doi:10.1093/bioinformatics/btv308]. We don’t currently have plans to change the format of this string because of its use by groups at the Sanger Institute and many others around the world. We would prefer that the format shown in Ensembl reflects the form presented in WGE itself. However, we will keep the Ensembl presentation of the data from the scoring summary string under review following your comment.

    See the help page in WGE (http://www.sanger.ac.uk/htgt/wge/crispr_help#summary_explanation) for further details.

    We agree that regulation is not the right place for these tracks – we are discussing where best to place them with the Ensembl team.

    Thanks for taking the time to comment.

    David Parry-Smith
    Group Leader
    Stem Cell Informatics (the home of WGE)
    Wellcome Trust Sanger Institute