The brand new Ensembl Regulatory Build on the new GRCh38 human assembly has been released in Ensembl 76. This involved a complete redesign of the build process with new, statistically rigorous logic, a streamlining of all the backend processes, and the remapping and peak calling of all our data sets.
The build and its constituent data can be viewed directly in Ensembl through the Regulation section of “Configure This Page”, but we have also made them accessible through a public track hub. A track hub is a pre-configured set of tracks which you can load together into Ensembl or other genome browsers such as the UCSC Genome Browser. In addition to the data loaded into Ensembl, the hub also contains tracks that summarise the data used to generate the build.
What’s in the Ensembl Regulatory Build Track Hub?
You’ve been meaning to do that data remapping for a while now, but didn’t quite get to it? Or maybe you’re just feeling nostalgic? No worries, the track hub covers both GRCh37 and GRCh38.
For each transcription factor, we calculate the probability of binding at any position from the available data sets, simply by dividing the number of overlapping peaks at that position by the number of data sets. These probabilities can be viewed in the TFBS Summaries section. An overall probability of any binding is viewable in the TFBS Summary track of the Ensembl Build Overview section.
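As a rough illustration of the per-position calculation described above, here is a minimal Python sketch. It is not the Ensembl pipeline code: the function name, the half-open 0-based interval convention and the toy coordinates are all assumptions made for the example, and it assumes each data set contributes at most one overlapping peak per position.

```python
from typing import List, Tuple

# Assumed convention for this sketch: half-open, 0-based intervals [start, end).
Interval = Tuple[int, int]


def binding_probability(datasets: List[List[Interval]], region_length: int) -> List[float]:
    """Fraction of data sets with a peak covering each position of a region.

    `datasets` holds one list of peaks per data set for a single
    transcription factor (hypothetical input layout).
    """
    if not datasets:
        raise ValueError("at least one data set is required")

    counts = [0] * region_length
    for peaks in datasets:
        # Mark the positions covered by this data set, so that overlapping
        # peaks within one data set are only counted once.
        covered = [False] * region_length
        for start, end in peaks:
            for pos in range(max(start, 0), min(end, region_length)):
                covered[pos] = True
        for pos, hit in enumerate(covered):
            if hit:
                counts[pos] += 1

    # Divide the number of overlapping peaks by the number of data sets.
    return [count / len(datasets) for count in counts]


# Toy example: three data sets, one of which has no peak in the region.
example = binding_probability([[(0, 3)], [(2, 5)], []], region_length=6)
```

In the toy example, position 2 is covered by two of the three data sets, while position 5 is covered by none.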
We use genome segmentation software (Segway) to partition the genome into regions of similar signal over these assays, and label these states as, for example, predicted promoters, enhancers or repressed regions. The segmentations for each cell type can be found in the Cell Type Segmentations tracks.
For each state of the segmentation, we also create a summary track which shows the number of cell types that have that state at any given base pair of the genome.
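The idea behind these summary tracks can be sketched in a few lines of Python. This is only an illustration under assumed conventions, not the code used to produce the hub: the function name, the cell type names, the state labels and the half-open 0-based coordinates are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed layout: (start, end, state label), half-open and 0-based.
Segment = Tuple[int, int, str]


def state_summary(
    segmentations: Dict[str, List[Segment]], state: str, region_length: int
) -> List[int]:
    """Count, at each base pair, how many cell types carry `state` there."""
    summary = [0] * region_length
    for segments in segmentations.values():
        # Collect the positions labelled with `state` in this cell type,
        # so each cell type adds at most 1 to any position.
        covered = set()
        for start, end, label in segments:
            if label == state:
                covered.update(range(max(start, 0), min(end, region_length)))
        for pos in covered:
            summary[pos] += 1
    return summary


# Toy segmentations for two hypothetical cell types over an 8 bp region.
toy = {
    "cell_type_A": [(0, 4, "promoter"), (4, 8, "enhancer")],
    "cell_type_B": [(2, 6, "promoter"), (6, 8, "repressed")],
}
promoter_track = state_summary(toy, "promoter", region_length=8)
```

Here positions 2 and 3 are labelled as promoter in both toy cell types, so the summary value there is 2, and it drops to 0 where neither cell type has that state.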
The Ensembl Regulatory Build
The summarised Ensembl Regulatory Build can be viewed in the “Ensembl Reg. Build” track of the Ensembl Build Overview section. For each cell type, we then annotate each feature as on or off, as displayed in the Cell Type Activity tracks.