Future Plans

The following updates are planned for upcoming releases of Ensembl.

Please note that we have no fixed timeline for most of these items.

Gene annotation

  • Genebuilds in progress:
  • Upcoming genebuilds:
  • Please note: We are in the process of moving our gene annotation methods into the eHive system. We’ll write a blog post about this in 2017.
  • Please note: Since January 2017, we have moved to a new system of gene annotation which we call ‘clade annotation’. This means that we plan to annotate a batch of closely related assemblies at the same time, and then move on to the next batch. We are annotating rodents as our first clade annotation project, to be followed by primates. Over the next two years, we plan to annotate other clades such as fishes, birds/reptiles and other mammals, as well as the more phylogenetically distant chordates and amphibians.
  • Ensembl release 89 (expected May 2017)
    • Several new rodent assemblies
  • Ensembl release 90 (expected after July 2017)
    • Pig Sscrofa11
    • Several new rodent assemblies
  • Regular updates
    • Minor assembly updates for human and mouse: incorporation of new alternate sequence provided by the GRC, with basic gene annotation.
    • Planned updates to human, mouse, rat and zebrafish gene sets: incorporation of HAVANA manual annotation. For mouse, the gene set is updated every release. For human and zebrafish, the gene sets are updated every second release.

Comparative Genomics

  • Incorporate an HMM-based classification of protein sequences for the Protein-Trees pipeline
  • Improved detection of partial / split genes

Variation updates

  • Continue to import new variation data from dbSNP and DGVa where available.
  • Improve variation annotation using publicly available variant, phenotype and disease data.
  • Continue to import genome-wide association study phenotypes for variants from the NHGRI-EBI GWAS Catalog, and variants and phenotypes from OMIM, Orphanet, OMIA and other sources.
  • Include phenotype data for structural variants.

Core API and schema

  • Switchable adaptors to serve data from sources other than MySQL databases
  • Megabase sized feature density tracks
  • Support for cigar and vulgar alignments
  • More efficient external reference assignment pipeline
  • FTP web tool for customisable file download
  • Transcript archive to retrieve sequence for retired features
  • TrackHub registry server

Regulation

  • Integrate more cell types (Roadmap Epigenomics, HipSci…)
  • Integrate more TFBS PWMs (e.g. SELEX, UniProbe)
  • Attach regulatory elements to genes via eQTLs, chromatin conformation data, etc.
  • Develop DNA methylation tracks, i.e. high-level summaries and differentially and variably methylated regions
  • Annotate epigenomic markers of phenotype or differentiation
  • Web display developments:
    • Further refinements of wiggle track config/display including track highlighting
    • MotifFeature view incorporating variation consequences
  • Incorporate ChIP-seq data from further species for possible additional regulatory builds
  • Investigate regulatory feature orthologs and/or comparative views

New web features

  • Complete the rework of Export / Download functionality
  • Redesign Protein Summary View
  • Extend Genoverse to support TrackHubs and uploaded user data

BioMart

  • Investigate ways to improve scalability and retrievability of the data from the various marts.
  • Continue to add new filters and attributes to the marts as new data is added to the Ensembl schemas.