We’re fortunate to be part of the EMBL European Bioinformatics Institute (EBI), which puts us alongside stellar bioinformaticians and resources in every discipline. From this, great collaborations can grow. We’ve already worked with our colleagues at Gene Expression Atlas and Reactome to embed widgets in Ensembl for viewing baseline gene expression and biochemical pathways respectively, but our latest collaboration is with the Protein Data Bank in Europe (PDBe) to show genetic variation on protein structures.
These releases are huge in many respects, so it was difficult to decide which news to put first! Let’s start with some exciting news from our annotators.
The number of genes and transcripts we have in Ensembl can make your VEP results very big. Filtering your results after running the VEP is the best way to make this more manageable, but you can also reduce the results in your run itself, to only get one result per variant or variant/gene combo.
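As a rough illustration of reducing results in the run itself, here is a minimal Python sketch that assembles a VEP command line. It assumes the `vep` script is on your PATH and that you are using a local cache. The file names, the `build_vep_command` helper and the `reduce_by` labels are made up for this example. `--pick` (one consequence per variant) and `--per_gene` (the most severe consequence per gene) are VEP options, but check the VEP documentation for how they behave in your version.

```python
import shlex

# Map a readable label (hypothetical, for this sketch) to a VEP option
# that limits how many consequences are reported per variant.
REDUCTION_FLAGS = {
    "per_variant": "--pick",      # one consequence per variant
    "per_gene": "--per_gene",     # most severe consequence per gene
}


def build_vep_command(input_file, output_file, reduce_by="per_variant"):
    """Return the VEP command as a list of arguments (nothing is executed)."""
    if reduce_by not in REDUCTION_FLAGS:
        raise ValueError(f"unknown reduction mode: {reduce_by!r}")
    return [
        "vep",
        "-i", input_file,
        "-o", output_file,
        "--cache",                # assumes a local VEP cache is installed
        REDUCTION_FLAGS[reduce_by],
    ]


if __name__ == "__main__":
    # Placeholder file names; print the command rather than running it.
    cmd = build_vep_command("variants.vcf", "variants.vep.txt", "per_gene")
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` once you are happy with the options. Building the command as a list avoids shell quoting problems with unusual file names.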
We will make changes to the directory layouts of both the Ensembl Genomes FTP server (ftp://ftp.ensemblgenomes.org/pub/) and the Ensembl GRCh37 FTP server (ftp://ftp.ensemblorg.ebi.ac.uk/pub/grch37/) that may affect your pipelines. These changes will come into effect in Ensembl Genomes release 43/Ensembl release 96, which are scheduled for April 2019. Here are the details, so that you can plan any required updates to existing scripts and pipelines ahead of the releases.
As the community’s capacity for genome sequencing expands, so do its ambitions. Recently, many exciting global genomics projects have been launched, including the Vertebrate Genomes Project (VGP), Darwin Tree of Life (DToL), Earth Biogenome Project (EBP), i5K (insects) and 10KP (plants). Between them, they aim to sequence the genomes of every eukaryote on Earth, and Ensembl are excited to take on the annotation of some of those genomes.
Joannella Morales, Jane Loveland and Adam Frankish contributed to this post.
Back in October, we introduced you to our new joint initiative with the NCBI — the Matched Annotation from the NCBI and EMBL-EBI (MANE) transcript set. We are now pleased to update you on our progress so far.
The goal of this project is to share annotation and converge on a high-confidence, genome-wide transcript set, with a matched transcript in both RefSeq and Ensembl/GENCODE. We are doing this in two phases. During phase 1, we will release the “MANE Select” transcript set to include one well-supported transcript for every protein-coding locus. We envision the adoption of the MANE Select set as a default set across genomics resources. In phase 2, we intend to release an expanded set (“MANE Plus”) to include additional transcripts per locus that are well-supported or of particular user interest.
In the next release of Ensembl (Ensembl 96) we will remove our ontology database patching scripts from the main Ensembl repository.
There is now a dedicated module that uses the EBI OLS service to load the ontologies Ensembl requires. Because this module is now in charge of loading the required data, the previous database patches have been moved to the ols-ensembl-loader repository.
If you need to update your system with future patches, please refer to the sql directory of the ols-ensembl-loader repository, where the patch files are already available.
Please contact the Ensembl Helpdesk if you have any questions or want to find out more about how this might affect your work.
Today we are meeting Guy, who works in the Plants team of Ensembl Genomes. He talks about how he came to Ensembl, his interests and experiences so far.