If you’re trying to work out which variants are associated with a phenotype or disease, one of the first things you’ll want to know is whether someone else has already reported them. And if not the variant itself, then perhaps the gene that it hits. You can get this information through the VEP.
For some years we have made our database admin interface publicly available via the Ensembl “public-plugins” repository, allowing you to edit certain fields in your core databases via a web form. However, we are now using an alternative interface developed by our production team (written in Python), and will therefore be retiring the old plugin in release 97 (scheduled for June 2019).
If you are currently using the plugin and would like to know more about migrating your project to the new code, please contact us and we’ll try to help!
The RefSeq column on our gene pages has changed.
We’re moving towards a more unified gene-set with RefSeq, with biologically important transcripts being highlighted as MANE. This means displays you’re used to seeing will be updated to reflect these changes, and this may affect the way you have been working with Ensembl.
We’re fortunate to be part of the EMBL European Bioinformatics Institute (EBI), which puts us alongside stellar bioinformaticians and resources in every discipline. From this, great collaborations can grow. We’ve already worked with our colleagues at Gene Expression Atlas and Reactome to embed widgets in Ensembl for viewing baseline gene expression and biochemical pathways respectively, but our latest collaboration is with the Protein Data Bank in Europe (PDBe) to show genetic variation on protein structures.
These releases are huge in many respects, so it was difficult to decide which news to put first! Let’s start with some exciting news from our annotators.
The number of genes and transcripts we have in Ensembl can make your VEP results very big. Filtering your results after running the VEP is the best way to make this more manageable, but you can also reduce the results in your run itself, to only get one result per variant or variant/gene combo.
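As a rough illustration of reducing results within the run itself, the sketch below builds a VEP command line using the `--pick` flag (one consequence per variant) or the `--per_gene` flag (one consequence per variant/gene combination). The helper name `build_vep_command` and the file names are hypothetical, and the sketch assumes a local VEP installation with a cache already set up; it only assembles the arguments and does not run anything.

```python
def build_vep_command(input_file, output_file, mode="pick"):
    """Assemble a VEP command line (as an argument list) that limits output.

    mode="pick"     -> one consequence per variant (--pick)
    mode="per_gene" -> one consequence per variant/gene combination (--per_gene)

    The function name and defaults are illustrative assumptions, not part of VEP.
    """
    flags = {"pick": "--pick", "per_gene": "--per_gene"}
    if mode not in flags:
        raise ValueError(f"mode must be one of {sorted(flags)}, got {mode!r}")
    # Assumes the 'vep' executable is on the PATH and a cache is installed.
    return ["vep", "-i", input_file, "-o", output_file, "--cache", flags[mode]]


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    print(" ".join(build_vep_command("variants.vcf", "vep_output.txt", "per_gene")))
```

The resulting list could be passed to `subprocess.run` once the paths point at real files. Which of the two flags suits you depends on whether you want a single line per variant or need to keep one line for each gene a variant overlaps.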
We will make changes to the directory layouts of both the Ensembl Genomes FTP server (ftp://ftp.ensemblgenomes.org/pub/) and the Ensembl GRCh37 FTP server (ftp://ftp.ensemblorg.ebi.ac.uk/pub/grch37/) that may affect your pipelines. These changes will come into effect in Ensembl Genomes release 43/Ensembl release 96, which are scheduled for April 2019. Here are the details, so that you can plan any required updates to existing scripts and pipelines ahead of the releases.
As the community’s capacity for genome sequencing expands, so do its ambitions. Recently, many exciting global genomics projects have been launched, including the Vertebrate Genomes Project (VGP), Darwin Tree of Life (DToL), Earth BioGenome Project (EBP), i5K (insects) and 10KP (plants). Between them, they aim to sequence the genomes of every eukaryote on Earth, and Ensembl are excited to take on the annotation of some of those genomes.
Joannella Morales, Jane Loveland and Adam Frankish contributed to this post.
Back in October, we introduced you to our new joint initiative with the NCBI — the Matched Annotation from the NCBI and EMBL-EBI (MANE) transcript set. We are now pleased to update you on our progress so far.
The goal of this project is to share annotation and converge on a high-confidence, genome-wide transcript set, with a matched transcript in both RefSeq and Ensembl/GENCODE. We are doing this in two phases. During phase 1, we will release the “MANE Select” transcript set to include one well-supported transcript for every protein-coding locus. We envision the adoption of the MANE Select set as a default set across genomics resources. In phase 2, we intend to release an expanded set (“MANE Plus”) to include additional transcripts per locus that are well-supported or of particular user interest.