Ensembl 100

Known bugs in Ensembl

Missing Goldfish JSON file

Affects: Live site Versions: Ensembl 100
Description: The JSON file for the Goldfish (Carassius auratus) is not available on the Ensembl FTP site.
Workaround: The data that would have been in the JSON file is available in a range of other files.

Incomplete mapping in Assembly Converter

Affects: Live site, Mirrors Versions: Ensembl 98, Ensembl 99, Ensembl 100
Description: The following species have no mappings between their new and old assemblies.
As a result, the Assembly Converter tool does not offer these species, even though such mappings are possible.

Fungi: Saccharomyces cerevisiae (EF1 and R64-1-1)

Protists: Thalassiosira pseudonana (ASM14940v1_bd and ASM14940v2)

Workaround: No workaround.

Inconsistencies between core and core-like dbs

Affects: Live site, Mirrors Versions: Ensembl 100
Description: The assembly name and accession do not match between the human rnaseq (GRCh38.p10, GCA_000001405.25), otherfeatures (GRCh38.p12, GCA_000001405.27) and core (GRCh38.p13, GCA_000001405.28) databases. However, the otherfeatures database has the expected GRCh38.p10 assembly and seq_region tables.

The assembly name and accession do not match between the mouse rnaseq (GRCm38.p5, GCA_000001635.7) and core (GRCm38.p6, GCA_000001635.8) databases. However, the rnaseq database has the expected GRCm38.p6 assembly and seq_region tables.

Workaround: No workaround.

Erroneous transcript (ENSRNASEQT00001237576) in Pig RNAseq data

Affects: Live site, Archives Versions: Ensembl 99, Ensembl 100
Description: There is a transcript in the Pig reference rnaseq database with no exons and no translation. We will remove this transcript and hand over the corrected rnaseq database.
Workaround: No workaround.

Links to RefSeq genes do not work

Affects: Live site, Mirrors Versions: Ensembl 99, Ensembl 100
Description: Links to RefSeq genes in region views do not work, because they use an internal identifier rather than the gene ID.
Workaround: Links to transcripts are correct, so these can be used to navigate to the correct page on the NCBI website.

Drosophila melanogaster RNA gene cross-reference links do not work

Affects: Live site, Mirrors Versions: Ensembl 99, Ensembl 100
Description: Rfam and miRBase cross-reference links do not work, because they use the FlyBase ID instead of the RNA gene.
Workaround: Search for the Rfam or miRBase ID on the respective website.

Gene name cross-reference links do not work

Affects: Live site, Mirrors Versions: Ensembl 99, Ensembl 100
Description: Cross-reference links to HGNC, MGI, and ZFIN do not link to the correct page, because they use the name rather than the numeric identifier.
Workaround: Search for the gene name on the HGNC, MGI, or ZFIN website.

Remove semicolons [;] from gene names in dumped GTF files

Affects: Live site Versions: Ensembl 100
Description: The Arabidopsis thaliana GTF file available for download contains semicolons in the gene_name value within the attributes column. The semicolon is a disallowed character for many downstream programmes (for example, htseq-count), because it is also the separator between attributes.

This does not affect the GFF version of the same file.

Workaround: Escape forbidden GTF characters, such as semicolons, within attributes.
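
The workaround above can be sketched in Python as follows. This is a minimal sketch, not an official Ensembl tool: the function name escape_gtf_attributes is hypothetical, and the choice of "%3B" (the URL-style percent-encoding of a semicolon) as the replacement is an assumption, since the GTF format itself does not define an escape sequence. Check what your downstream programme accepts before settling on a replacement.

```python
import re

# Matches one GTF attribute of the form: key "value"
# The value is everything between the double quotes, so a semicolon inside
# the quotes is captured as part of the value, not treated as a separator.
ATTRIBUTE_RE = re.compile(r'(\w+) "([^"]*)"')


def escape_gtf_attributes(line, replacement="%3B"):
    """Return a GTF line with semicolons inside quoted attribute values replaced.

    The replacement string is an assumption; GTF has no standard escape.
    """
    # Leave header/comment lines untouched.
    if line.startswith("#"):
        return line

    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")

    # A GTF feature line has exactly 9 tab-separated columns;
    # anything else is passed through unchanged.
    if len(fields) != 9:
        return line

    # Rewrite only the quoted values in the attributes column (column 9).
    # The "; " separators between attributes sit outside the matches
    # and are therefore preserved.
    fields[8] = ATTRIBUTE_RE.sub(
        lambda m: '{} "{}"'.format(m.group(1), m.group(2).replace(";", replacement)),
        fields[8],
    )
    return "\t".join(fields) + newline


def escape_gtf_file(src_path, dst_path, replacement="%3B"):
    """Copy a GTF file line by line, escaping semicolons in attribute values."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(escape_gtf_attributes(line, replacement))
```

To use it, call escape_gtf_file with the path of the downloaded (and decompressed) GTF file and a path for the cleaned copy. Passing replacement="_" instead would drop the semicolon in favour of an underscore, which changes the gene name but avoids percent-encoded text in downstream output.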

Gene trees missing human ncRNA genes

Affects: Live site Versions: Ensembl 99, Ensembl 100
Description: Due to missing Rfam references, a number of human ncRNA genes have not been clustered correctly and are therefore missing from gene trees and homology predictions.
Workaround: No workaround. Use Ensembl 98 if possible.

Drosophilidae cores imported from FlyBase have stop codon missing from their CDSs

Affects: Live site Versions: Ensembl 98, Ensembl 99, Ensembl 100, Ensembl 101
Description: Drosophilidae cores imported from FlyBase are missing the stop codon from their CDSs. Proteins and their domains are not affected.
Workaround: No workaround. The species core will be reimported from FlyBase in a future release.
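
Although there is no workaround, it may help to identify which downloaded CDS sequences lack a terminal stop codon. The following is a minimal sketch under stated assumptions: the sequences are already loaded as plain strings keyed by an identifier, the standard genetic code applies (stop codons TAA, TAG and TGA), and the function names are hypothetical rather than part of any Ensembl API.

```python
# Stop codons of the standard genetic code (assumed to apply here).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def ends_with_stop_codon(cds):
    """Return True if the CDS sequence ends with a standard stop codon."""
    seq = cds.upper()
    return len(seq) >= 3 and seq[-3:] in STOP_CODONS


def find_cds_without_stop(cds_by_id):
    """Return the identifiers of CDS sequences that lack a terminal stop codon.

    cds_by_id maps an identifier (for example a transcript ID) to its
    CDS nucleotide sequence.
    """
    return [seq_id for seq_id, seq in cds_by_id.items()
            if not ends_with_stop_codon(seq)]
```

Calling find_cds_without_stop on a dictionary of CDS sequences returns the identifiers whose sequence does not end in TAA, TAG or TGA, in the order they appear in the dictionary.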