The Ensembl Rapid Release genome browser now provides homology prediction for every available annotated genome.
To make homologue predictions in the Rapid Release genome browser, we have used DIAMOND to identify the closest homologues between a query genome and a set of genome representatives. The closest homologues are defined as the reciprocal best hits between two genomes, or the best hits when no reciprocal best hit is available.
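The reciprocal-best-hit rule with a best-hit fallback can be illustrated with a minimal Python sketch. It is not the Ensembl pipeline: the input layout (tuples of query gene, target gene and score), the use of a single score to rank hits, and the function names `best_hits` and `closest_homologues` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical hit record: (query_gene, target_gene, score).
# Ranking by a single score (e.g. a bit score) is an assumption of this sketch.
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, Tuple[str, float]]:
    """Keep the highest-scoring target for each query gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, target, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return best


def closest_homologues(forward: List[Hit], reverse: List[Hit]) -> Dict[str, Tuple[str, str]]:
    """Label each query gene's best hit as 'RBH' if reciprocal, else 'BH'."""
    fwd = best_hits(forward)   # query genome searched against the representative
    rev = best_hits(reverse)   # representative searched against the query genome
    result: Dict[str, Tuple[str, str]] = {}
    for query, (target, _score) in fwd.items():
        back = rev.get(target)
        if back is not None and back[0] == query:
            result[query] = (target, "RBH")
        else:
            result[query] = (target, "BH")  # fallback: best hit only
    return result


# Toy data: qA and tA pick each other; qB's best hit tA prefers qA.
forward = [("qA", "tA", 250.0), ("qA", "tB", 90.0), ("qB", "tA", 120.0)]
reverse = [("tA", "qA", 248.0), ("tA", "qB", 118.0), ("tB", "qA", 88.0)]
homologues = closest_homologues(forward, reverse)
```

In this toy example, `qA` and `tA` form a reciprocal best hit, while `qB` is reported with `tA` as a best hit only, because `tA` scores higher against `qA` in the reverse search.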
The sets of genome representatives contain 39 genomes, chosen to maximise diversity within a given clade and for their functional annotation quality and community usage. Each representative set includes a shared subset of nine reference genomes spread across the eukaryotic tree, selected for their importance as model organisms and their annotation quality.
Representative sets have been defined for the following clades:
- Actinopterygii (ray-finned fish),
- Sauropsida (including birds and reptiles),
- Hexapoda (including insects).
In addition, we have defined a default representative set, which is used when no corresponding representative set is defined for a query genome. The default representative set is an extension of the shared reference genome set and mostly includes important model organisms from each division of the eukaryotic tree of life. We plan to develop new representative sets as new clades become well represented.
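The choice between a clade-specific set and the default set can be sketched as a simple lookup. This is only an illustration: the post does not say how a query genome is matched to a clade, so matching against a taxonomic lineage is an assumption, as are the names `CLADE_SETS`, `DEFAULT_SET` and `pick_representative_set`.

```python
from typing import Sequence

# Clades with a dedicated representative set, as listed above.
CLADE_SETS = ("actinopterygii", "sauropsida", "hexapoda")
DEFAULT_SET = "default"  # hypothetical label for the default representative set


def pick_representative_set(lineage: Sequence[str]) -> str:
    """Return the clade-specific set if the lineage contains one, else the default."""
    names = {name.lower() for name in lineage}
    for clade in CLADE_SETS:
        if clade in names:
            return clade
    return DEFAULT_SET


# A fish lineage matches a dedicated set; a fungal lineage falls back to the default.
fish_set = pick_representative_set(["Eukaryota", "Metazoa", "Chordata", "Actinopterygii"])
fungus_set = pick_representative_set(["Eukaryota", "Fungi", "Ascomycota"])
```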
Homology data can be found on the Rapid Release genome browser by clicking on ‘Homologues’ in the left-hand menu of the gene tab for each species.
We are now working on expanding the homology predictions in Ensembl Rapid Release with gene tree and orthologue/paralogue relationship predictions.