Gene families

Gene families are sets of proteins that have been clustered based on sequence similarity. In Ensembl Genomes, they provide a way to explore similar proteins across a wide range of bacterial genomes, for which the standard peptide comparative pipeline cannot be run. Gene families are displayed in the web interface and can also be accessed through the Ensembl Compara Perl API.

In Ensembl Bacteria, gene families are populated with proteins from all bacterial genomes using the HAMAP and PANTHER classifications provided by InterPro. Note that although this uses the same database schema and API as Ensembl, it does not use the same gene family pipeline. Gene families are not available for any other Ensembl Genomes division.
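
As a rough illustration of what classification-based families mean in practice, the Python sketch below groups proteins that share the same signature accession into one family. It is not the Ensembl Bacteria pipeline and does not call any Ensembl or InterPro API. The protein identifiers and accessions are invented placeholders written in the style of HAMAP (MF_...) and PANTHER (PTHR...) accessions, and `build_families` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented (protein ID, signature accession) pairs for illustration only.
# These are placeholders, not real Ensembl Bacteria or InterPro records.
ANNOTATIONS = [
    ("protA_genome1", "MF_00001"),
    ("protB_genome2", "MF_00001"),
    ("protC_genome1", "PTHR10000"),
    ("protD_genome3", "PTHR10000"),
    ("protE_genome2", "PTHR10000"),
]


def build_families(annotations):
    """Group protein IDs by the signature accession they are assigned to.

    Hypothetical helper: each distinct accession becomes one family whose
    members are all proteins carrying that accession, across genomes.
    """
    families = defaultdict(set)
    for protein_id, signature in annotations:
        families[signature].add(protein_id)
    # Return sorted member lists so the output is deterministic.
    return {sig: sorted(members) for sig, members in families.items()}


families = build_families(ANNOTATIONS)
for signature, members in sorted(families.items()):
    print(signature, len(members), ", ".join(members))
```

Under this simplified model, a family can span any number of genomes, because membership depends only on the shared classification and not on a pairwise comparison between genomes.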