  • 44,046 genomes (43,552 bacteria and 494 archaea) from 8244 species

Come learn about Fungal Pathogen genomics!

Our next annual advanced course on fungal pathogen genomics will be in May 2019. This course brings together bioinformaticians, biologists, clinicians and computer scientists from around the world wanting to use fungal data resources. Come join us! Application and bursary deadline is 7th February 2019. See details and registration.

Archive site

All data from release 40 of Ensembl Bacteria (July 2018) is available on the archive site.

Ensembl Bacteria

Ensembl Bacteria is a browser for bacterial and archaeal genomes. These are taken from the databases of the International Nucleotide Sequence Database Collaboration (the European Nucleotide Archive at the EBI, GenBank at the NCBI, and the DNA Database of Japan).

Non-redundant genomes

The ENA houses over 90,000 prokaryotic genome assemblies, including multiple strains of many species. To reduce redundancy, we have adopted a policy (as of release 35 - April 2017) of only loading in new sequences that are relatively non-redundant with the existing data set, according to the criteria of the UniProt Knowledgebase (DOI: 10.1093/database/baw139). All strains that were present in the INSDC archives prior to this release have already been included in Ensembl Bacteria (regardless of whether they meet the new criteria) and will remain available in future.

Please note that from 2019 we will no longer be hosting redundant bacterial genomes. All bacterial genomes we currently have will continue to exist on an archive site after this date.

Data access

Data can be visualised through the Ensembl genome browser and accessed programmatically via our Perl and RESTful APIs. Data is also accessible through public MySQL databases and our FTP site containing full data dumps in FASTA, EMBL, GTF, GFF3, JSON and RDF formats.
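As a rough illustration of programmatic access through the RESTful API, the Python sketch below builds a request URL for a single-identifier lookup and fetches the JSON response. The base URL, the use of the `lookup/id` endpoint, the helper names `build_lookup_url` and `fetch_json`, and the example identifier are assumptions made for this sketch rather than details given on this page; check the REST documentation for the endpoints and server that apply to your release.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the public REST service (not stated in the text above).
REST_BASE = "https://rest.ensembl.org"


def build_lookup_url(stable_id, base=REST_BASE):
    """Build a URL for the REST 'lookup/id' endpoint for one stable identifier."""
    # Percent-encode the identifier so characters such as '/' cannot alter the path.
    quoted = urllib.parse.quote(stable_id, safe="")
    return f"{base.rstrip('/')}/lookup/id/{quoted}?content-type=application/json"


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON body of the response."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # "b0001" is a placeholder identifier chosen for illustration only.
    url = build_lookup_url("b0001")
    print(url)
    # Uncomment to perform the network request:
    # print(fetch_json(url))
```

Keeping URL construction separate from the network call makes the request easy to inspect or log before anything is sent, and lets the same helper be pointed at a different server by passing `base`.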

There are no BioMarts currently available for Ensembl Bacteria, but we are developing new, more powerful data mining tools. A selection of over 100 key bacterial genomes has been included in the pan-taxonomic Compara, and genes from all genomes have been classified into families using HAMAP and PANTHER (more details).

Ensembl Genomes is developed by EMBL-EBI and is powered by the Ensembl software system for the analysis and visualisation of genomic data. For details of our funding please click here.

