What's New in Release 40
- 44,046 genomes (43,552 bacteria and 494 archaea) from 8,244 species
- No significant updates since the last release
Ensembl Bacteria is a browser for bacterial and archaeal genomes. The genomes are taken from the databases of the International Nucleotide Sequence Database Collaboration (the European Nucleotide Archive at the EBI, GenBank at the NCBI, and the DNA Database of Japan).
The ENA houses over 90,000 prokaryotic genome assemblies, including multiple strains of many species. To reduce redundancy, we have adopted a policy (as of release 35 - April 2017) of only loading in new sequences that are relatively non-redundant with the existing data set, according to the criteria of the UniProt Knowledgebase (DOI: 10.1093/database/baw139). All strains that were present in the INSDC archives prior to this release have already been included in Ensembl Bacteria (regardless of whether they meet the new criteria) and will remain available in future.
Please note that from 2019 we will no longer be hosting redundant bacterial genomes. All bacterial genomes we currently have will continue to exist on an archive site after this date.
Data can be visualised through the Ensembl genome browser and accessed programmatically via our Perl and RESTful APIs. Data is also accessible through public MySQL databases and our FTP site containing full data dumps in FASTA, EMBL, GTF, GFF3, JSON and RDF formats.
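As a rough illustration of programmatic access over REST, the Python sketch below builds and fetches a gene lookup request. It rests on several assumptions not stated on this page: that the service is reachable at `https://rest.ensembl.org`, that it exposes a `/lookup/symbol/:species/:symbol` endpoint, and that `escherichia_coli_str_k_12_substr_mg1655` is a valid genome name. The helper names `lookup_symbol_url` and `fetch_json` are hypothetical. Check the REST documentation for the release you are using before relying on any of these.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the public REST service; adjust for your release.
REST_BASE = "https://rest.ensembl.org"


def lookup_symbol_url(species, symbol, base=REST_BASE):
    """Build the URL for a gene lookup by symbol (hypothetical helper).

    Both path components are percent-encoded so that unusual strain
    names or symbols cannot break the URL structure.
    """
    species_part = urllib.parse.quote(species, safe="")
    symbol_part = urllib.parse.quote(symbol, safe="")
    return f"{base}/lookup/symbol/{species_part}/{symbol_part}"


def fetch_json(url, timeout=30):
    """Fetch a URL and decode the JSON response body (hypothetical helper)."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `fetch_json(lookup_symbol_url("escherichia_coli_str_k_12_substr_mg1655", "lacZ"))` would then return the gene record as a Python dictionary, provided the assumed genome name and gene symbol exist on the server. Building the URL separately from fetching it keeps the network call easy to swap out or test.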
There are no BioMarts currently available for Ensembl Bacteria, but we are developing new, more powerful data mining tools. A selection of over 100 key bacterial genomes has been included in the pan-taxonomic Compara, and genes from all genomes have been classified into families using HAMAP and PANTHER (more details).
New Ensembl Genomes Archive Sites
Ensembl Genomes now has archive sites for all divisions. These can be found at the following URLs:
The archive sites allow researchers to access data from old releases via our web-based tools. They can also display track hubs containing alignments and features located on older versions of genome assemblies that have since been upgraded on the live site. Archive sites will be searchable, and BioMarts will be available where they were produced for the site when it was live. Schema and API versions for archive sites will be the same as when the data was released; that is, archive sites will not be updated to use the most recent versions. Ensembl tools will not be active in the initial release, but we hope to enable them shortly; likewise, archival REST servers will not initially be available but will be added in future. Major bugs (those impeding the usability of the site) will be fixed, but minor bugs will not be.
The first release of the archive sites contains content from release 37. New archive sites will be released at least once a year, under URLs indicating the date of first release of the data they contain. The previously existing archive for Ensembl Plants, http://archive.plants.ensembl.org, will continue to be available at this URL, but also as http://mar2016-plants.ensembl.org, in accordance with the new naming scheme. As before, data from all recent releases will continue to be available for download at ftp://ftp.ensemblgenomes.org.