proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes

Item Type: Article
Title: proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes
Creators: Fullam, A. (ORCID: https://orcid.org/0000-0002-0884-8124), Letunic, I. (ORCID: https://orcid.org/0000-0003-3560-4288), Schmidt, T.S.B. (ORCID: https://orcid.org/0000-0001-8587-4177), Ducarmon, Q.R. (ORCID: https://orcid.org/0000-0001-7077-2127), Karcher, N. (ORCID: https://orcid.org/0000-0001-7894-8182), Khedkar, S. (ORCID: https://orcid.org/0000-0001-6606-2202), Kuhn, M. (ORCID: https://orcid.org/0000-0002-2841-872X), Larralde, M. (ORCID: https://orcid.org/0000-0002-3947-4444), Maistrenko, O.M. (ORCID: https://orcid.org/0000-0003-1961-7548), Malfertheiner, L. (ORCID: https://orcid.org/0000-0002-5697-2007), Milanese, A. (ORCID: https://orcid.org/0000-0002-7050-2239), Rodrigues, J.F.M. (ORCID: https://orcid.org/0000-0001-8413-9920), Sanchis-López, C. (ORCID: https://orcid.org/0000-0002-8206-1565), Schudoma, C. (ORCID: https://orcid.org/0000-0003-1157-1354), Szklarczyk, D. (ORCID: https://orcid.org/0000-0002-4052-5069), Sunagawa, S. (ORCID: https://orcid.org/0000-0003-3065-0314), Zeller, G. (ORCID: https://orcid.org/0000-0003-1429-7485), Huerta-Cepas, J. (ORCID: https://orcid.org/0000-0003-4195-5025), von Mering, C. (ORCID: https://orcid.org/0000-0001-7734-9102), Bork, P. (ORCID: https://orcid.org/0000-0002-2627-833X) and Mende, D.R. (ORCID: https://orcid.org/0000-0001-6831-4557)
Abstract: The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/.
Keywords: Factual Databases, Genetic Databases, Genome, Genomics, Molecular Sequence Annotation, Prokaryotic Cells
Source: Nucleic Acids Research
ISSN: 0305-1048
Publisher: Oxford University Press
Volume: 51
Number: D1
Page Range: D760-D766
Date: 6 January 2023
Official Publication: https://doi.org/10.1093/nar/gkac1078

Open Access
MDC Library