Peptidic natural products (PNPs) include many antibiotics and other bioactive compounds. While the recent launch of the Global Natural Products Social (GNPS) molecular networking infrastructure is transforming PNP discovery into a high-throughput technology, PNP identification algorithms are needed to realize the potential of the GNPS project. GNPS relies on the assumption that each connected component of a molecular network (representing related metabolites) illuminates the ‘dark matter of metabolomics’ as long as it contains a known metabolite present in a database. We reveal a surprising diversity of PNPs produced by related bacteria and show that, contrary to the ‘comparative metabolomics’ assumption, two related bacteria are unlikely to produce identical PNPs (even though they are likely to produce similar PNPs). Since this observation undermines the utility of GNPS, we developed a PNP identification tool, VarQuest, that illuminates the connected components in a molecular network even if they do not contain known PNPs and only contain their variants. VarQuest reveals an order of magnitude more PNP variants than all previous PNP discovery efforts and demonstrates that GNPS already contains spectra from 41% of the currently known PNP families. The enormous diversity of PNPs suggests that biosynthetic gene clusters in various microorganisms constantly evolve to generate a unique spectrum of PNP variants that differ from PNPs in other species.
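The connected-component assumption described in the abstract can be illustrated with a toy example. The sketch below is not VarQuest or the GNPS pipeline. The spectrum IDs, edges, and match sets are all hypothetical, and "illuminated" is simplified to mean "the component contains at least one annotated spectrum". It only shows how allowing variant matches, in addition to exact database matches, can increase the number of annotated components.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def illuminated(components, annotated):
    """Keep only the components that contain at least one annotated spectrum."""
    return [c for c in components if c & annotated]


# Hypothetical toy network: nodes are spectra, edges link similar spectra.
nodes = ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]
edges = [("s1", "s2"), ("s2", "s3"), ("s4", "s5"), ("s6", "s7")]

# Assumed annotations: s1 matches a known PNP exactly,
# s4 matches only as a variant of a known PNP.
exact_matches = {"s1"}
variant_matches = {"s4"}

components = connected_components(nodes, edges)
strict = illuminated(components, exact_matches)
relaxed = illuminated(components, exact_matches | variant_matches)

print(len(components), len(strict), len(relaxed))
```

In this toy network, the component containing s4 and s5 is annotated only when variant matches are allowed, and the component containing s6 and s7 stays unannotated under both rules.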
Original language: English
Pages (from-to): 319-327
Number of pages: 9
Journal: Nature Microbiology
Volume: 3
Issue number: 3
State: Published - 1 Mar 2018

    Scopus subject areas

  • Applied Microbiology and Biotechnology
  • Microbiology (medical)
  • Genetics
  • Cell Biology
  • Microbiology
  • Immunology

    Research areas

  • Computational biology and bioinformatics, Metabolomics, Natural products, Mass Spectrometry, NONRIBOSOMAL PEPTIDES, SEQUENCES, DEREPLICATION, MOLECULAR NETWORKING, POSTTRANSLATIONAL MODIFICATIONS, BACTERIAL, IDENTIFICATION, DISCOVERY, BIOSYNTHETIC GENE-CLUSTER, SPECTROMETRY

ID: 33133694