BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs

Dmitry Meleshko, Hosein Mohimani, Vittorio Tracanna, Iman Hajirasouliha, Marnix H. Medema, Anton Korobeynikov, Pavel A. Pevzner

Research output: Contribution to journal › Article

Abstract

Predicting biosynthetic gene clusters (BGCs) is critically important for discovery of antibiotics and other natural products. While BGC prediction from complete genomes is a well-studied problem, predicting BGCs in fragmented genomic assemblies remains challenging. The existing BGC prediction tools often assume that each BGC is encoded within a single contig in the genome assembly, a condition that is violated for most sequenced microbial genomes where BGCs are often scattered through several contigs, making it difficult to reconstruct them. The situation is even more severe in shotgun metagenomics, where the contigs are often short, and the existing tools fail to predict a large fraction of long BGCs. While it is difficult to assemble BGCs in a single contig, the structure of the genome assembly graph often provides clues on how to combine multiple contigs into segments encoding long BGCs. We describe biosyntheticSPAdes, a tool for predicting BGCs in assembly graphs and demonstrate that it greatly improves the reconstruction of BGCs from genomic and metagenomics data sets.

Original language: English
Pages (from-to): 1352-1362
Journal: Genome Research
Volume: 29
Issue number: 8
DOI: https://doi.org/10.1101/gr.243477.118
Status: Published - 1 Aug 2019

Fingerprint

Multigene Family
Metagenomics
Genome
Microbial Genome
Firearms
Biological Products
Anti-Bacterial Agents

Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Meleshko, D., Mohimani, H., Tracanna, V., Hajirasouliha, I., Medema, M. H., Korobeynikov, A., & Pevzner, P. A. (2019). BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs. Genome Research, 29(8), 1352-1362. https://doi.org/10.1101/gr.243477.118
Meleshko, Dmitry ; Mohimani, Hosein ; Tracanna, Vittorio ; Hajirasouliha, Iman ; Medema, Marnix H. ; Korobeynikov, Anton ; Pevzner, Pavel A. / BiosyntheticSPAdes : reconstructing biosynthetic gene clusters from assembly graphs. In: Genome Research. 2019 ; Vol. 29, No. 8. pp. 1352-1362.
@article{16149e3d5e294f058a8c176d4271a77f,
title = "BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs",
author = "Dmitry Meleshko and Hosein Mohimani and Vittorio Tracanna and Iman Hajirasouliha and Medema, {Marnix H.} and Anton Korobeynikov and Pevzner, {Pavel A.}",
year = "2019",
month = "8",
day = "1",
doi = "10.1101/gr.243477.118",
language = "English",
volume = "29",
pages = "1352--1362",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "8",

}

Meleshko, D, Mohimani, H, Tracanna, V, Hajirasouliha, I, Medema, MH, Korobeynikov, A & Pevzner, PA 2019, 'BiosyntheticSPAdes: reconstructing biosynthetic gene clusters from assembly graphs', Genome Research, vol. 29, no. 8, pp. 1352-1362. https://doi.org/10.1101/gr.243477.118

BiosyntheticSPAdes : reconstructing biosynthetic gene clusters from assembly graphs. / Meleshko, Dmitry; Mohimani, Hosein; Tracanna, Vittorio; Hajirasouliha, Iman; Medema, Marnix H.; Korobeynikov, Anton; Pevzner, Pavel A.

In: Genome Research, Vol. 29, No. 8, 01.08.2019, pp. 1352-1362.

TY - JOUR

T1 - BiosyntheticSPAdes

T2 - reconstructing biosynthetic gene clusters from assembly graphs

AU - Meleshko, Dmitry

AU - Mohimani, Hosein

AU - Tracanna, Vittorio

AU - Hajirasouliha, Iman

AU - Medema, Marnix H.

AU - Korobeynikov, Anton

AU - Pevzner, Pavel A.

PY - 2019/8/1

Y1 - 2019/8/1

UR - http://www.scopus.com/inward/record.url?scp=85071055943&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/biosyntheticspades-reconstructing-biosynthetic-gene-clusters-assembly-graphs

U2 - 10.1101/gr.243477.118

DO - 10.1101/gr.243477.118

M3 - Article

C2 - 31160374

AN - SCOPUS:85071055943

VL - 29

SP - 1352

EP - 1362

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 8

ER -