Research output: Contribution to journal › Article › peer-review
HybridSPAdes: An algorithm for hybrid assembly of short and long reads. / Antipov, Dmitry; Korobeynikov, Anton; McLean, Jeffrey S.; Pevzner, Pavel A.
In: Bioinformatics, Vol. 32, No. 7, 01.04.2016, p. 1009-1015.
TY - JOUR
T1 - HybridSPAdes
T2 - An algorithm for hybrid assembly of short and long reads
AU - Antipov, Dmitry
AU - Korobeynikov, Anton
AU - McLean, Jeffrey S.
AU - Pevzner, Pavel A.
N1 - Publisher Copyright: © 2015 The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
PY - 2016/4/1
Y1 - 2016/4/1
AB - Motivation: Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has a potential to generate high-quality assemblies at reduced cost. Results: We describe hybridSPAdes algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that hybridSPAdes generates accurate assemblies (even in projects with relatively low coverage by long reads) thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads. Availability and implementation: hybridSPAdes is implemented in C++ as a part of SPAdes genome assembler and is publicly available at http://bioinf.spbau.ru/en/spades.
UR - http://www.scopus.com/inward/record.url?scp=84964474556&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btv688
DO - 10.1093/bioinformatics/btv688
M3 - Article
C2 - 26589280
VL - 32
SP - 1009
EP - 1015
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 7
ER -