HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads

Dmitry Antipov, Anton Korobeynikov, Jeffrey S. McLean, Pavel A. Pevzner


88 Citations (Scopus)

Abstract

Motivation: Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has a potential to generate high-quality assemblies at reduced cost.

Results: We describe HYBRIDSPADES algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that HYBRIDSPADES generates accurate assemblies (even in projects with relatively low coverage by long reads) thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads.
Original language: English
Number of pages: 7
Journal: Bioinformatics
Volume: 32
Issue number: 7
DOIs: 10.1093/bioinformatics/btv688
Publication status: Published - 2016

Fingerprint

Nanopores
Genome
Technology
Costs and Cost Analysis
Benchmarking
Coverage
Sequencing
Genes
Real-time
Nanopore
Molecules
Costs
Hybrid Approach
Inaccurate
Benchmark
Cell
Demonstrate

Cite this

Antipov, Dmitry ; Korobeynikov, Anton ; McLean, Jeffrey S. ; Pevzner, Pavel A. / HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads. In: Bioinformatics. 2016 ; Vol. 32, No. 7.
@article{c7a42d5005134e219060b46ba5bd26dd,
title = "HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads",
abstract = "Motivation: Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has a potential to generate high-quality assemblies at reduced cost. Results: We describe HYBRIDSPADES algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that HYBRIDSPADES generates accurate assemblies (even in projects with relatively low coverage by long reads) thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads.",
author = "Antipov, Dmitry and Korobeynikov, Anton and McLean, {Jeffrey S.} and Pevzner, {Pavel A.}",
year = "2016",
doi = "10.1093/bioinformatics/btv688",
language = "English",
volume = "32",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "7",

}


TY  - JOUR
T1  - HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads
AU  - Antipov, Dmitry
AU  - Korobeynikov, Anton
AU  - McLean, Jeffrey S.
AU  - Pevzner, Pavel A.
PY  - 2016
Y1  - 2016
AB  - Motivation: Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has a potential to generate high-quality assemblies at reduced cost. Results: We describe HYBRIDSPADES algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that HYBRIDSPADES generates accurate assemblies (even in projects with relatively low coverage by long reads) thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads.
U2  - 10.1093/bioinformatics/btv688
DO  - 10.1093/bioinformatics/btv688
M3  - Article
VL  - 32
JO  - Bioinformatics
JF  - Bioinformatics
SN  - 1367-4803
IS  - 7
ER  - 