Standard

SPAligner: Alignment of long diverged molecular sequences to assembly graphs. / Dvorkina, Tatiana; Antipov, Dmitry; Korobeynikov, Anton; Nurk, Sergey.

In: BMC Bioinformatics, Vol. 21, No. Suppl 12, 306, 24.07.2020.

Research output: Contribution to journal › Article › peer-review

BibTeX

@article{6aa7576b4448476e9d81fcc2850ed305,
title = "SPAligner: Alignment of long diverged molecular sequences to assembly graphs",
abstract = "Background: Graph-based representation of genome assemblies has been recently used in different contexts - from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. While many of these applications heavily utilize the alignment of long nucleotide sequences to assembly graphs, first general-purpose software tools for finding such alignments have been released only recently and their deficiencies and limitations are yet to be discovered. Moreover, existing tools can not perform alignment of amino acid sequences, which could prove useful in various contexts - in particular the analysis of metagenomic sequencing data. Results: In this work we present a novel SPAligner (Saint-Petersburg Aligner) tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third generation sequencing reads onto assembly graphs of various complexity and also show how it can facilitate the identification of known genes in complex metagenomic datasets. Conclusions: Our work will facilitate accelerating the development of graph-based approaches in solving sequence to genome assembly alignment problem. SPAligner is implemented as a part of SPAdes tools library and is available on Github. ",
keywords = "Assembly graph, Graph alignment, Molecular sequences alignment, Genetic Variation, Sequence Alignment, Algorithms, Base Sequence, Humans, Software, Statistics as Topic, Haplotypes/genetics, beta-Lactamases/chemistry",
author = "Tatiana Dvorkina and Dmitry Antipov and Anton Korobeynikov and Sergey Nurk",
note = "Funding Information: Publication of this supplement is funded by the Russian Science Foundation (grant 19-14-00172). Research was carried out in part by computational resources provided by Resource Center “Computer Center of SPbU”. The authors are grateful to Saint Petersburg State University for the overall support of this work (project id: 51555639). Funders had no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript. Publisher Copyright: {\textcopyright} 2020 The Author(s). Copyright: Copyright 2020 Elsevier B.V., All rights reserved.; 3rd International Conference on Bioinformatics - From Algorithms to Applications (BiATA) ; Conference date: 20-06-2019 Through 22-06-2019",
year = "2020",
month = jul,
day = "24",
doi = "10.1186/s12859-020-03590-7",
language = "English",
volume = "21",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central Ltd.",
number = "Suppl 12",

}

RIS

TY - JOUR

T1 - SPAligner: Alignment of long diverged molecular sequences to assembly graphs

T2 - 3rd International Conference on Bioinformatics - From Algorithms to Applications (BiATA)

AU - Dvorkina, Tatiana

AU - Antipov, Dmitry

AU - Korobeynikov, Anton

AU - Nurk, Sergey

N1 - Funding Information: Publication of this supplement is funded by the Russian Science Foundation (grant 19-14-00172). Research was carried out in part by computational resources provided by Resource Center “Computer Center of SPbU”. The authors are grateful to Saint Petersburg State University for the overall support of this work (project id: 51555639). Funders had no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript. Publisher Copyright: © 2020 The Author(s). Copyright: Copyright 2020 Elsevier B.V., All rights reserved.

PY - 2020/7/24

Y1 - 2020/7/24

N2 - Background: Graph-based representation of genome assemblies has been recently used in different contexts - from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. While many of these applications heavily utilize the alignment of long nucleotide sequences to assembly graphs, first general-purpose software tools for finding such alignments have been released only recently and their deficiencies and limitations are yet to be discovered. Moreover, existing tools can not perform alignment of amino acid sequences, which could prove useful in various contexts - in particular the analysis of metagenomic sequencing data. Results: In this work we present a novel SPAligner (Saint-Petersburg Aligner) tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third generation sequencing reads onto assembly graphs of various complexity and also show how it can facilitate the identification of known genes in complex metagenomic datasets. Conclusions: Our work will facilitate accelerating the development of graph-based approaches in solving sequence to genome assembly alignment problem. SPAligner is implemented as a part of SPAdes tools library and is available on Github.

AB - Background: Graph-based representation of genome assemblies has been recently used in different contexts - from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. While many of these applications heavily utilize the alignment of long nucleotide sequences to assembly graphs, first general-purpose software tools for finding such alignments have been released only recently and their deficiencies and limitations are yet to be discovered. Moreover, existing tools can not perform alignment of amino acid sequences, which could prove useful in various contexts - in particular the analysis of metagenomic sequencing data. Results: In this work we present a novel SPAligner (Saint-Petersburg Aligner) tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third generation sequencing reads onto assembly graphs of various complexity and also show how it can facilitate the identification of known genes in complex metagenomic datasets. Conclusions: Our work will facilitate accelerating the development of graph-based approaches in solving sequence to genome assembly alignment problem. SPAligner is implemented as a part of SPAdes tools library and is available on Github.

KW - Assembly graph

KW - Graph alignment

KW - Molecular sequences alignment

KW - Genetic Variation

KW - Sequence Alignment

KW - Algorithms

KW - Base Sequence

KW - Humans

KW - Software

KW - Statistics as Topic

KW - Haplotypes/genetics

KW - beta-Lactamases/chemistry

UR - http://www.scopus.com/inward/record.url?scp=85088520108&partnerID=8YFLogxK

U2 - 10.1186/s12859-020-03590-7

DO - 10.1186/s12859-020-03590-7

M3 - Article

C2 - 32703258

VL - 21

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - Suppl 12

M1 - 306

Y2 - 20 June 2019 through 22 June 2019

ER -

ID: 49272157