PathRacer: Racing Profile HMM Paths on Assembly Graph

Research output: peer-review

Abstract

Recently, large databases containing profile Hidden Markov Models (pHMMs) have emerged. These pHMMs may represent the sequences of antibiotic resistance genes, allelic variations among highly conserved housekeeping genes used for strain typing, and so on. A typical application of such a database is to align contigs to a pHMM in the hope that the sequence of the gene of interest lies within a single contig. This condition is often violated for metagenomes, which prevents the effective use of such databases. We present PathRacer, a novel standalone tool that aligns a profile HMM directly to the assembly graph (performing codon translation on the fly for amino acid pHMMs). The tool reports the set of most probable paths traversed by an HMM through the whole assembly graph, regardless of whether the sequence of interest is encoded on a single contig or scattered across a set of edges, thereby significantly improving the recovery of sequences of interest even from fragmented metagenome assemblies.
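To make the idea of aligning a profile to a graph rather than to individual contigs more concrete, here is a minimal sketch in Python. It is not PathRacer's algorithm: it assumes a toy graph whose nodes each carry a single nucleotide, a match-only profile (no insert or delete states, no codon translation), and it returns only the single best path instead of a set of most probable paths. The function name `best_path`, the `floor` parameter, and the example graph are all hypothetical and chosen for illustration.

```python
import math


def best_path(nodes, edges, profile, floor=1e-6):
    """Most probable graph path under a match-only profile (toy model).

    nodes:   dict node_id -> single letter carried by that node
    edges:   dict node_id -> list of successor node_ids
    profile: list of dicts letter -> emission probability, one per position
    floor:   probability used for letters missing from a profile column
    """
    def emit(j, v):
        return math.log(profile[j].get(nodes[v], floor))

    # score[j][v]: best log-probability of a path ending at node v
    # that emits profile positions 0..j
    score = [{v: emit(0, v) for v in nodes}]
    back = [{v: None for v in nodes}]
    for j in range(1, len(profile)):
        cur, ptr = {}, {}
        for u, s in score[j - 1].items():
            for v in edges.get(u, []):
                cand = s + emit(j, v)
                if v not in cur or cand > cur[v]:
                    cur[v], ptr[v] = cand, u
        score.append(cur)
        back.append(ptr)

    if not score[-1]:
        # No path in the graph is long enough to cover the profile.
        return None, float("-inf")

    # Trace back from the best-scoring end node.
    end = max(score[-1], key=score[-1].get)
    path = [end]
    for j in range(len(profile) - 1, 0, -1):
        path.append(back[j][path[-1]])
    path.reverse()
    return path, score[-1][end]


# Toy graph with a bubble: 1(A) -> 2(C) -> {3(G), 4(T)} -> 5(T)
nodes = {1: "A", 2: "C", 3: "G", 4: "T", 5: "T"}
edges = {1: [2], 2: [3, 4], 3: [5], 4: [5]}
# Profile whose consensus is ACGT; only the consensus letter is listed per column.
profile = [{"A": 0.9}, {"C": 0.9}, {"G": 0.9}, {"T": 0.9}]

path, score = best_path(nodes, edges, profile)
print(path, score)
```

Because the dynamic programme advances one profile position per layer, it follows graph edges freely, so a match that crosses from one edge of the assembly graph to another is scored the same way as one contained in a single contig. That is the property the abstract highlights; a realistic implementation would additionally need insert and delete states and a way to enumerate several high-scoring paths.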

Original language: English
Title of host publication: 6th International Conference on Algorithms for Computational Biology
Editors: Miguel A. Vega-Rodríguez, Ian Holmes, Carlos Martín-Vide
Publisher: Springer
Pages: 80-94
Number of pages: 15
ISBN (Print): 9783030181734
DOIs: https://doi.org/10.1007/978-3-030-18174-1_6
Publication status: Published - 25 May 2019
Event: 6th International Conference on Algorithms for Computational Biology, AlCoB 2019 - Berkeley
Duration: 28 May 2019 to 30 May 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11488 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Conference on Algorithms for Computational Biology, AlCoB 2019
Country: United States
City: Berkeley
Period: 28/05/19 to 30/05/19

Fingerprint

Hidden Markov models
Markov Model
Path
Genes
Graph in graph theory
Gene
Antibiotics
Amino acids
Probable
Alignment
Recovery
Profile

Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Shlemov, A., & Korobeynikov, A. (2019). PathRacer: Racing Profile HMM Paths on Assembly Graph. In M. A. Vega-Rodríguez, I. Holmes, & C. Martín-Vide (Eds.), 6th International Conference on Algorithms for Computational Biology (pp. 80-94). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11488 LNBI). Springer. https://doi.org/10.1007/978-3-030-18174-1_6
Shlemov, Alexander ; Korobeynikov, Anton. / PathRacer : Racing Profile HMM Paths on Assembly Graph. 6th International Conference on Algorithms for Computational Biology. editor / Miguel A. Vega-Rodríguez ; Ian Holmes ; Carlos Martín-Vide. Springer, 2019. pp. 80-94 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{9ee265fb536d45bb8ebabb3e286d42d1,
title = "PathRacer: Racing Profile HMM Paths on Assembly Graph",
abstract = "Recently, large databases containing profile Hidden Markov Models (pHMMs) have emerged. These pHMMs may represent the sequences of antibiotic resistance genes, allelic variations among highly conserved housekeeping genes used for strain typing, and so on. A typical application of such a database is to align contigs to a pHMM in the hope that the sequence of the gene of interest lies within a single contig. This condition is often violated for metagenomes, which prevents the effective use of such databases. We present PathRacer, a novel standalone tool that aligns a profile HMM directly to the assembly graph (performing codon translation on the fly for amino acid pHMMs). The tool reports the set of most probable paths traversed by an HMM through the whole assembly graph, regardless of whether the sequence of interest is encoded on a single contig or scattered across a set of edges, thereby significantly improving the recovery of sequences of interest even from fragmented metagenome assemblies.",
keywords = "Graph alignment, Profile HMM, Set of most probable paths",
author = "Alexander Shlemov and Anton Korobeynikov",
year = "2019",
month = "5",
day = "25",
doi = "10.1007/978-3-030-18174-1_6",
language = "English",
isbn = "9783030181734",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "80--94",
editor = "Vega-Rodr{\'i}guez, {Miguel A.} and Ian Holmes and Carlos Mart{\'i}n-Vide",
booktitle = "6th International Conference on Algorithms for Computational Biology",
address = "Germany",

}

Shlemov, A & Korobeynikov, A 2019, PathRacer: Racing Profile HMM Paths on Assembly Graph. in MA Vega-Rodríguez, I Holmes & C Martín-Vide (eds), 6th International Conference on Algorithms for Computational Biology. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11488 LNBI, Springer, pp. 80-94, Berkeley, 28/05/19. https://doi.org/10.1007/978-3-030-18174-1_6

TY - GEN

T1 - PathRacer

T2 - Racing Profile HMM Paths on Assembly Graph

AU - Shlemov, Alexander

AU - Korobeynikov, Anton

PY - 2019/5/25

Y1 - 2019/5/25

N2 - Recently, large databases containing profile Hidden Markov Models (pHMMs) have emerged. These pHMMs may represent the sequences of antibiotic resistance genes, allelic variations among highly conserved housekeeping genes used for strain typing, and so on. A typical application of such a database is to align contigs to a pHMM in the hope that the sequence of the gene of interest lies within a single contig. This condition is often violated for metagenomes, which prevents the effective use of such databases. We present PathRacer, a novel standalone tool that aligns a profile HMM directly to the assembly graph (performing codon translation on the fly for amino acid pHMMs). The tool reports the set of most probable paths traversed by an HMM through the whole assembly graph, regardless of whether the sequence of interest is encoded on a single contig or scattered across a set of edges, thereby significantly improving the recovery of sequences of interest even from fragmented metagenome assemblies.

KW - Graph alignment

KW - Profile HMM

KW - Set of most probable paths

UR - http://www.scopus.com/inward/record.url?scp=85066111513&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/pathracer-racing-profile-hmm-paths-assembly-graph

U2 - 10.1007/978-3-030-18174-1_6

DO - 10.1007/978-3-030-18174-1_6

M3 - Conference contribution

AN - SCOPUS:85066111513

SN - 9783030181734

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 80

EP - 94

BT - 6th International Conference on Algorithms for Computational Biology

A2 - Vega-Rodríguez, Miguel A.

A2 - Holmes, Ian

A2 - Martín-Vide, Carlos

PB - Springer

ER -

Shlemov A, Korobeynikov A. PathRacer: Racing Profile HMM Paths on Assembly Graph. In Vega-Rodríguez MA, Holmes I, Martín-Vide C, editors, 6th International Conference on Algorithms for Computational Biology. Springer. 2019. p. 80-94. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-18174-1_6