Standard

Minerva: An alignment- and reference-free approach to deconvolve Linked-Reads for metagenomics. / Danko, David C.; Meleshko, Dmitry; Bezdan, Daniela; Mason, Christopher; Hajirasouliha, Iman.

In: Genome Research, Vol. 29, No. 1, 01.2019, p. 116-124.

Research output: Contribution to journal › Article › peer-review

Harvard

Danko, DC, Meleshko, D, Bezdan, D, Mason, C & Hajirasouliha, I 2019, 'Minerva: An alignment- and reference-free approach to deconvolve Linked-Reads for metagenomics', Genome Research, vol. 29, no. 1, pp. 116-124. https://doi.org/10.1101/gr.235499.118

APA

Danko, D. C., Meleshko, D., Bezdan, D., Mason, C., & Hajirasouliha, I. (2019). Minerva: An alignment- and reference-free approach to deconvolve Linked-Reads for metagenomics. Genome Research, 29(1), 116-124. https://doi.org/10.1101/gr.235499.118

Vancouver

Author

Danko, David C.; Meleshko, Dmitry; Bezdan, Daniela; Mason, Christopher; Hajirasouliha, Iman. / Minerva: An alignment- and reference-free approach to deconvolve Linked-Reads for metagenomics. In: Genome Research. 2019; Vol. 29, No. 1. pp. 116-124.

BibTeX

@article{23b4a468e449447599d0d1c1c3187f38,
title = "Minerva: An alignment- and reference-free approach to deconvolve Linked-Reads for metagenomics",
abstract = "Emerging Linked-Read technologies (aka read cloud or barcoded short-reads) have revived interest in short-read technology as a viable approach to understand large-scale structures in genomes and metagenomes. Linked-Read technologies, such as the 10x Chromium system, use a microfluidic system and a specialized set of 3′ barcodes (aka UIDs) to tag short DNA reads sourced from the same long fragment of DNA; subsequently, the tagged reads are sequenced on standard short-read platforms. This approach results in interesting compromises. Each long fragment of DNA is only sparsely covered by reads, no information about the ordering of reads from the same fragment is preserved, and 3′ barcodes match reads from roughly 2–20 long fragments of DNA. However, compared to long-read technologies, the cost per base to sequence is far lower, far less input DNA is required, and the per base error rate is that of Illumina short-reads. In this paper, we formally describe a particular algorithmic issue common to Linked-Read technology: the deconvolution of reads with a single 3′ barcode into clusters that represent single long fragments of DNA. We introduce Minerva, a graph-based algorithm that approximately solves the barcode deconvolution problem for metagenomic data (where reference genomes may be incomplete or unavailable). Additionally, we develop two demonstrations where the deconvolution of barcoded reads improves downstream results, improving the specificity of taxonomic assignments and of k-mer-based clustering. To the best of our knowledge, we are the first to address the problem of barcode deconvolution in metagenomics.",
author = "Danko, {David C.} and Dmitry Meleshko and Daniela Bezdan and Christopher Mason and Iman Hajirasouliha",
year = "2019",
month = jan,
doi = "10.1101/gr.235499.118",
language = "English",
volume = "29",
pages = "116--124",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory",
number = "1",

}

RIS

TY - JOUR

T1 - Minerva

T2 - An alignment- and reference-free approach to deconvolve Linked-Reads for metagenomics

AU - Danko, David C.

AU - Meleshko, Dmitry

AU - Bezdan, Daniela

AU - Mason, Christopher

AU - Hajirasouliha, Iman

PY - 2019/1

Y1 - 2019/1

N2 - Emerging Linked-Read technologies (aka read cloud or barcoded short-reads) have revived interest in short-read technology as a viable approach to understand large-scale structures in genomes and metagenomes. Linked-Read technologies, such as the 10x Chromium system, use a microfluidic system and a specialized set of 3′ barcodes (aka UIDs) to tag short DNA reads sourced from the same long fragment of DNA; subsequently, the tagged reads are sequenced on standard short-read platforms. This approach results in interesting compromises. Each long fragment of DNA is only sparsely covered by reads, no information about the ordering of reads from the same fragment is preserved, and 3′ barcodes match reads from roughly 2–20 long fragments of DNA. However, compared to long-read technologies, the cost per base to sequence is far lower, far less input DNA is required, and the per base error rate is that of Illumina short-reads. In this paper, we formally describe a particular algorithmic issue common to Linked-Read technology: the deconvolution of reads with a single 3′ barcode into clusters that represent single long fragments of DNA. We introduce Minerva, a graph-based algorithm that approximately solves the barcode deconvolution problem for metagenomic data (where reference genomes may be incomplete or unavailable). Additionally, we develop two demonstrations where the deconvolution of barcoded reads improves downstream results, improving the specificity of taxonomic assignments and of k-mer-based clustering. To the best of our knowledge, we are the first to address the problem of barcode deconvolution in metagenomics.

AB - Emerging Linked-Read technologies (aka read cloud or barcoded short-reads) have revived interest in short-read technology as a viable approach to understand large-scale structures in genomes and metagenomes. Linked-Read technologies, such as the 10x Chromium system, use a microfluidic system and a specialized set of 3′ barcodes (aka UIDs) to tag short DNA reads sourced from the same long fragment of DNA; subsequently, the tagged reads are sequenced on standard short-read platforms. This approach results in interesting compromises. Each long fragment of DNA is only sparsely covered by reads, no information about the ordering of reads from the same fragment is preserved, and 3′ barcodes match reads from roughly 2–20 long fragments of DNA. However, compared to long-read technologies, the cost per base to sequence is far lower, far less input DNA is required, and the per base error rate is that of Illumina short-reads. In this paper, we formally describe a particular algorithmic issue common to Linked-Read technology: the deconvolution of reads with a single 3′ barcode into clusters that represent single long fragments of DNA. We introduce Minerva, a graph-based algorithm that approximately solves the barcode deconvolution problem for metagenomic data (where reference genomes may be incomplete or unavailable). Additionally, we develop two demonstrations where the deconvolution of barcoded reads improves downstream results, improving the specificity of taxonomic assignments and of k-mer-based clustering. To the best of our knowledge, we are the first to address the problem of barcode deconvolution in metagenomics.

UR - http://www.scopus.com/inward/record.url?scp=85059499996&partnerID=8YFLogxK

U2 - 10.1101/gr.235499.118

DO - 10.1101/gr.235499.118

M3 - Article

C2 - 30523036

AN - SCOPUS:85059499996

VL - 29

SP - 116

EP - 124

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 1

ER -
