Extending rnaSPAdes functionality for hybrid transcriptome assembly

DOI

https://doi.org/10.1186/s12859-020-03614-2
Final published version
https://doi.org/10.1101/2020.01.24.918482
Final published version

Andrey D. Prjibelski
Giuseppe D. Puglia
Dmitry Antipov
Elena Bushmanova
Daniela Giordano
Alla Mikheenko
Domenico Vitale
Alla Lapidus

Background: De novo RNA-Seq assembly is a powerful method for analysing transcriptomes when the reference genome is unavailable or poorly annotated. However, because Illumina reads are short, it is usually impossible to reconstruct complete sequences of complex genes and alternative isoforms. The recently emerged ability to generate long RNA reads, for example with PacBio and Oxford Nanopore technologies, may dramatically improve assembly quality and thus the subsequent analysis. While reference-based tools for analysing long RNA reads have recently been developed, there is no established pipeline for de novo assembly of such data. Results: In this work we present a novel method that enables high-quality de novo transcriptome assembly by combining the accuracy and reliability of short reads with exon structure information derived from long error-prone reads. The algorithm is designed by incorporating the existing hybridSPAdes approach into the rnaSPAdes pipeline and adapting it for transcriptomic data. Conclusion: To evaluate the benefit of using long RNA reads, we selected several datasets containing both Illumina and Iso-seq or Oxford Nanopore Technologies (ONT) reads. Using existing quality assessment software, we show that hybrid assemblies performed with rnaSPAdes contain more full-length genes and alternative isoforms than assemblies built from short-read data alone.

Original language	English
Article number	302
Pages (from-to)	302
Number of pages	9
Journal	BMC Bioinformatics
Volume	21
Issue number	Suppl 12
DOI	https://doi.org/10.1186/s12859-020-03614-2 https://doi.org/10.1101/2020.01.24.918482
Publication status	Published - 24 Jul 2020

Scopus subject areas

Applied Mathematics
Molecular Biology
Structural Biology
Biochemistry
Computer Science Applications

Research areas

Transcriptomics, transcriptome assembly, RNA-Seq, Oxford Nanopore, Iso-seq, hybrid assembly, de novo assembly

ID: 61160726
