Research output: Contribution to journal › Article › Peer-review
Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies. / Mc Cartney, Ann M.; Shafin, Kishwar; Alonge, Michael; Bzikadze, Andrey V.; Formenti, Giulio; Fungtammasan, Arkarachai; Howe, Kerstin; Jain, Chirag; Koren, Sergey; Logsdon, Glennis A.; Miga, Karen H.; Mikheenko, Alla; Paten, Benedict; Shumate, Alaina; Soto, Daniela C.; Sović, Ivan; Wood, Jonathan M.D.; Zook, Justin M.; Phillippy, Adam M.; Rhie, Arang.
In: Nature Methods, Vol. 19, No. 6, 06.2022, p. 687-695.
TY - JOUR
T1 - Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies
AU - Mc Cartney, Ann M.
AU - Shafin, Kishwar
AU - Alonge, Michael
AU - Bzikadze, Andrey V.
AU - Formenti, Giulio
AU - Fungtammasan, Arkarachai
AU - Howe, Kerstin
AU - Jain, Chirag
AU - Koren, Sergey
AU - Logsdon, Glennis A.
AU - Miga, Karen H.
AU - Mikheenko, Alla
AU - Paten, Benedict
AU - Shumate, Alaina
AU - Soto, Daniela C.
AU - Sović, Ivan
AU - Wood, Jonathan M.D.
AU - Zook, Justin M.
AU - Phillippy, Adam M.
AU - Rhie, Arang
N1 - Mc Cartney, A.M., Shafin, K., Alonge, M. et al. Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies. Nat Methods (2022). https://doi.org/10.1038/s41592-022-01440-3
PY - 2022/6
Y1 - 2022/6
N2 - Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
KW - Female
KW - Genome, Human
KW - High-Throughput Nucleotide Sequencing/methods
KW - Humans
KW - Nanopores
KW - Pregnancy
KW - Sequence Analysis, DNA/methods
KW - Telomere/genetics
UR - http://www.scopus.com/inward/record.url?scp=85127481319&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/bb4a1c9f-c9b3-3ff2-8982-8e1774c8edfe/
U2 - 10.1038/s41592-022-01440-3
DO - 10.1038/s41592-022-01440-3
M3 - Article
C2 - 35361931
AN - SCOPUS:85127481319
VL - 19
SP - 687
EP - 695
JO - Nature Methods
JF - Nature Methods
SN - 1548-7091
IS - 6
ER -
ID: 94683479