Standard

CentromereArchitect: Inference and analysis of the architecture of centromeres. / Dvorkina, Tatiana; Kunyavskaya, Olga; Bzikadze, Andrey V.; Alexandrov, Ivan; Pevzner, Pavel A.

In: Bioinformatics, Vol. 37, 01.07.2021, p. 196-204.

Research output: Contribution to journal › Article › peer-review

Author

Dvorkina, Tatiana; Kunyavskaya, Olga; Bzikadze, Andrey V.; Alexandrov, Ivan; Pevzner, Pavel A. / CentromereArchitect: Inference and analysis of the architecture of centromeres. In: Bioinformatics. 2021; Vol. 37. pp. 196-204.

BibTeX

@article{a177d0651ff94b7eb27210323ec86c8e,
title = "CentromereArchitect: Inference and analysis of the architecture of centromeres",
abstract = "Motivation: Recent advances in long-read sequencing technologies led to rapid progress in centromere assembly in the last year and, for the first time, opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. However, since these advances have not been yet accompanied by the development of the centromere-specific bioinformatics algorithms, even the fundamental questions (e.g. centromere annotation by deriving the complete set of human monomers and high-order repeats), let alone more complex questions (e.g. explaining how monomers and high-order repeats evolved) about human centromeres remain open. Moreover, even though there was a four-decade-long series of studies aimed at cataloging all human monomers and high-order repeats, the rigorous algorithmic definitions of these concepts are still lacking. Thus, the development of a centromere annotation tool is a prerequisite for follow-up personalized biomedical studies of centromeres across the human population and evolutionary studies of centromeres across various species. Results: We describe the CentromereArchitect, the first tool for the centromere annotation in a newly sequenced genome, apply it to the recently generated complete assembly of a human genome by the Telomere-to-Telomere consortium, generate the complete set of human monomers and high-order repeats for 'live' centromeres, and reveal a vast set of hybrid monomers that may represent the focal points of centromere evolution.",
keywords = "Algorithms, Base Sequence, Centromere/genetics, Genome, Humans, Telomere, ALPHA-SATELLITE DNA, REPEAT, ANNOTATION, OLD, SEQUENCE",
author = "Tatiana Dvorkina and Olga Kunyavskaya and Bzikadze, {Andrey V.} and Ivan Alexandrov and Pevzner, {Pavel A.}",
note = "Publisher Copyright: {\textcopyright} 2021 The Author(s) 2021. Published by Oxford University Press.",
year = "2021",
month = jul,
day = "1",
doi = "10.1093/bioinformatics/btab265",
language = "English",
volume = "37",
pages = "196--204",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
}

RIS

TY - JOUR

T1 - CentromereArchitect

T2 - Inference and analysis of the architecture of centromeres

AU - Dvorkina, Tatiana

AU - Kunyavskaya, Olga

AU - Bzikadze, Andrey V.

AU - Alexandrov, Ivan

AU - Pevzner, Pavel A.

N1 - Publisher Copyright: © 2021 The Author(s) 2021. Published by Oxford University Press.

PY - 2021/7/1

Y1 - 2021/7/1

AB - Motivation: Recent advances in long-read sequencing technologies led to rapid progress in centromere assembly in the last year and, for the first time, opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. However, since these advances have not been yet accompanied by the development of the centromere-specific bioinformatics algorithms, even the fundamental questions (e.g. centromere annotation by deriving the complete set of human monomers and high-order repeats), let alone more complex questions (e.g. explaining how monomers and high-order repeats evolved) about human centromeres remain open. Moreover, even though there was a four-decade-long series of studies aimed at cataloging all human monomers and high-order repeats, the rigorous algorithmic definitions of these concepts are still lacking. Thus, the development of a centromere annotation tool is a prerequisite for follow-up personalized biomedical studies of centromeres across the human population and evolutionary studies of centromeres across various species. Results: We describe the CentromereArchitect, the first tool for the centromere annotation in a newly sequenced genome, apply it to the recently generated complete assembly of a human genome by the Telomere-to-Telomere consortium, generate the complete set of human monomers and high-order repeats for 'live' centromeres, and reveal a vast set of hybrid monomers that may represent the focal points of centromere evolution.

KW - Algorithms

KW - Base Sequence

KW - Centromere/genetics

KW - Genome

KW - Humans

KW - Telomere

KW - ALPHA-SATELLITE DNA

KW - REPEAT

KW - ANNOTATION

KW - OLD

KW - SEQUENCE

UR - http://www.scopus.com/inward/record.url?scp=85111438021&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btab265

DO - 10.1093/bioinformatics/btab265

M3 - Article

C2 - 34252949

AN - SCOPUS:85111438021

VL - 37

SP - 196

EP - 204

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

ER -
