Standard

Mouse chromocenters DNA content: Sequencing and in silico analysis. / Ostromyshenskii, Dmitrii I.; Chernyaeva, Ekaterina N.; Kuznetsova, Inna S.; Podgornaya, Olga I.

In: BMC Genomics, Vol. 19, No. 1, 151, 20.02.2018, p. 151.

Research output: Contribution to journal › Article › peer-review

Author

Ostromyshenskii, Dmitrii I.; Chernyaeva, Ekaterina N.; Kuznetsova, Inna S.; Podgornaya, Olga I. / Mouse chromocenters DNA content: Sequencing and in silico analysis. In: BMC Genomics. 2018; Vol. 19, No. 1. p. 151.

BibTeX

@article{8409a07b3f684f1f9c23c1da348e19c9,
title = "Mouse chromocenters DNA content: Sequencing and in silico analysis",
abstract = "Background: Chromocenters are defined as punctate condensed blocks of chromatin in the interphase cell nuclei of certain cell types, with unknown biological significance. In recent years progress has been made in revealing the protein content of chromocenters, although the details of the DNA content of constitutive heterochromatin remain unclear. It is known that these regions are enriched in tandem repeats (TR) and transposable elements. Rapid improvement of genome sequencing has not helped to assemble the heterochromatic regions, owing to a lack of appropriate bioinformatics techniques. Results: Chromocenter DNA was isolated by a biochemical approach from mouse liver cell nuclei and sequenced on the Illumina MiSeq, resulting in the ChrmC dataset. Analysis of the ChrmC dataset with the available bioinformatics tools revealed that the major component of chromocenter DNA is TRs: ~ 66% MaSat and ~ 4% MiSat. Other previously classified TR families constitute ~ 1% of the ChrmC dataset. About 6% of chromocenter DNA is mostly unannotated sequences. The contigs assembled with IDBA_UD contain many fragments of the heterochromatic Y chromosome, rDNA, and other pseudogenes and non-coding DNA. A fragment of a protein-coding Sfi1 homolog gene was also found in the contigs. The Sfi1 homolog gene is located on chromosome 11 in the reference genome, very close to the Golden Pass Gap (a ~ 3 Mb empty region reserved for the pericentromeric region), and proves the purity of the chromocenter isolation. The second major fraction is non-LTR retroposons (SINE and LINE), with an overwhelming majority of LINE (~ 11% of ChrmC). Most of the LINE fragments are from the ~ 2 kb region at the end of the 2nd ORF and its flanking region. This precise ~ 2 kb LINE segment is a necessary component of mouse constitutive heterochromatin, together with TR. The third most abundant fraction is ERVs. The ERV distribution in chromocenters differs from that of the whole genome: IAP (ERV2 class) is the most numerous in ChrmC, while MaLR (ERV3 class) prevails in the reference genome. IAP and its LTR also prevail in TR-containing contigs extracted from the WGS dataset. The in silico prediction of IAP and LINE fragments in chromocenters was confirmed by direct fluorescent in situ hybridization (FISH). Conclusion: Our sequencing data on chromocenter DNA (ChrmC) demonstrate that IAP with LTR and a precise ~ 2 kb fragment of LINE represent a substantial fraction of mouse chromocenters (constitutive heterochromatin), along with TRs.",
keywords = "Animals, Chromosome Mapping, Chromosomes, Mammalian, Computational Biology/methods, Databases, Nucleic Acid, Endogenous Retroviruses/genetics, Heterochromatin/genetics, High-Throughput Nucleotide Sequencing, In Situ Hybridization, Fluorescence, Long Interspersed Nucleotide Elements, Mice, Molecular Sequence Annotation, Repetitive Sequences, Nucleic Acid, Tandem Repeat Sequences, CELLS, SATELLITE DNA, CHROMATIN ORGANIZATION, CENTROMERE, EVOLUTION, HUMAN GENOME, MINOR SATELLITE, TANDEM REPEATS, HETEROCHROMATIN, REPETITIVE DNA",
author = "Ostromyshenskii, {Dmitrii I.} and Chernyaeva, {Ekaterina N.} and Kuznetsova, {Inna S.} and Podgornaya, {Olga I.}",
year = "2018",
month = feb,
day = "20",
doi = "10.1186/s12864-018-4534-z",
language = "English",
volume = "19",
pages = "151",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central Ltd.",
number = "1",

}

RIS

TY - JOUR

T1 - Mouse chromocenters DNA content

T2 - Sequencing and in silico analysis

AU - Ostromyshenskii, Dmitrii I.

AU - Chernyaeva, Ekaterina N.

AU - Kuznetsova, Inna S.

AU - Podgornaya, Olga I.

PY - 2018/2/20

Y1 - 2018/2/20

AB - Background: Chromocenters are defined as punctate condensed blocks of chromatin in the interphase cell nuclei of certain cell types, with unknown biological significance. In recent years progress has been made in revealing the protein content of chromocenters, although the details of the DNA content of constitutive heterochromatin remain unclear. It is known that these regions are enriched in tandem repeats (TR) and transposable elements. Rapid improvement of genome sequencing has not helped to assemble the heterochromatic regions, owing to a lack of appropriate bioinformatics techniques. Results: Chromocenter DNA was isolated by a biochemical approach from mouse liver cell nuclei and sequenced on the Illumina MiSeq, resulting in the ChrmC dataset. Analysis of the ChrmC dataset with the available bioinformatics tools revealed that the major component of chromocenter DNA is TRs: ~ 66% MaSat and ~ 4% MiSat. Other previously classified TR families constitute ~ 1% of the ChrmC dataset. About 6% of chromocenter DNA is mostly unannotated sequences. The contigs assembled with IDBA_UD contain many fragments of the heterochromatic Y chromosome, rDNA, and other pseudogenes and non-coding DNA. A fragment of a protein-coding Sfi1 homolog gene was also found in the contigs. The Sfi1 homolog gene is located on chromosome 11 in the reference genome, very close to the Golden Pass Gap (a ~ 3 Mb empty region reserved for the pericentromeric region), and proves the purity of the chromocenter isolation. The second major fraction is non-LTR retroposons (SINE and LINE), with an overwhelming majority of LINE (~ 11% of ChrmC). Most of the LINE fragments are from the ~ 2 kb region at the end of the 2nd ORF and its flanking region. This precise ~ 2 kb LINE segment is a necessary component of mouse constitutive heterochromatin, together with TR. The third most abundant fraction is ERVs. The ERV distribution in chromocenters differs from that of the whole genome: IAP (ERV2 class) is the most numerous in ChrmC, while MaLR (ERV3 class) prevails in the reference genome. IAP and its LTR also prevail in TR-containing contigs extracted from the WGS dataset. The in silico prediction of IAP and LINE fragments in chromocenters was confirmed by direct fluorescent in situ hybridization (FISH). Conclusion: Our sequencing data on chromocenter DNA (ChrmC) demonstrate that IAP with LTR and a precise ~ 2 kb fragment of LINE represent a substantial fraction of mouse chromocenters (constitutive heterochromatin), along with TRs.

KW - Animals

KW - Chromosome Mapping

KW - Chromosomes, Mammalian

KW - Computational Biology/methods

KW - Databases, Nucleic Acid

KW - Endogenous Retroviruses/genetics

KW - Heterochromatin/genetics

KW - High-Throughput Nucleotide Sequencing

KW - In Situ Hybridization, Fluorescence

KW - Long Interspersed Nucleotide Elements

KW - Mice

KW - Molecular Sequence Annotation

KW - Repetitive Sequences, Nucleic Acid

KW - Tandem Repeat Sequences

KW - CELLS

KW - SATELLITE DNA

KW - CHROMATIN ORGANIZATION

KW - CENTROMERE

KW - EVOLUTION

KW - HUMAN GENOME

KW - MINOR SATELLITE

KW - TANDEM REPEATS

KW - HETEROCHROMATIN

KW - REPETITIVE DNA

UR - http://www.scopus.com/inward/record.url?scp=85042163679&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/mouse-chromocenters-dna-content-sequencing-silico-analysis

U2 - 10.1186/s12864-018-4534-z

DO - 10.1186/s12864-018-4534-z

M3 - Article

C2 - 29458329

AN - SCOPUS:85042163679

VL - 19

SP - 151

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

IS - 1

M1 - 151

ER -

ID: 36790383