Research output: Contribution to journal › Article › peer-review
Mouse chromocenters DNA content : Sequencing and in silico analysis. / Ostromyshenskii, Dmitrii I.; Chernyaeva, Ekaterina N.; Kuznetsova, Inna S.; Podgornaya, Olga I.
In: BMC Genomics, Vol. 19, No. 1, 151, 20.02.2018, p. 151.
TY - JOUR
T1 - Mouse chromocenters DNA content
T2 - Sequencing and in silico analysis
AU - Ostromyshenskii, Dmitrii I.
AU - Chernyaeva, Ekaterina N.
AU - Kuznetsova, Inna S.
AU - Podgornaya, Olga I.
PY - 2018/2/20
Y1 - 2018/2/20
N2 - Background: Chromocenters are punctate, condensed blocks of chromatin found in the interphase nuclei of certain cell types; their biological significance is unknown. In recent years, progress has been made in characterizing the protein content of chromocenters, but the DNA content of constitutive heterochromatin remains unclear. These regions are known to be enriched in tandem repeats (TRs) and transposable elements. Rapid improvements in genome sequencing have not made heterochromatic regions easier to assemble, owing to a lack of appropriate bioinformatics techniques. Results: Chromocenter DNA was isolated by a biochemical approach from mouse liver cell nuclei and sequenced on the Illumina MiSeq, yielding the ChrmC dataset. Analysis of the ChrmC dataset with available bioinformatics tools revealed that the major component of chromocenter DNA is TRs: ~66% MaSat and ~4% MiSat. Other previously classified TR families constitute ~1% of the ChrmC dataset. About 6% of chromocenter DNA consists of mostly unannotated sequences. The contigs assembled with IDBA_UD contain many fragments of the heterochromatic Y chromosome, rDNA, and other pseudogenes and non-coding DNA. A fragment of a protein-coding Sfi1 homolog gene was also found in the contigs. In the reference genome, the Sfi1 homolog gene is located on chromosome 11, very close to the Golden Pass Gap (a ~3 Mb empty region reserved for the pericentromeric region), and its presence in the contigs proves the purity of the chromocenter isolation. The second major fraction is non-LTR retroposons (SINE and LINE), with LINEs the overwhelming majority (~11% of ChrmC). Most of the LINE fragments come from the ~2 kb region at the end of the second ORF and its flanking region. This precise ~2 kb LINE segment, together with TRs, is a necessary component of mouse constitutive heterochromatin. The third most abundant fraction is ERVs. The ERV distribution in chromocenters differs from that of the whole genome: IAP (ERV2 class) is the most numerous in ChrmC, whereas MaLR (ERV3 class) prevails in the reference genome. IAP and its LTR also prevail in TR-containing contigs extracted from the WGS dataset. The in silico prediction of IAP and LINE fragments in chromocenters was confirmed by direct fluorescent in situ hybridization (FISH). Conclusion: Our sequencing data for chromocenter DNA (ChrmC) demonstrate that IAP with its LTR and a precise ~2 kb LINE fragment represent a substantial fraction of mouse chromocenters (constitutive heterochromatin), along with TRs.
KW - Animals
KW - Chromosome Mapping
KW - Chromosomes, Mammalian
KW - Computational Biology/methods
KW - Databases, Nucleic Acid
KW - Endogenous Retroviruses/genetics
KW - Heterochromatin/genetics
KW - High-Throughput Nucleotide Sequencing
KW - In Situ Hybridization, Fluorescence
KW - Long Interspersed Nucleotide Elements
KW - Mice
KW - Molecular Sequence Annotation
KW - Repetitive Sequences, Nucleic Acid
KW - Tandem Repeat Sequences
KW - CELLS
KW - SATELLITE DNA
KW - CHROMATIN ORGANIZATION
KW - CENTROMERE
KW - EVOLUTION
KW - HUMAN GENOME
KW - MINOR SATELLITE
KW - TANDEM REPEATS
KW - HETEROCHROMATIN
KW - REPETITIVE DNA
UR - http://www.scopus.com/inward/record.url?scp=85042163679&partnerID=8YFLogxK
UR - http://www.mendeley.com/research/mouse-chromocenters-dna-content-sequencing-silico-analysis
U2 - 10.1186/s12864-018-4534-z
DO - 10.1186/s12864-018-4534-z
M3 - Article
C2 - 29458329
AN - SCOPUS:85042163679
VL - 19
SP - 151
JO - BMC Genomics
JF - BMC Genomics
SN - 1471-2164
IS - 1
M1 - 151
ER -
ID: 36790383