Research output: Chapter in Book/Report/Conference proceeding › Entry for encyclopedia/dictionary › Research › peer-review
Genome sequence databases: Sequencing and assembly. / Lapidus, A. L.
Sequence Databases: Sequencing and Assembly. Reference Module in Biomedical Sciences, Encyclopedia of Microbiology. Fourth Edition. Elsevier, 2019. p. 400-418 (Encyclopedia of Microbiology).
TY - CHAP
T1 - Genome sequence databases: Sequencing and assembly
AU - Lapidus, A. L.
PY - 2019/1/1
Y1 - 2019/1/1
AB - Ever since its role in heredity was discovered, DNA has attracted interest from scientists in many fields: physicists have studied the three-dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. Analysis of the nucleotide sequence of DNA occupies a special place among these methods. Thanks to the variety of sequencing technologies available, decoding the sequence of genomic DNA (whole genome sequencing) has become robust and inexpensive. The assembly of whole genome sequences, however, remains a challenging task. In addition to the need to assemble millions of DNA fragments of different lengths (from 35 bp (Solexa) to 800 bp (Sanger) and 10 000 bp (PacBio)), strong interest in the analysis of microbial communities (metagenomes) of varying complexity, inhabiting seas, marshes and even the human gut, raises new problems and brings new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Although an automatically performed (draft) assembly can cover up to 99.5% of the genome, it very often still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1 error per 10 000 bp. A finished genome is an assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error per 100 000 bp or better), validated through a number of computational and laboratory experiments.
KW - Algorithm
KW - Contig
KW - DNA sequencing
KW - Genome finishing
KW - Misassembly
KW - Next generation sequencing
KW - Read
KW - Repeat
KW - Scaffold
KW - Whole-genome shotgun assembly
UR - https://www.mendeley.com/catalogue/abd8b32d-8e52-3b21-8a59-4b18adca3cd1/
U2 - 10.1016/B978-0-12-801238-3.02495-8
DO - 10.1016/B978-0-12-801238-3.02495-8
M3 - Article in an encyclopedia, dictionary, or reference book
SN - 9780128117378
T3 - Encyclopedia of Microbiology
SP - 400
EP - 418
BT - Sequence Databases: Sequencing and Assembly.
PB - Elsevier
ER -
ID: 64747212