s-dePooler: Determination of polymorphism carriers from overlapping DNA pools. / Zhernakov, Aleksandr Igorevich; Afonin, Alexey Mikhailovich; Gavriliuk, Natalia Dmitrievna; Moiseeva, Olga Mikhailovna; Zhukov, Vladimir Aleksandrovich.
In: BMC Bioinformatics, Vol. 20, No. 1, 45, 22.01.2019.
Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - s-dePooler
T2 - Determination of polymorphism carriers from overlapping DNA pools
AU - Zhernakov, Aleksandr Igorevich
AU - Afonin, Alexey Mikhailovich
AU - Gavriliuk, Natalia Dmitrievna
AU - Moiseeva, Olga Mikhailovna
AU - Zhukov, Vladimir Aleksandrovich
N1 - Publisher Copyright: © 2019 The Author(s).
PY - 2019/1/22
Y1 - 2019/1/22
N2 - Background: Sample pooling is widely used in studies to reduce cost and labour. DNA sample pooling combined with massively parallel sequencing is a powerful tool for discovering DNA variants (polymorphisms) in large populations, and it underpins research fields such as genome-wide association studies and evolutionary and population studies. Using overlapping pools, in which each sample is present in multiple pools, can improve the accuracy of polymorphism detection and allows carriers of rare variants to be identified. Surprisingly, there is a lack of tools for result interpretation and carrier identification, i.e. for "depooling". Results: Here we present s-dePooler, an application for analysing data from pooling experiments. s-dePooler uses the variant information (a VCF file) and the pooling scheme to produce a list of candidate carriers for each polymorphism. We incorporated s-dePooler into a pipeline (dePoP) that automates pooling analysis. The performance of the pipeline was tested on a synthetic dataset built from 1000 Genomes Project data, resulting in the successful identification of 97% of carriers of polymorphisms present in fewer than ~10% of samples. Conclusions: s-dePooler, along with dePoP, can be used to identify carriers of polymorphisms in overlapping pools and is compatible with any pooling scheme with equivalent molar ratios of pooled samples. s-dePooler and dePoP, with usage instructions and test data, are freely available at https://github.com/lab9arriam/depop.
KW - Depooling
KW - DNA pools
KW - Overlapping pools
KW - Polymorphism discovery
KW - Sample pooling
UR - http://www.scopus.com/inward/record.url?scp=85060292432&partnerID=8YFLogxK
U2 - 10.1186/s12859-019-2616-9
DO - 10.1186/s12859-019-2616-9
M3 - Article
C2 - 30669964
AN - SCOPUS:85060292432
VL - 20
JO - BMC Bioinformatics
JF - BMC Bioinformatics
SN - 1471-2105
IS - 1
M1 - 45
ER -
ID: 89279348