Research output: Contribution to journal › Article › peer-review
Catching hidden variation: Systematic correction of reference minor allele annotation in clinical variant calling. / Barbitoff, Yury A.; Bezdvornykh, Igor V.; Polev, Dmitrii E.; Serebryakova, Elena A.; Glotov, Andrey S.; Glotov, Oleg S.; Predeus, Alexander V.
In: Genetics in Medicine, Vol. 20, No. 3, 01.03.2018, p. 360-364.
TY - JOUR
T1 - Catching hidden variation
T2 - Systematic correction of reference minor allele annotation in clinical variant calling
AU - Barbitoff, Yury A.
AU - Bezdvornykh, Igor V.
AU - Polev, Dmitrii E.
AU - Serebryakova, Elena A.
AU - Glotov, Andrey S.
AU - Glotov, Oleg S.
AU - Predeus, Alexander V.
N1 - Publisher Copyright: © 2018 American College of Medical Genetics and Genomics. Copyright: Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2018/3/1
Y1 - 2018/3/1
AB - Purpose: We comprehensively assessed the influence of reference minor alleles (RMAs), one of the inherent problems of the human reference genome sequence. Methods: The variant call format (VCF) files provided by the 1000 Genomes and Exome Aggregation Consortium (ExAC) consortia were used to identify RMA sites. All coding RMA sites were checked for concordance with UniProt and the presence of same codon variants. RMA-corrected predictions of functional effect were obtained with SIFT, PolyPhen-2, and PROVEAN standalone tools and compared with dbNSFP v2.9 for consistency. Results: We systematically characterized the problem of RMAs and identified several possible ways in which RMA could interfere with accurate variant discovery and annotation. We have discovered a systematic bias in the automated variant effect prediction at the RMA loci, as well as widespread switching of functional consequences for variants located in the same codon as the RMA. As a convenient way to address the problem of RMAs we have developed a simple bioinformatic tool that identifies variation at RMA sites and provides correct annotations for all such substitutions. The tool is available free of charge at http://rmahunter.bioinf.me. Conclusion: Correction of RMA annotation enhances the accuracy of next-generation sequencing-based methods in clinical practice.
KW - Allele frequency
KW - Minor allele
KW - Next-generation sequencing
KW - Reference
KW - Whole-exome sequencing
UR - http://www.scopus.com/inward/record.url?scp=85044572708&partnerID=8YFLogxK
U2 - 10.1038/gim.2017.168
DO - 10.1038/gim.2017.168
M3 - Article
C2 - 29155419
VL - 20
SP - 360
EP - 364
JO - Genetics in Medicine
JF - Genetics in Medicine
SN - 1098-3600
IS - 3
ER -
ID: 9133014
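
The abstract states that VCF files from the 1000 Genomes and ExAC projects were used to identify RMA sites. As a rough illustration only, and not the authors' RMA Hunter tool, the sketch below flags VCF records where an alternate allele is more frequent than the reference allele. It assumes that allele frequency is stored in an `AF` key of the INFO column and that a frequency above 0.5 marks an RMA; the actual criteria used in the paper may differ. The function names, the `RmaSite` type, and the toy records are all hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class RmaSite(NamedTuple):
    """One candidate reference minor allele site (hypothetical record type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_af: float


def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a key -> value mapping (flags map to '')."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def find_rma_sites(lines: Iterable[str], threshold: float = 0.5) -> Iterator[RmaSite]:
    """Yield sites where some alternate allele has AF above `threshold`.

    Assumption: AF lists one frequency per alternate allele, in ALT order.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the header line
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts, info = cols[0], int(cols[1]), cols[3], cols[4], cols[7]
        af_field = parse_info(info).get("AF")
        if not af_field:
            continue  # no frequency annotation, so nothing can be decided
        for alt, af in zip(alts.split(","), af_field.split(",")):
            try:
                freq = float(af)
            except ValueError:
                continue  # e.g. a missing value written as "."
            if freq > threshold:
                yield RmaSite(chrom, pos, ref, alt, freq)


# Toy records invented for illustration; they are not real variant data.
TOY_VCF = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t.\tPASS\tAF=0.93",
    "1\t2000\t.\tC\tT\t.\tPASS\tAF=0.02",
    "2\t3000\t.\tG\tA,T\t.\tPASS\tAF=0.10,0.61;DB",
]

if __name__ == "__main__":
    for site in find_rma_sites(TOY_VCF):
        print(site)
```

A site flagged this way is only a candidate: as the abstract notes, the study additionally checked coding RMA sites against UniProt and looked for other variants in the same codon, which this sketch does not attempt.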