Research output: Contribution to journal › Article › peer-review
Linking protein aggregation and structural stability to predict pathogenic MYH7 variants via machine learning. / Пьянков, Иван Алексеевич; Кокорина, Марина; Рычков, Георгий Николаевич; Костарева, А. А.; Успенская, Майя Валерьевна; Каява, Андрей Вилхович.
In: Journal of Structural Biology, Vol. 218, No. 2, 108307, 02.06.2026.
TY - JOUR
T1 - Linking protein aggregation and structural stability to predict pathogenic MYH7 variants via machine learning
AU - Пьянков, Иван Алексеевич
AU - Кокорина, Марина
AU - Рычков, Георгий Николаевич
AU - Костарева, А. А.
AU - Успенская, Майя Валерьевна
AU - Каява, Андрей Вилхович
PY - 2026/3/2
Y1 - 2026/3/2
AB - As genome and gene sequencing rapidly expand, data increasingly outpace studies linking genetic variants to specific diseases, making computational methods for associating potential mutations with pathology both essential and feasible. We found that disease-causing variants associated with Myosin Storage Myopathy (MSM) generally destabilize the MYH7 α-helical coiled-coil domain more than non-disease-associated variants. Structural mapping revealed that pathogenic variants cluster in locally unwound regions of the coiled-coil dimer, suggesting that changes in these strained sites may promote dimer destabilization and aggregation. However, these features alone are insufficient to reliably predict hereditary MSM. By integrating protein aggregation, structural stability, and additional informative features, we developed RDSM-MYH7, a machine learning-based predictor for assessing the pathogenicity of missense mutations in the MYH7 rod domain. RDSM-MYH7 outperformed existing tools (F1 = 0.869, accuracy = 0.875) and can be applied to individual gene sequencing data to identify pathogenic MYH7 variants associated with storage myopathy. Its implementation in clinical screening could facilitate early diagnosis of myopathies and other hereditary protein storage diseases in which protein unfolding precedes pathological aggregation.
UR - https://linkinghub.elsevier.com/retrieve/pii/S1047847726000237
UR - https://www.mendeley.com/catalogue/f7a25961-4457-3108-be3d-8ad8442cefd4/
U2 - 10.1016/j.jsb.2026.108307
DO - 10.1016/j.jsb.2026.108307
M3 - Article
C2 - 41780808
VL - 218
JO - Journal of Structural Biology
JF - Journal of Structural Biology
SN - 1047-8477
IS - 2
M1 - 108307
ER -
ID: 150622171