As genome and gene sequencing rapidly expand, sequence data increasingly outpace the studies that link genetic variants to specific diseases, making computational methods for associating potential mutations with pathology both essential and feasible. We found that disease-causing variants associated with Myosin Storage Myopathy (MSM) generally destabilize the MYH7 α-helical coiled-coil domain more than non-disease-associated variants do. Structural mapping revealed that pathogenic variants cluster in locally unwound regions of the coiled-coil dimer, suggesting that changes at these strained sites may promote dimer destabilization and aggregation. However, these features alone are insufficient to reliably predict hereditary MSM. By integrating protein aggregation, structural stability, and additional informative features, we developed RDSM-MYH7, a machine learning-based predictor for assessing the pathogenicity of missense mutations in the MYH7 rod domain. RDSM-MYH7 outperformed existing tools (F1 = 0.869, accuracy = 0.875) and can be applied to individual gene sequencing data to identify pathogenic MYH7 variants associated with storage myopathy. Its implementation in clinical screening could facilitate early diagnosis of myopathies and other hereditary protein storage diseases in which protein unfolding precedes pathological aggregation.
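For readers less familiar with the two reported metrics, the following is a minimal sketch of how F1 and accuracy are conventionally computed from a binary confusion matrix (pathogenic vs. non-pathogenic). The `Confusion` class, the function names, and the counts are hypothetical illustrations; they are not taken from the study and do not reproduce the RDSM-MYH7 evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (hypothetical container, not from the paper)."""
    tp: int  # pathogenic variants predicted pathogenic
    fp: int  # non-pathogenic variants predicted pathogenic
    fn: int  # pathogenic variants predicted non-pathogenic
    tn: int  # non-pathogenic variants predicted non-pathogenic


def accuracy(c: Confusion) -> float:
    """Fraction of all variants classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def f1_score(c: Confusion) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


if __name__ == "__main__":
    # Illustrative counts only; these are assumptions, not study data.
    example = Confusion(tp=8, fp=2, fn=2, tn=8)
    print(f"accuracy = {accuracy(example):.3f}, F1 = {f1_score(example):.3f}")
```

F1 ignores true negatives, so it can differ noticeably from accuracy when pathogenic and non-pathogenic variants are unevenly represented; reporting both gives a fuller picture of classifier behaviour.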

Original language: English
Article number: 108307
Journal: Journal of Structural Biology
Volume: 218
Issue number: 2
Early online date: 2 Mar 2026
State: E-pub ahead of print - 2 Mar 2026

ID: 150622171