for Context-Locked Cryptic Pockets
Abstract
Pathogenic missense mutations frequently cause disease through accelerated protein misfolding rather than direct functional disruption, a mechanism that leaves the vast majority of rare disease variants without any approved pharmacological intervention. Traditional structure-based drug discovery remains largely blind to these variants because it relies on static wild-type templates that fail to capture the transient conformational trajectories unique to destabilized mutant ensembles.
To overcome this, we introduce REFOLD, a fully automated computational framework for the proteome-scale de novo design of pharmacological chaperones. REFOLD couples a multi-modal GNN-Transformer fusion classifier for rescue amenability scoring with an Anisotropic Network Model (ANM) that simulates 20 mutant transition-state conformations at 1.5 Å RMSD to expose hidden, allosteric cryptic pockets undetectable in wild-type structures. A pharmacophore-guided evolutionary SMILES search, scored against RDKit molecular descriptors and the Eij pairwise Cα–Cα distance matrix of pocket-lining residues, then designs druglike, synthetically accessible small molecules tailored to these coordinate constraints.
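The conformer-generation and distance-matrix steps described above can be sketched roughly as follows. This is a minimal illustration of a textbook ANM, not REFOLD's implementation. The 15 Å cutoff, the uniform spring constant, the choice of the ten lowest non-rigid modes displaced in both directions to obtain 20 structures, the helical toy coordinates, the pocket residue indices, and all function names are assumptions made for the example.

```python
import numpy as np

def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build the 3N x 3N ANM Hessian from C-alpha coordinates (N x 3)."""
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue  # residues beyond the cutoff are not connected
            block = -gamma * np.outer(d, d) / dist2
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            hess[3*i:3*i+3, 3*i:3*i+3] -= block
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    return hess

def anm_conformers(coords, n_modes=10, rmsd=1.5, cutoff=15.0):
    """Displace the structure along +/- each low-frequency mode.

    Each eigenvector has unit norm over 3N components, so scaling it by
    rmsd * sqrt(N) gives an unsuperposed C-alpha RMSD equal to `rmsd`.
    """
    n = len(coords)
    _, vecs = np.linalg.eigh(anm_hessian(coords, cutoff))
    conformers = []
    for k in range(6, 6 + n_modes):  # skip the six rigid-body modes
        step = vecs[:, k].reshape(n, 3) * rmsd * np.sqrt(n)
        conformers.append(coords + step)
        conformers.append(coords - step)
    return conformers

def ca_distance_matrix(coords):
    """Pairwise C-alpha distance matrix (the E_ij of the text)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# Toy input: an idealized 30-residue helical C-alpha trace (assumption).
t = np.arange(30)
angle = np.radians(100.0) * t
coords = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * t])

conformers = anm_conformers(coords)     # 10 modes x 2 directions
pocket = [3, 4, 5, 6, 10, 11]           # hypothetical pocket-lining residues
e_ij = ca_distance_matrix(conformers[0][pocket])
```

In a real pipeline the coordinates would come from a mutant structure model, and each conformer would be passed to a pocket detector before the distance matrix of the selected pocket is used as a design constraint.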
We validate REFOLD on two historically intractable Class II misfolding targets. For the GBA1 L444P variant (Gaucher disease Type I), the model bypassed the collapsed canonical active site (WT fpocket druggability: 0.11), identified a distal allosteric hinge at residues A40–R41–P42–C43–D63–S64–F65–R87–M88–E89–L90 (mutant pocket druggability: 0.926, volume: 462.6 ų), and generated a bis-aromatic piperidine scaffold with MW 317 Da, SA score 2.7, and QED 0.79. For the CFTR G85E variant (cystic fibrosis), REFOLD similarly bypassed the canonical VX-809 binding site (druggability: 0.003 in the mutant) and instead targeted the exposed ICL1-TM2 junction (druggability: 0.807, volume: 484.0 ų) with a fluorinated piperazine scaffold of MW 316 Da and SA score 1.7. In both cases, the molecules are designed to act as physical “kinetic splints” that stabilize misfolded intermediates and rescue structural integrity before degradation by ER quality control.
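The pocket choice reported for the two targets can be illustrated with a small selection rule. This is a sketch, not the selection logic REFOLD actually uses. The druggability values are the ones quoted above, while the 0.5 acceptance cutoff, the `Pocket` class, and the `select_pocket` function are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pocket:
    variant: str
    site: str
    druggability: float  # fpocket-style score in [0, 1]

MIN_DRUGGABILITY = 0.5  # hypothetical acceptance cutoff, not from the paper

def select_pocket(pockets):
    """Return the most druggable pocket at or above the cutoff, or None."""
    accepted = [p for p in pockets if p.druggability >= MIN_DRUGGABILITY]
    return max(accepted, key=lambda p: p.druggability, default=None)

# Scores as quoted in the text; 0.11 is the value reported for the
# wild-type canonical site, 0.003 for the canonical site in the mutant.
gba1 = [
    Pocket("GBA1 L444P", "canonical active site", 0.11),
    Pocket("GBA1 L444P", "distal allosteric hinge", 0.926),
]
cftr = [
    Pocket("CFTR G85E", "canonical VX-809 site", 0.003),
    Pocket("CFTR G85E", "ICL1-TM2 junction", 0.807),
]

for candidates in (gba1, cftr):
    chosen = select_pocket(candidates)
    print(chosen.variant, "->", chosen.site, chosen.druggability)
```

Under this rule a variant whose pockets all fall below the cutoff yields `None`, which corresponds to rejecting it rather than designing against a poorly druggable site.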
Scaled across the full ClinVar pathogenic missense catalog (178,597 variants), REFOLD has to date generated 668 complete chaperone entries spanning 231 distinct diseases, with a mean pocket druggability of 0.835 across all accepted variants. All results are continuously published to the open-access Pharmacological Chaperone Database (PCD), which provides interactive 3D transient-conformation visualizations, full Eij matrices, molecular property profiles, and SMILES strings for every entry. Together, these results establish a scalable blueprint for zero-shot orphan disease drug discovery.