Item title:
Electron density-based GPT for optimization and suggestion of host–guest binders
Here we present a machine learning model trained on electron density for the production of host–guest binders. The binders are read out in simplified molecular-input line-entry system (SMILES) format with >98% accuracy, enabling a complete characterization of the molecules in two dimensions. Our model generates three-dimensional representations of the electron density and electrostatic potentials of host–guest systems using a variational autoencoder, and then uses these representations to optimize the generation of guests via gradient descent. Finally, the guests are converted to SMILES using a transformer. Applying our model to two established molecular host systems, cucurbit[n]uril and metal–organic cages, resulted in the discovery of 9 previously validated guests for CB[6] and 7 unreported guests (with association constant Ka ranging from 13.5 M⁻¹ to 5,470 M⁻¹) and the discovery of 4 unreported guests for [Pd₂1₄]⁴⁺ (with Ka ranging from 44 M⁻¹ to 529 M⁻¹).
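The optimization step described in the abstract (gradient descent on a learned three-dimensional representation) can be illustrated with a minimal sketch. This is not the authors' implementation. The decoder here is a fixed random linear map with a sigmoid, standing in for a trained variational autoencoder decoder; the host density is a toy hollow shell; and the objective (voxel-wise overlap between guest and host density) is an assumed proxy chosen only for illustration. All names (`decode`, `overlap_loss`, `optimize`, `GRID`, `LATENT`) are hypothetical, and the final transformer-to-SMILES step is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

GRID = 8      # hypothetical 8x8x8 voxel grid
LATENT = 4    # hypothetical latent dimension

# Stand-in for a trained VAE decoder: fixed random linear map + sigmoid.
W = rng.normal(size=(GRID**3, LATENT))
b = rng.normal(size=GRID**3)

def decode(z):
    """Map a latent vector to a non-negative toy 'guest density' on the grid."""
    return 1.0 / (1.0 + np.exp(-(W @ z + b)))

# Toy host density: a hollow shell (dense walls, empty cavity in the middle).
axis = np.linspace(-1.0, 1.0, GRID)
x, y, zc = np.meshgrid(axis, axis, axis, indexing="ij")
r = np.sqrt(x**2 + y**2 + zc**2)
host = np.exp(-((r - 0.8) ** 2) / 0.02).ravel()

def overlap_loss(z):
    """Assumed objective: total overlap of guest and host density."""
    return float(host @ decode(z))

def overlap_grad(z):
    """Analytic gradient: d/dz sum(host * sigmoid(Wz + b))."""
    g = decode(z)
    return W.T @ (host * g * (1.0 - g))

def optimize(z0, lr=0.05, steps=200):
    """Gradient descent in latent space; halve the step if the loss rises."""
    z = z0.copy()
    best = overlap_loss(z)
    for _ in range(steps):
        cand = z - lr * overlap_grad(z)
        cand_loss = overlap_loss(cand)
        if cand_loss < best:
            z, best = cand, cand_loss
        else:
            lr *= 0.5  # step was too large; shrink it and retry
    return z

z0 = rng.normal(size=LATENT)
z_opt = optimize(z0)
print(f"overlap before: {overlap_loss(z0):.3f}, after: {overlap_loss(z_opt):.3f}")
```

In this sketch the optimized variable is the latent vector, not the voxel grid itself, so every candidate stays within the range of densities the decoder can produce.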
EPSRC (grant nos. EP/L023652/1, EP/R020914/1, EP/S030603/1, EP/R01308X/1, EP/S017046/1 and EP/S019472/1);
ERC (project no. 670467 SMART-POM);
EC (project no. 766975 MADONNA);
DARPA (project nos. W911NF-18-2-0036, W911NF-17-1-0316 and HR001119S0003);
Polish National Agency for Academic Exchange grant number PPN/PPO/2020/1/00034;
National Science Center Poland grant number 2021/01/1/ST4/00007.