Submission Details

Molecule(s):
O=C1CC[C@@H](C(=O)C[C@@H]2CCN(Nc3ccccc3)[C@H](c3ccccc3)C2)CN1

LUI-IND-a1be81af-1

O=C(CNC(=O)N1CCN(C(=O)Nc2cccc(Cl)c2)CC1)N1CCOCC1

LUI-IND-a1be81af-2

CN(C)C(=O)C1CCN(C(=O)c2cccc(Cc3ccc4c(n3)COC[C@@H]4N)c2)CC1

LUI-IND-a1be81af-3


Design Rationale:

A generative language model of SMILES strings was trained on the set of 829 current submissions (as of March 25, 2020). Sampling this model produced ~20,000 novel structures that bear some resemblance to the current submissions and, indirectly, to the fragments. The QED score of each generated molecule was computed, and the top 1,025 generated molecules by QED score were kept. Generated molecules identical to a submission were removed (only 7 were identical). The Tanimoto similarity of each generated molecule to each submission molecule was also computed, and any generated molecule with a Tanimoto similarity >= 0.9 to a submission was removed (in general, the generated compounds had low average Tanimoto similarity to the submission compounds). Each remaining molecule was converted to SDF format using RDKit, and then from SDF to PDB and PDBQT formats using Open Babel. The SARS-CoV-2 apo-Mpro protein structure from the 6YB7_model.pdb file (provided by Diamond Light Source) was likewise converted to PDBQT format. All of the remaining molecules were then docked against this structure using AutoDock Vina, and the best 3 molecules are presented here.
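
The filtering steps described above (QED ranking, removal of identical molecules, and the Tanimoto cutoff) could be sketched roughly as follows. This is an illustration, not the code used for this submission. It assumes QED scores and fingerprints were computed beforehand (for example with RDKit), that each fingerprint is represented as a set of on-bit indices, and that "identical" means an exact match of canonical SMILES strings. The fingerprint type is not stated in the submission, and the names `tanimoto` and `filter_generated` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed representation: a fingerprint is the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def filter_generated(
    generated: Dict[str, Tuple[float, Fingerprint]],
    submissions: Dict[str, Fingerprint],
    top_n: int = 1025,
    max_similarity: float = 0.9,
) -> List[str]:
    """Return SMILES of generated molecules that survive all three filters.

    generated maps canonical SMILES -> (precomputed QED score, fingerprint).
    submissions maps canonical SMILES -> fingerprint.
    """
    # 1. Rank by QED score (highest first) and keep the top_n molecules.
    ranked = sorted(generated, key=lambda smi: generated[smi][0], reverse=True)
    kept = ranked[:top_n]

    # 2. Drop molecules identical to an existing submission
    #    (assumption: identity is an exact canonical-SMILES match).
    kept = [smi for smi in kept if smi not in submissions]

    # 3. Drop molecules whose similarity to any submission is >= the cutoff.
    survivors = []
    for smi in kept:
        fp = generated[smi][1]
        if all(tanimoto(fp, sub_fp) < max_similarity
               for sub_fp in submissions.values()):
            survivors.append(smi)
    return survivors
```

The survivors are returned in descending QED order, so the list could be passed directly to the format-conversion and docking steps.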

Inspired By:
Discussion: