Submission Details

Molecule(s):
JAR-KUA-672ec752-1

O=C(c1cc2sccc2s1)N1CCOC(CN2CCOCC2)C1

JAR-KUA-672ec752-2

c1ccc(OC2CN(Cc3c[nH]cn3)C2)cc1

JAR-KUA-672ec752-3

CN1CCN(C(=O)CNc2c(S(N)(=O)=O)ccc3ccccc23)CC1

JAR-KUA-672ec752-4

c1cc(CN2CC3(CCOC3)C2)c2cccnc2c1

JAR-KUA-672ec752-6

O=S(=O)(Nc1nsc2ccccc12)c1ccc2ccccc2c1

JAR-KUA-672ec752-7

c1ccc(N(Cc2ccsc2)Cc2cccc(CN3CCOCC3)c2)cc1

JAR-KUA-672ec752-8

c1coc(CC2CN(Cc3cc4ccccc4[nH]3)C2)c1

JAR-KUA-672ec752-9

c1ccc2ncc(CN3CC(Cc4ccoc4)C3)cc2c1

JAR-KUA-672ec752-11

Cc1ccc2cc(CNCC(=O)N3CCN(C)CC3)[nH]c2c1

JAR-KUA-672ec752-12

CS(=O)(=O)c1ccc(CNc2nc3ccccc3[nH]2)s1

JAR-KUA-672ec752-13

CCc1ccc(CN2CCN(CCc3ccns3)CC2)cc1


Design Rationale:

Our main goal was to discover new inhibitors of M-pro using machine learning. We started with the M-pro crystallography dataset and a literature binding affinity dataset, carefully curated by removing duplicates, selecting the highest-quality data sources, and removing salts, heavy atoms, etc. We used the curated data to train a deep-learning classification model based on a graph convolutional architecture. Selecting a large enough number of negative data points for training was crucial to enable effective screening; otherwise false positives dominate the output and destroy any meaningful chance of selecting binders. Team members working on the submission have extensive experience in getting these criteria right. Fortunately, the dataset was diverse enough to enable an efficient virtual screening process, and based on prior experience a hit rate as high as 5% would not be an unreasonable expectation. Virtual screening itself was performed on pre-curated subsets of the REAL diverse dataset, using the model's output as the ranking metric. In addition, we applied a number of selection criteria in post-processing: synthesisability was the priority, but novelty and stability were also considered.
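
A rough illustration of the false-positive argument above, using Bayes' rule to estimate what fraction of predicted binders would be true binders. This is only a sketch: the function name and all numbers (base rate of binders in the library, true- and false-positive rates) are assumptions chosen for illustration, not values from the model or dataset described here.

```python
def expected_precision(base_rate, tpr, fpr):
    """Expected fraction of predicted binders that are true binders.

    base_rate: assumed fraction of true binders in the screened library
    tpr:       assumed true-positive rate of the classifier
    fpr:       assumed false-positive rate of the classifier
    """
    true_pos = base_rate * tpr
    false_pos = (1.0 - base_rate) * fpr
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    base_rate = 0.001  # assume 1 true binder per 1000 library compounds
    tpr = 0.5          # assume the model recovers half of the true binders
    for fpr in (0.05, 0.01, 0.001):
        hit_rate = expected_precision(base_rate, tpr, fpr)
        print(f"FPR = {fpr:.3f} -> expected hit rate = {hit_rate:.1%}")
```

Because true binders are rare in a large library, even a small false-positive rate produces far more false hits than true ones. Under these assumed numbers, the expected hit rate rises sharply as the false-positive rate falls, which is why a well-populated negative class matters for training.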

Other Notes:

We didn't use x0072 as a fragment!

Inspired By: