Ellman libraries
This page describes the Ellman libraries of tetrahydropyridines for docking. The files are stored in /nfs/db/eldock/ in db2.gz format and are ready for immediate docking with DOCK 3.7. Each compound can be made from three commercially available building blocks:
- an amine or aniline
- an alkyne
- a propenal (Michael acceptor)
All libraries so far are cut strictly at 300 amu and a LogP of 3.5. We have 2D SMILES standing by for the 300-350 and 350-400 amu ranges, which we can build rapidly on demand. (We're fairly busy building ZINC up to 100 million molecules just now, but we can always make time...)
In the library name, the first letter gives the reagent basis:
- A = aniline based
- M = aMine based
- T = tert-alkyne + amine based
The second character gives the reaction number (1, 2, or 3) in Ellman's scheme.
The third character is R or S and identifies which of the two possible enantiomers the library contains.
The fourth and fifth characters specify the molecular weight range: 30 = up to 300 amu, 35 = 300-350 amu, and 40 = 350-400 amu.
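The naming scheme above can be decoded mechanically. The following is a minimal sketch, not part of any existing tooling; the names parse_library_name and LibraryName are hypothetical and introduced only for illustration.

```python
import re
from typing import NamedTuple

# Lookup tables taken from the naming scheme described on this page.
BASIS = {"A": "aniline", "M": "amine", "T": "tert-alkyne + amine"}
MW_RANGE = {"30": "up to 300", "35": "300-350", "40": "350-400"}


class LibraryName(NamedTuple):
    """Decoded parts of a five-character library name (hypothetical helper type)."""
    basis: str
    reaction: int
    enantiomer: str
    mw_range: str


def parse_library_name(name: str) -> LibraryName:
    """Split a name such as 'T3S30' into basis, reaction, enantiomer and MW range."""
    match = re.fullmatch(r"([AMT])([123])([RS])(30|35|40)", name)
    if match is None:
        raise ValueError(f"not a valid Ellman library name: {name!r}")
    basis, reaction, enantiomer, mw = match.groups()
    return LibraryName(BASIS[basis], int(reaction), enantiomer, MW_RANGE[mw])


if __name__ == "__main__":
    print(parse_library_name("T3S30"))
```

The regular expression accepts any combination of the documented letters and digits, so it will also accept names for libraries that have not been built.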
The molecules are split into very small chunks to allow parallel docking. 0 = ref, 1 = mid.
The file paths have the form /nfs/db/eldock/<LIBRARY>/<level1>/<level1>-<level2>-{ref|mid}.db2.gz
where level1 is of the form x? and level2 is typically of the form x??. Neither level carries any meaning.
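To hand the chunks to a parallel docking run, it helps to collect the file list for one library. The sketch below assumes only the path layout given above; the function name list_chunks and its parameters are hypothetical.

```python
from pathlib import Path
from typing import List


def list_chunks(library: str, state: str = "ref",
                root: str = "/nfs/db/eldock") -> List[Path]:
    """Return the sorted db2.gz chunk files of one library for 'ref' or 'mid'.

    Assumes the layout <root>/<LIBRARY>/<level1>/<level1>-<level2>-<state>.db2.gz.
    """
    if state not in ("ref", "mid"):
        raise ValueError("state must be 'ref' or 'mid'")
    # level1 and level2 are meaningless labels, so wildcards are enough.
    return sorted(Path(root, library).glob(f"*/*-*-{state}.db2.gz"))


if __name__ == "__main__":
    for chunk in list_chunks("T3S30"):
        print(chunk)
```

Each returned path can then be given to a separate docking job.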
The library names are as follows:
Aniline based
- A1R30
- A1S30
- A3R30
- A3S30
Amine based
- M1R30
- M1S30
- M2R30
- M2S30
- M3R30
- M3S30
Tert-alkyne / amine based
- T1R30
- T1S30
- T3R30
- T3S30
Working backward from a hit to the building blocks
In each of the above directories, there is a "key" file, which maps SMILES to Ellman code to the three building blocks used to make it.
Say, for instance, that the molecule T3S3000000000013 scores well in docking. The first five characters give you the directory, "T3S30". This tells you it comes from the tert-alkyne / amine based version of reaction 3, it is the "S" enantiomer, and it is 300 amu or less.
To find the ZINC codes of the building blocks, run grep T3S3000000000013 /nfs/db/eldock/T3S30/T3S30.key.txt, which gives you
CNC(C)(C)[C@@H]1[C@@H](C)C=C(C)CN1Cc1ccc(N)nc1 T3S3000000000013 ZINC000026895568.ZINC000001586526.ZINC000104617551
so now you know which three molecules you need to buy to make it.
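The same lookup can be scripted when there are many hits. This is a minimal sketch, assuming that every key file line has the three whitespace-separated columns shown in the example (SMILES, Ellman code, and the building block ZINC IDs joined by "."). The names KeyEntry, key_file_for, parse_key_line and find_building_blocks are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class KeyEntry(NamedTuple):
    """One key file row (hypothetical helper type)."""
    smiles: str
    code: str
    building_blocks: Tuple[str, ...]


def key_file_for(code: str, root: str = "/nfs/db/eldock") -> str:
    """The first five characters of the molecule code name the library directory."""
    library = code[:5]
    return f"{root}/{library}/{library}.key.txt"


def parse_key_line(line: str) -> KeyEntry:
    """Parse 'SMILES CODE ZINC.ZINC.ZINC' (assumed column layout)."""
    smiles, code, blocks = line.split()
    return KeyEntry(smiles, code, tuple(blocks.split(".")))


def find_building_blocks(code: str, root: str = "/nfs/db/eldock") -> Optional[KeyEntry]:
    """Scan the library's key file for one molecule code; None if it is absent."""
    with open(key_file_for(code, root)) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) == 3 and fields[1] == code:
                return parse_key_line(line)
    return None


if __name__ == "__main__":
    print(find_building_blocks("T3S3000000000013"))
```

Matching on the exact second column, rather than on a substring as grep does, avoids accidental hits on longer codes that share a prefix.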
Current Status
- complete: A1R30, A1S30, T1R30, T1S30, T3R30, T3S30 (also uploaded to AWS S3)
- 3D building in progress: A3R30, A3S30, M1R30, M1S30, M2R30, M2S30, M3R30, M3S30
- once the above are complete: 300-350 ("35") libraries
- once those are complete: 350-400 ("40") libraries