Ellman libraries: Difference between revisions

From DISI

Revision as of 20:15, 21 December 2016

This page describes the Ellman libraries of tetrahydropyridines for docking. The files are stored in /nfs/db/eldock/ in db2.gz format and are ready for immediate docking with DOCK 3.7. Each compound can be made using three commercially available building blocks:

  • an amine or aniline
  • an alkyne
  • a propenal (Michael acceptor)

All libraries so far are cut strictly at 300 amu and 3.5 LogP. We have 2D SMILES standing by for the 300-350 and 350-400 ranges, which we can build rapidly on demand. (We're fairly busy building ZINC out to 100 million molecules just now, but we can always make time...)

In the library name, the first letter gives the reagent basis:

  • A = aniline based
  • M = aMine based
  • T = tert-alkyne + amine based

The second character gives the reaction number (1, 2, or 3), using Ellman's scheme.

The third character is R or S and corresponds to the two enantiomers that can be made.

The fourth and fifth characters specify the mwt maximum: 30 = < 300, 35 = 300-350, and 40 = 350-400.
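The naming scheme above can be decoded mechanically. The following is a minimal sketch, not an official tool: the names `parse_library_name`, `LibraryName`, `REAGENT_BASIS`, and `MWT_RANGE` are made up for illustration, and the lookup tables simply restate the letter and number codes given in the text.

```python
import re
from typing import NamedTuple

# Lookup tables restating the codes described on this page.
REAGENT_BASIS = {
    "A": "aniline based",
    "M": "amine based",
    "T": "tert-alkyne + amine based",
}
MWT_RANGE = {"30": "< 300", "35": "300-350", "40": "350-400"}


class LibraryName(NamedTuple):
    """Decoded fields of a five-character library name (hypothetical helper type)."""
    basis: str
    reaction: int
    enantiomer: str
    mwt_range: str


# basis letter, reaction number, enantiomer, two-digit mwt code
_NAME_RE = re.compile(r"([AMT])([123])([RS])(30|35|40)")


def parse_library_name(name: str) -> LibraryName:
    """Split a library name such as 'T3S30' into its four components."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a recognized library name: {name!r}")
    basis, reaction, enantiomer, mwt = match.groups()
    return LibraryName(REAGENT_BASIS[basis], int(reaction), enantiomer, MWT_RANGE[mwt])
```

For a full compound code, pass only its first five characters, for example `parse_library_name("T3S3000000000013"[:5])`.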

The chunking of molecules is very fine to allow for parallel docking. 0 = ref, 1 = mid.

The format of the files is /nfs/db/eldock/<LIBRARY>/<level1>/<level1>-<level2>-{ref|mid}.db2.gz

where level1 is of the form x? and is meaningless. level2 is typically of the form x?? and also meaningless.
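As a rough illustration of the file layout above, the sketch below lists the chunk files of one library and splits a chunk path into its parts. The function names `list_chunks` and `parse_chunk_path` are hypothetical, and the example path in the usage note is invented to match the stated pattern; it is not a real file name taken from the store.

```python
import re
from pathlib import Path, PurePosixPath

# File name pattern: <level1>-<level2>-{ref|mid}.db2.gz
_CHUNK_RE = re.compile(
    r"(?P<level1>[^-/]+)-(?P<level2>[^-/]+)-(?P<kind>ref|mid)\.db2\.gz"
)


def list_chunks(library: str, root: str = "/nfs/db/eldock") -> list:
    """Return all db2.gz chunk files of a library, sorted by path."""
    return sorted(Path(root, library).glob("*/*-*-*.db2.gz"))


def parse_chunk_path(path: str) -> dict:
    """Split a chunk path into library, level1, level2 and ref/mid kind."""
    p = PurePosixPath(path)
    match = _CHUNK_RE.fullmatch(p.name)
    # The directory name must repeat the level1 part of the file name.
    if match is None or p.parent.name != match["level1"]:
        raise ValueError(f"path does not follow the chunk layout: {path!r}")
    return {
        "library": p.parent.parent.name,
        "level1": match["level1"],
        "level2": match["level2"],
        "kind": match["kind"],
    }
```

For example, `parse_chunk_path("/nfs/db/eldock/T3S30/xa/xa-xaa-ref.db2.gz")` would report library `T3S30` and kind `ref`. Each path returned by `list_chunks` can be handed to a separate docking job.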

The library names are as follows:

Aniline based

  • A1R30
  • A1S30
  • A3R30
  • A3S30

Amine based

  • M1R30
  • M1S30
  • M2R30
  • M2S30
  • M3R30
  • M3S30

Tert-alkyne / amine based

  • T1R30
  • T1S30
  • T3R30
  • T3S30

When you get a hit: working backward to the building blocks needed to make the compound

In each of the above directories, there is a "key" file, which maps SMILES to Ellman code to the three building blocks used to make it.

Say, for instance, that the molecule T3S3000000000013 scores well in docking. The first five characters give you the directory, thus "T3S30". This tells you it is a tert-alkyne / amine based version of reaction 3, it is the "S" enantiomer, and it is 300 amu or less.

To find the ZINC codes of the building blocks, run grep T3S3000000000013 /nfs/db/eldock/T3S30/T3S30.key.txt which gives you

CNC(C)(C)[C@@H]1[C@@H](C)C=C(C)CN1Cc1ccc(N)nc1	T3S3000000000013	ZINC000026895568.ZINC000001586526.ZINC000104617551

so now you know which three molecules you need to buy to make it.
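The same lookup can be scripted when there are many hits. This is a minimal sketch under two assumptions that the page does not state outright: that every key file is named <LIBRARY>.key.txt as in the T3S30 example, and that each line holds three whitespace-separated fields with the building blocks joined by periods, as in the line shown above. The names `KeyEntry`, `parse_key_line`, and `find_in_key` are hypothetical.

```python
from typing import NamedTuple, Optional


class KeyEntry(NamedTuple):
    """One line of a key file (hypothetical helper type)."""
    smiles: str
    code: str
    building_blocks: tuple


def parse_key_line(line: str) -> KeyEntry:
    """Parse 'SMILES  CODE  ZINCa.ZINCb.ZINCc' into its three fields."""
    smiles, code, blocks = line.split()
    return KeyEntry(smiles, code, tuple(blocks.split(".")))


def find_in_key(code: str, root: str = "/nfs/db/eldock") -> Optional[KeyEntry]:
    """Look up an Ellman code in its library's key file; None if absent."""
    library = code[:5]  # the first five characters name the directory
    key_path = f"{root}/{library}/{library}.key.txt"  # assumed naming
    with open(key_path) as handle:
        for line in handle:
            if code not in line:  # cheap filter, like grep
                continue
            entry = parse_key_line(line)
            if entry.code == code:  # require an exact match on the code field
                return entry
    return None
```

Calling `find_in_key("T3S3000000000013")` on a machine that mounts /nfs/db/eldock should return the entry shown above, with the three ZINC codes in `building_blocks`.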



Current Status

  • The 300 libraries are ready for docking. They have some problems, so they will be rebuilt by Feb 1.
  • They are nonetheless in a usable state right now.
  • Once the 300 libraries are complete, we will build the 300-350 ("35") libraries and the 350-400 ("40") libraries.