ZINC-22 rearrangement of May-24


Latest revision as of 23:14, 29 May 2024

A few things have happened recently, which we describe below.

Enamine Macrocycles

  • We have released a new layer, /zinc-22w/, of Enamine macrocycles. It is based on a private Enamine library of about 150K molecules, distributed by heavy atom count as follows:
  • 104,060 in H19 to H39
  • 45,985 in H40 to H49
  • 654 in H50 to H54

We have built to H39. Next time (summer) we will build to H49.

The 104,060 expand to 144,978 with stereoisomers, all dockable today in /zinc-22w/. To be clear, this number double-counts protonation states with different charges: if a molecule has an imidazole and appears in both a protonated and an unprotonated form, it counts as two. So the real number is perhaps 140K.

More ZINC-22 3D structures for docking

  • For background information about layers, see ZINC22:Layers.
  • We have begun to release a new layer, /zinc-22y/.
  • This is an incremental update.
  • We took all the 2D molecules registered in ZINC up to H24 and asked how many of these are not available in 3D, ready-to-dock formats.
  • We found about 4 billion such molecules, just up to H24.
  • /zinc-22y/ is available to H19 as of May 23, 2024. We expect to get up to H24 fully updated by summer. Then we will turn to H25-29.

Molecule counts in 2D and 3D tranche browser

  • We have updated 2D molecule counts in ZINC-22.
  • Thus the 2D browser is now a correct summary of what we have loaded.
  • ZINC-22 2D is now about 50% bigger. Old count was around 37B. Now around 55B.
  • We are updating the 3D molecule counts in ZINC-22. Work in progress.

Smallworld and Arthor databases

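  • We have rearranged the Smallworld and Arthor databases. The information is here: Smallworld Databases, Arthor Databases.

There are now five servers of each: sw (public, no password), swp (private, password-protected, but available), swcc (chemistry commons), swbb (building blocks), and one more that is private to UCSF. Arthor follows the same scheme: arthor, arthorp, arthorcc, arthorbb, and a UCSF-only one.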

Cartblanche22.docking.org

  • There have been a lot of bug fixes in Cartblanche22.docking.org. It is much more reliable now than earlier versions. If you had trouble with it, please try again.

Freshly updated SDI files

  • New SDI files are in /zinc-22x/sets/ as of 2024-05-20.
  • They are also available on Wynton.
  • They will be on AWS in June 2024.
  • Locations: /wynton/group/bks/sets/ and /nfs/exd/zinc-22x/sets/

They contain lists of ZINC-22 tranches, named charge-HAC-name.suffix, where

  • charge: N = neutral, M = -1, O = +1, and so on.
  • HAC: H04 to H39.
  • name: lead-like (HAC 17-25), frag-like (HAC 04-16), or big, greasy-leads, big-greasy.
  • suffix: txt (our lab), wyn (Wynton), or s3 (AWS).
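
The naming scheme above can be sketched as a small parser. This is only an illustration under stated assumptions: the example filename `N-H20-lead-like.txt` is hypothetical, the hyphen separators are read literally from "charge-HAC-name.suffix", and the rule that charge letters step alphabetically away from N is extrapolated from the three letters given (N, M, O). The helper names `charge_from_letter`, `parse_sdi_name` and `SdiName` are made up for this sketch.

```python
from dataclasses import dataclass

# Suffixes listed in the text: txt (lab), wyn (Wynton), s3 (AWS).
KNOWN_SUFFIXES = {"txt", "wyn", "s3"}


@dataclass(frozen=True)
class SdiName:
    charge: int   # formal charge decoded from the charge letter
    hac: int      # heavy atom count, e.g. 20 for H20
    name: str     # e.g. "lead-like", "frag-like", "big-greasy"
    suffix: str   # txt, wyn or s3


def charge_from_letter(letter: str) -> int:
    """Decode a charge letter. ASSUMPTION: letters step alphabetically
    away from N (N=0, M=-1, O=+1); only those three are in the text."""
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError(f"unexpected charge field: {letter!r}")
    return ord(letter.upper()) - ord("N")


def parse_sdi_name(filename: str) -> SdiName:
    """Split a (hypothetical) name of the form charge-HAC-name.suffix."""
    stem, _, suffix = filename.rpartition(".")
    if suffix not in KNOWN_SUFFIXES:
        raise ValueError(f"unknown suffix in {filename!r}")
    # The name part may itself contain hyphens, so split at most twice.
    charge_part, hac_part, name = stem.split("-", 2)
    if not (hac_part.startswith("H") and hac_part[1:].isdigit()):
        raise ValueError(f"unexpected HAC field: {hac_part!r}")
    return SdiName(charge_from_letter(charge_part), int(hac_part[1:]), name, suffix)
```

For example, `parse_sdi_name("M-H12-frag-like.s3")` yields a charge of -1, a HAC of 12, the name `frag-like` and the suffix `s3`. Check the actual files in the sets directories before relying on the separator or the letter-to-charge rule.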

/zinc-22c/ zwitterions

  • Recently updated to H19.

We updated /wynton/group/bks/2d/

54B SMILES in ZINC-22.

Synced to Wynton

We will sync to AWS in June 2024.


SMILES available for 3D structures

We have recomputed SMILES files for each small tranche, e.g. H20/H20P200/*.smi.gz.
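
As a rough sketch of how the per-tranche files could be checked, the snippet below counts the non-empty lines in every `*.smi.gz` file of one tranche directory. It assumes one molecule per line and no header line, which the text does not state; the function name `count_smiles` is made up for this example.

```python
import gzip
from pathlib import Path


def count_smiles(tranche_dir: Path) -> dict[str, int]:
    """Return {file name: number of non-empty lines} for each *.smi.gz
    file directly inside tranche_dir.
    ASSUMPTION: one SMILES record per line, no header line."""
    counts: dict[str, int] = {}
    for path in sorted(tranche_dir.glob("*.smi.gz")):
        # Open in text mode so lines are decoded strings, not bytes.
        with gzip.open(path, "rt") as handle:
            counts[path.name] = sum(1 for line in handle if line.strip())
    return counts
```

Called as `count_smiles(Path("H20/H20P200"))`, it returns one entry per SMILES file in that tranche, which can then be summed for a tranche total.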

By layers, here is where we are (May 29, 2024):

  • Done: a, b, c, i, k, l, q, r, t, w, y, z
  • Done to the stated tranche, then stopped: x H18, n H20, p H20, m H17
  • Almost done but still running: d H26, g H28, h H29, o H25, s H25, u H25, v H21

We will announce when this is finished. It should be mostly finished, except for x and n past H25. That last bit will take a while.