ZINC-22 rearrangement of May-24: Difference between revisions

From DISI

Latest revision as of 23:14, 29 May 2024

A few things have happened recently, which we describe below.

Enamine Macrocycles

  • We have released a new layer /zinc-22w/, Enamine macrocycles. These are based on a private library of about 150K from Enamine as follows:
  • 104,060 H19 to H39.
  • 45,985 H40 to H49
  • 654 H50-H54

We have built to H39. Next time (summer) we will build to H49.

The 104,060 expand to 144,978 with stereoisomers. That's dockable today in /zinc-22w/. To be clear, this number double-counts protonation states with different charges: if a molecule has an imidazole and there is one protonated and one unprotonated form, it counts as two. So maybe 140K really.

More ZINC-22 3D structures for docking

  • For background information about layers, see ZINC22:Layers.
  • We have begun to release a new layer, /zinc-22y/.
  • This is an incremental update.
  • We took all the molecules in 2D registered in ZINC up to H24 and asked how many of these are not available in 3D, ready-to-dock formats.
  • We found about 4 billion such molecules, just up to H24.
  • /zinc-22y/ is available to H19 as of May 23, 2024. We expect to get up to H24 fully updated by summer. Then we will turn to H25-29.

Molecule counts in 2D and 3D tranche browser

  • We have updated 2D molecule counts in ZINC-22.
  • Thus the 2D browser is now a correct summary of what we have loaded.
  • ZINC-22 2D is now about 50% bigger. Old count was around 37B. Now around 55B.
  • We are updating the 3D molecule counts in ZINC-22. Work in progress.

Smallworld and Arthor databases

  • We have rearranged the Smallworld and Arthor databases. The information is here: Smallworld Databases and Arthor Databases.

There are now five servers of each: sw (public, no password), swp (private, password required, but available), swcc (chemistry commons), swbb (building blocks) and one more that is private to UCSF. For Arthor it is the same: arthor, arthorp, arthorcc, arthorbb and a UCSF-only one.

Cartblanche22.docking.org

  • There have been a lot of bug fixes in Cartblanche22.docking.org. It is much more reliable now than earlier versions. If you had trouble with it, please try again.

Freshly updated SDI files

  • New SDI files are in /zinc-22x/sets/ as of 2024-05-20.
  • They are also available on Wynton.
  • They will be on AWS in June 2024.

The files are in /wynton/group/bks/sets/ and /nfs/exd/zinc-22x/sets/.

They contain lists of ZINC-22 tranches organized by charge-HAC-name.suffix where

  • charge: N = neutral, M = -1, O = +1, and so on.
  • HAC (heavy atom count) is H04 to H39
  • name is lead-like (HAC 17-25), frag-like (HAC 04-16), also big, greasy-leads, big-greasy
  • suffix is txt (our lab), wyn (wynton) and s3 (AWS)
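As a rough illustration of the charge-HAC-name.suffix convention, here is a minimal Python sketch that splits such a file name into its parts. Several things here are assumptions, not taken from the SDI files themselves: that the fields are separated by literal hyphens, that HAC is written as H plus two digits, that "and so on" means the charge letter counts alphabetically away from N, and the example file names. The function name parse_sdi_name is hypothetical.

```python
import re
from typing import NamedTuple


class SdiName(NamedTuple):
    charge: int   # formal charge decoded from the letter (N = 0)
    hac: int      # heavy atom count, e.g. 17 for H17
    name: str     # set name, e.g. "lead-like"
    suffix: str   # txt (our lab), wyn (Wynton) or s3 (AWS)


# Assumed layout: <charge letter>-H<two digits>-<name>.<suffix>
_PATTERN = re.compile(
    r"^(?P<charge>[A-Z])-H(?P<hac>\d{2})-(?P<name>[a-z-]+)\.(?P<suffix>txt|wyn|s3)$"
)


def parse_sdi_name(filename: str) -> SdiName:
    """Split a hypothetical SDI file name into charge, HAC, name and suffix."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a charge-HAC-name.suffix file name: {filename!r}")
    # Assumption: letters count away from N, so M = -1, N = 0, O = +1, P = +2, ...
    charge = ord(match["charge"]) - ord("N")
    return SdiName(charge, int(match["hac"]), match["name"], match["suffix"])


if __name__ == "__main__":
    print(parse_sdi_name("M-H17-lead-like.wyn"))
```

If the real files use a different separator or letter scheme, only the pattern and the charge line would need to change.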

/zinc-22c/ zwitterions

  • Recently updated to H19.

We updated /wynton/group/bks/2d/

54B SMILES in ZINC-22.

Synced to Wynton

We will sync to AWS in June 2024.


SMILES available for 3D structures

We have recomputed SMILES files for each small tranche, e.g. H20/H20P200/*.smi.gz.
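To show how the per-tranche files could be consumed, here is a minimal Python sketch that counts SMILES per small tranche under one HAC directory, following the H20/H20P200/*.smi.gz pattern above. The root directory is a placeholder you would supply, the function name count_smiles is hypothetical, and the sketch assumes each non-empty line of a .smi.gz file is one molecule with no header line.

```python
import gzip
from pathlib import Path


def count_smiles(root: Path, hac: str = "H20") -> dict[str, int]:
    """Count non-empty lines in <root>/<hac>/<hac>*/*.smi.gz, grouped by tranche."""
    counts: dict[str, int] = {}
    for path in sorted(root.glob(f"{hac}/{hac}*/*.smi.gz")):
        tranche = path.parent.name  # e.g. "H20P200"
        with gzip.open(path, "rt") as handle:
            # Assumption: one molecule per non-empty line, no header.
            n_lines = sum(1 for line in handle if line.strip())
        counts[tranche] = counts.get(tranche, 0) + n_lines
    return counts


if __name__ == "__main__":
    # Placeholder: point this at wherever the layer's SMILES files live.
    for tranche, n in count_smiles(Path("."), hac="H20").items():
        print(tranche, n)
```

Streaming through gzip.open keeps memory use flat, which matters if individual tranche files are large.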

By layers, here is where we are (May 29, 2024)

DONE: a, b, c, i, k, l, q, r, t, w, y, z

Stopped:

  • x: H18 done
  • n: H20 done
  • p: H20 done
  • m: H17 done

Almost done but still running:

  • d: H26
  • g: H28
  • h: H29
  • o: H25
  • s: H25
  • u: H25
  • v: H21

We will announce when finished. It should be mostly finished, except for x and n past H25. That last bit will take a while.