How to download zinc-22 using rsync

Latest revision as of 21:10, 3 August 2022

Ok, you can try this:

rsync -Larv --include='*/'  --include='[a-z]/H[01]?*-*db2.tgz' --exclude='sets' --exclude='*' --verbose rsync://files.docking.org/ZINC22-3D/zinc-22<?> .

(all on one line)

where <?> is one of the layer letters: d g h i k l m n o p q r s t u v x
n is 50% of the database
x is 25% of the database
g is the "informer set"

This will get you all molecules below H20 in the chosen layer of ZINC-22, in the db2 format. If you want sdf, mol2 or pdbqt, just change db2 in the include pattern to the relevant one.
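If you want more than one layer, the same command can be generated in a loop. The following is only a sketch: it prints the rsync commands instead of running them, so you can check them first. The function name zinc22_rsync_cmd and the variables layers and format are made up for this example; the rsync options and the URL are copied from the command above.

```shell
#!/bin/sh
# Layer letters as listed above.
layers="d g h i k l m n o p q r s t u v x"
# File format to fetch: db2, sdf, mol2 or pdbqt.
format="db2"

# Hypothetical helper: print the rsync command for one layer.
#   $1 = layer letter, $2 = file format
zinc22_rsync_cmd() {
    printf "rsync -Larv --include='*/' --include='[a-z]/H[01]?*-*%s.tgz' --exclude='sets' --exclude='*' --verbose rsync://files.docking.org/ZINC22-3D/zinc-22%s .\n" "$2" "$1"
}

# Print one command per layer; nothing is downloaded here.
for layer in $layers; do
    zinc22_rsync_cmd "$layer" "$format"
done
```

Once the printed commands look right, you can paste them into a shell one at a time, or pipe the output to sh.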

We recommend starting below H20 (hence the H[01]? in the pattern above). Once you have everything up to H19, add H20, then H21, and so on progressively. Each is typically 50% bigger than the previous one. H25 and H26 together are more than 60% of the database. You can do a lot of productive docking with H13-H16 (fragment-like) and H17-H19 (small lead-like).
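To add H20, H21 and so on one at a time, the include pattern can be narrowed to a single heavy atom count. This is a sketch under an assumption: that file names begin with H followed by a two-digit count, which is what the H[01]? pattern above suggests. The helper name zinc22_include_for_h and the choice of layer "d" are only for illustration. As before, the commands are printed, not run.

```shell
#!/bin/sh
# Hypothetical helper: build the include pattern for one heavy atom count.
#   $1 = heavy atom count without a leading zero (e.g. 9, 20)
#   $2 = file format (defaults to db2)
# Assumes file names start with "H" plus a two-digit count.
zinc22_include_for_h() {
    printf "[a-z]/H%02d*-*%s.tgz\n" "$1" "${2:-db2}"
}

layer="d"   # example layer; pick any of the letters listed above

# Print the commands for H20 and H21; nothing is downloaded here.
for h in 20 21; do
    printf "rsync -Larv --include='*/' --include='%s' --exclude='sets' --exclude='*' --verbose rsync://files.docking.org/ZINC22-3D/zinc-22%s .\n" \
        "$(zinc22_include_for_h "$h")" "$layer"
done
```

Since each count is typically 50% bigger than the previous one, it is worth checking free disk space before running each step.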

 

The layers have no meaning, other than that they allow us to prepare the database independently in steps. If you want only a subset, you could try using the 3D tranche browser at cartblanche22.docking.org to make a precise selection.

I hope this helps.