Difference between revisions of "ZINC:FAQ"


Revision as of 22:24, 19 March 2009

Here are frequently asked questions about ZINC.


Q1. I am trying to generate a subset of your "drug-like" molecule subset for virtual screening. I was thinking your 60% diversity group (about 12,000 molecules) would be a place to start, and I downloaded the .smi file. I am relatively new to chemoinformatics and I was wondering if there is an elegant way to pull the compounds listed in the .smi file out of the larger library of mol2 files from the 2,000,000 "usual" set that I have downloaded from ZINC?

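A1. Download the subset's .smi file, extract the ZINC codes from its second column, turn each code into an fget2.pl request, and fetch the mol2 records into all.mol2. The commands use csh/tcsh syntax (">!" overwrites an existing file even when noclobber is set):

 wget http://zinc8.docking.org/subset1/3/3_t60.smi
 awk '{print $2}' 3_t60.smi >! codes
 sed -e 's/^/fget2.pl?f=m\&l=0\&z=/' codes > codes2
 wget -O all.mol2 -a listing -B http://zinc8.docking.org/ -i codes2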


The l parameter selects the pH model: 0 = reference (pH 7), 1 = mid (pH 5.75-8.25), 2 = hi (pH 7-8.5), 3 = lo (pH 4.5-6).
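
To illustrate how the f and l parameters combine, here is a minimal Bourne-shell sketch (not csh) that builds the request list for any of the four pH models. The helper name zinc_requests and the file names in the usage comment are placeholders of ours, not part of ZINC, and the format letters are limited to those that appear in this FAQ (f=m for mol2 above, f=h for hierarchy below).

```shell
#!/bin/sh
# Sketch only: zinc_requests is an illustrative helper, not a ZINC tool.

# Print one fget2.pl request per ZINC ID.
#   $1 = format letter (m or h, as used in this FAQ)
#   $2 = pH model (0 = reference, 1 = mid, 2 = hi, 3 = lo)
#   $3 = file with one ZINC ID per line
zinc_requests() {
    case "$2" in
        0|1|2|3) ;;
        *) echo "pH model must be 0, 1, 2 or 3" >&2; return 1 ;;
    esac
    # Inside double quotes, \\& reaches sed as \&, i.e. a literal ampersand.
    sed -e "s/^/fget2.pl?f=$1\\&l=$2\\&z=/" "$3"
}

# Usage (file names are placeholders):
#   zinc_requests m 3 codes > lo.txt
#   wget -O lo.mol2 -a listing -B http://zinc8.docking.org/ -i lo.txt
```

Rejecting values outside 0-3 up front avoids sending the server a request list for a pH model that the legend above does not define.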


Q2. I want a hierarchy format database based on ZINC IDs.

A2.

Create a file "hits.txt" containing one ZINC ID per line, then run:

 sed -e 's/^/fget2.pl?f=h\&l=0\&z=/' hits.txt > ref.txt
 wget -O ref.db -a listing -B http://zinc.docking.org/ -i ref.txt

The commands above fetch the "reference" (pH 7) models. For the additional "usual" forms, use l=1:

 sed -e 's/^/fget2.pl?f=h\&l=1\&z=/' hits.txt > mid.txt
 wget -O mid.db -a listing -B http://zinc.docking.org/ -i mid.txt

We recommend splitting hits.txt into sets of 1000 ZINC IDs each and processing each piece in turn (csh syntax):

 split -l 1000 hits.txt
 foreach i (x??)
    sed ...
    wget ...
 end
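
The loop above can be sketched in full as follows. This is a Bourne-shell translation rather than the original csh, and several details are assumptions of ours: the function name fetch_in_chunks, the chunk_ file prefix, the per-chunk output names, and the DRY_RUN switch, which by default only prints the wget commands so the sketch can be tried without touching the server.

```shell
#!/bin/sh
# Sketch only: names and the DRY_RUN switch are illustrative assumptions.

base_url="http://zinc.docking.org/"   # base URL used in this FAQ
model=0                                # pH model (0 = reference)
dry_run="${DRY_RUN:-1}"                # 1 = print wget commands, 0 = run them

# $1 = file with one ZINC ID per line, $2 = IDs per chunk (e.g. 1000)
fetch_in_chunks() {
    # Writes chunk_aa, chunk_ab, ... into the current directory.
    split -l "$2" "$1" chunk_
    for part in chunk_??; do
        # Prefix every ID with the fget2.pl query for hierarchy format.
        sed -e "s/^/fget2.pl?f=h\\&l=${model}\\&z=/" "$part" > "$part.txt"
        if [ "$dry_run" = 1 ]; then
            echo "wget -O $part.db -a listing -B $base_url -i $part.txt"
        else
            wget -O "$part.db" -a listing -B "$base_url" -i "$part.txt"
        fi
    done
}

# Usage:
#   fetch_in_chunks hits.txt 1000
```

Each chunk gets its own .db file here; whether to concatenate them afterwards depends on how the downstream docking setup expects its input.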

Please let us know if this is not clear!


-- John Irwin, March 2009