ZINC via curl

From DISI
Revision as of 21:08, 11 March 2020 by Jwagsta (talk | contribs)

You can query ZINC using curl as follows:

Download all the IDs in a file

curl -s --data-urlencode zinc.ids="$( cat test.list )" -d page.format=properties http://zinc.docking.org/results


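If your file is large, use the following csh/tcsh script to split it into chunks of 1000 molecules and fetch one chunk at a time. The zinc.ids@FILE form of --data-urlencode makes curl read the IDs from each chunk file directly, which avoids shell command substitution inside the loop:

split -l 1000 test.list
foreach i (x??)
       curl -o $i.mol2 --data-urlencode zinc.ids@$i -d page.format=mol2 http://zinc.docking.org/results
end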
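
For bash or another POSIX shell, where foreach is not available, the same chunking can be sketched as a function. The function name fetch_chunks and the chunk_ file prefix are assumptions chosen for illustration; the endpoint and form fields are the ones used above.

```shell
# Sketch: split an ID list into 1000-line chunks and fetch each as mol2.
# fetch_chunks and the chunk_ prefix are hypothetical names, not part of ZINC.
fetch_chunks() {
    # split writes chunk_aa, chunk_ab, ... with at most 1000 lines each
    split -l 1000 "$1" chunk_
    for f in chunk_??; do
        # one POST per chunk; the result is saved next to the chunk file
        curl -s -o "$f.mol2" --data-urlencode zinc.ids="$(cat "$f")" \
             -d page.format=mol2 http://zinc.docking.org/results
    done
}

# usage: fetch_chunks test.list
```

The glob chunk_?? matches only the two-letter chunk files, so the .mol2 outputs written by earlier iterations are not picked up as input.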

Get all the information about an ID (March 2020)

curl http://zinc15.docking.org/substances.txt  -F zinc_id-in=@./zinc.txt -F output_fields="zinc_id mwt num_atoms num_rings"

Output fields can be found at http://zinc15.docking.org/substances/help/

zinc.txt should contain one ID per line, either as a bare number or with the ZINC prefix:

7
10
ZINC92349234
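
The query above can be wrapped in a small helper so that the ID file and the output fields become arguments. This is only a sketch: the function name zinc15_lookup is hypothetical, and the default field list is simply the one from the command above.

```shell
# Sketch: wrapper around the substances.txt query shown above.
# zinc15_lookup is a hypothetical name; usage:
#   zinc15_lookup ID_FILE "field1 field2 ..."
zinc15_lookup() {
    ids_file="$1"
    fields="${2:-zinc_id mwt num_atoms num_rings}"
    # -F uploads the ID file and sends the field list as multipart form data
    curl -s http://zinc15.docking.org/substances.txt \
         -F "zinc_id-in=@$ids_file" \
         -F "output_fields=$fields"
}

# Write an ID file with the same three entries as the listing above.
printf '%s\n' 7 10 ZINC92349234 > zinc.txt

# usage: zinc15_lookup zinc.txt "zinc_id mwt"
```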

Get all the information about an ID

This includes target annotations and related data, using the "800-lb-gorilla" format:

curl -s --data-urlencode zinc.ids="$( cat test.list )" -d page.format=800-lb-gorilla http://zinc.docking.org/results

This format is essentially tab-delimited, with a few columns that are semicolon sub-delimited (and colon sub-sub-delimited). It is meant to be used with awk. I can give you JSON output if you'd like to format it yourself.
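
As a rough illustration of the awk usage mentioned above, the following sketch unpacks one nested column into one output line per entry. It assumes the sub-delimiter is a semicolon and the sub-sub-delimiter is a colon, and that the first column holds the ID. The function name explode_column is hypothetical, and the column number has to be chosen by inspecting the actual output, since the column layout is not documented here.

```shell
# Sketch: print one line per semicolon-separated entry of a given column,
# prefixed with column 1 and with colon-separated parts turned into tabs.
# explode_column is a hypothetical helper; usage:
#   explode_column COLUMN_NUMBER < results.tsv
explode_column() {
    awk -F'\t' -v col="$1" '
    {
        n = split($col, entries, ";")            # sub-delimiter: semicolon
        for (i = 1; i <= n; i++) {
            m = split(entries[i], parts, ":")    # sub-sub-delimiter: colon
            line = $1
            for (j = 1; j <= m; j++) line = line "\t" parts[j]
            print line
        }
    }'
}
```

It could be used by piping the curl output into it, for example `... http://zinc.docking.org/results | explode_column 5`, where 5 is only a placeholder column number. Rows whose chosen column is empty produce no output.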