ZINC via curl

[[Category:Programmatic access]]
[[Category:Tutorials]]
[[Category:ZINC]]
[[Category:API]]
[[Category:NEED ATTENTION]]

Latest revision as of 21:08, 11 March 2020

You can query ZINC using curl as follows:

Download all the IDs in a file

curl -s --data-urlencode zinc.ids="$( cat test.list )" -d page.format=properties http://zinc.docking.org/results


If your file is large, use the following csh/tcsh script to split it into chunks of 1000 molecules and download one chunk at a time. The zinc.ids@$i form tells curl to read the IDs from the chunk file and URL-encode them:

split -l 1000 test.list
foreach i (x??)
       curl -o $i.mol2 --data-urlencode zinc.ids@$i -d page.format=mol2 http://zinc.docking.org/results
end
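For readers who use sh or bash rather than csh, the same chunking idea can be sketched as two small functions. This is only an illustration: the function names `split_ids` and `fetch_chunks`, the `chunk_` file prefix, and the output directory are assumptions made for this example, not part of ZINC.

```shell
# split_ids FILE OUTDIR [SIZE]
# Split FILE into pieces of SIZE lines (default 1000) named OUTDIR/chunk_aa, chunk_ab, ...
split_ids() {
    mkdir -p "$2" && split -l "${3:-1000}" "$1" "$2/chunk_"
}

# fetch_chunks OUTDIR
# Post each chunk to the ZINC results URL used above and save one mol2 file per chunk.
fetch_chunks() {
    for chunk in "$1"/chunk_*; do
        # Skip mol2 files left over from an earlier run.
        case "$chunk" in *.mol2) continue ;; esac
        curl -s -o "$chunk.mol2" \
             --data-urlencode "zinc.ids=$(cat "$chunk")" \
             -d page.format=mol2 \
             http://zinc.docking.org/results
    done
}
```

A typical call would be `split_ids test.list chunks` followed by `fetch_chunks chunks`. Keeping the split and the download separate makes it possible to inspect the chunk files before sending any requests.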

Get all the information about an ID (March 2020)

curl http://zinc15.docking.org/substances.txt  -F zinc_id-in=@./zinc.txt -F output_fields="zinc_id mwt num_atoms num_rings"

Output fields are listed at http://zinc15.docking.org/substances/help/

zinc.txt should contain one ID per line, either as a bare number or with the ZINC prefix, for example:

7 
10
ZINC92349234
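ID lists exported from spreadsheets or other tools often carry stray spaces, Windows line endings, or blank lines. Whether the server tolerates these is not stated here, so as a precaution the list can be tidied before upload. The helper name `clean_ids` and the input file `raw.list` are hypothetical.

```shell
# clean_ids [FILE...]
# Strip leading/trailing whitespace (including carriage returns) and drop empty lines.
# Reads standard input when no FILE is given.
clean_ids() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$@"
}
```

For example, `clean_ids raw.list > zinc.txt` writes a file with one ID per line that can be passed to the curl command above.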

Get all the information about an ID

This includes target annotations and related data, using the "800-lb-gorilla" format:

curl -s --data-urlencode zinc.ids="$( cat test.list )" -d page.format=800-lb-gorilla http://zinc.docking.org/results

This format is tab-delimited, with a few columns that are semicolon sub-delimited (and colon sub-sub-delimited). It's meant to be used with awk. I can give you JSON output if you'd like to format it yourself.
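As a rough sketch of how awk can unpack the nested delimiters, the function below prints one line per semicolon-separated entry in a chosen column and turns the colons inside each entry into tabs. The function name `explode_column` is made up for this example, and which column holds the nested data is an assumption: check the actual output to find the right column number.

```shell
# explode_column COL
# Read tab-delimited records on standard input. For each semicolon-separated
# entry in column COL, print column 1 followed by the entry, with the
# colon-separated pieces of the entry converted to tab-separated fields.
explode_column() {
    awk -F '\t' -v col="$1" '
    {
        n = split($col, entries, ";")
        for (i = 1; i <= n; i++) {
            if (entries[i] == "") continue      # skip empty entries
            gsub(/:/, "\t", entries[i])         # colon pieces become tab fields
            print $1 "\t" entries[i]
        }
    }'
}
```

It could be appended to the curl command above as `... | explode_column 3`, where 3 is a placeholder column number. Flattening the nested column this way gives plain tab-delimited rows that are easy to sort, filter, or join.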