Analyzing DOCK Results
The dock37tools in $d37 contain various analysis programs, which you can run once your jobs are done.
If you ran a prospective run, it can be advantageous to run extract_all.py and ignore bad poses with scores greater than -20.0 (adjust the cutoff for your system), like this:
$d37/extract_all.py -s -20.0
This may take a while, but it will pull all your results into a single file. If you want to calculate enrichment:
$d37/enrich.py -l ligand-file -d decoy-file
Here, ligand-file and decoy-file are single-column files with one ligand or decoy ID per line. Plotting is also possible:
$d37/plots.py -i . -l label --ligand-file=ligand-file -d decoy-file
Common usage is to plot several different runs on a single plot like so:
$d37/plots.py -i run1 -l label1 -i run2 -l label2 --ligand-file=ligand-file -d decoy-file
If you want to compare the scores from two runs, try:
$d37/two_run_plot.py run1 run2
The plotting scripts must be run on a machine with the required libraries installed, such as sgehead.
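To make concrete what an enrichment calculation over the ligand and decoy ID files involves, here is a minimal Python sketch that computes the area under the ROC curve from a ranked list of IDs. It is an illustration only and not necessarily how enrich.py computes its numbers. The function names read_ids and roc_auc are hypothetical, and the sketch assumes that every ligand and decoy appears exactly once in the ranked list, ordered best score first.

```python
def read_ids(path):
    """Read a single-column ID file (one ID per line), skipping blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def roc_auc(ranked_ids, ligands, decoys):
    """Area under the ROC curve for IDs ranked from best to worst score.

    Assumption: each ligand and decoy occurs once in ranked_ids.
    IDs that are in neither set are ignored.
    """
    if not ligands or not decoys:
        raise ValueError("need at least one ligand and one decoy")
    found_ligands = 0
    area = 0
    for mol_id in ranked_ids:
        if mol_id in ligands:
            found_ligands += 1
        elif mol_id in decoys:
            # Each decoy is one step along the x axis; the column height
            # is the number of ligands already recovered at that point.
            area += found_ligands
    return area / (len(ligands) * len(decoys))
```

A value of 1.0 means every ligand ranks above every decoy, and a value near 0.5 corresponds to random ranking.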
Another common use is to look at top poses in the ViewDock module of UCSF Chimera or with PyMOL. You can make a mol2 output file that these programs can read by running getposes.py from $d37.
By default, this script writes a poses.mol2 file with the top 500 poses from the entire run, with a single pose per molecule ID. There are many options, which can be listed with the '-h' flag. A more complex example is:
$d37/getposes.py -z -l 1000 -x 2 -f ligands.txt -o ligands.1000.mol2
In order: the '-z' flag connects to ZINC for vendor information, '-l 1000' gets only the first 1000 ligands in the file, '-x 2' gets the top 2 poses, '-f ligands.txt' designates the ligand file to use, and '-o ligands.1000.mol2' designates the output filename.
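The selection described above (a limited number of molecules, with a fixed number of best poses per molecule ID) can be sketched in Python as follows. This is an illustration of the idea only, not the implementation of getposes.py. The function name top_poses and its parameters are hypothetical, and the sketch assumes that lower scores are better, consistent with the -20.0 cutoff used earlier.

```python
from collections import defaultdict


def top_poses(poses, max_molecules=500, poses_per_molecule=1):
    """Select the best poses from (mol_id, score) pairs.

    Molecules are ranked by their best (lowest) score; for each of the
    first max_molecules molecules, the best poses_per_molecule poses
    are kept.
    """
    scores_by_id = defaultdict(list)
    for mol_id, score in poses:
        scores_by_id[mol_id].append(score)

    # Rank molecules by the best score among their poses.
    ranked = sorted(scores_by_id.items(), key=lambda item: min(item[1]))

    selected = []
    for mol_id, scores in ranked[:max_molecules]:
        for score in sorted(scores)[:poses_per_molecule]:
            selected.append((mol_id, score))
    return selected
```

For example, calling top_poses(poses, max_molecules=1000, poses_per_molecule=2) mirrors the intent of the '-l 1000 -x 2' options in this sketch's own terms.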
If you're curious about the OUTDOCK file format, here is the header:
mol# id_num flexiblecode matched nscored time hac setnum matnum rank cloud elect + vdW + psol + asol + inter + rec_e + rec_d + r_hyd = Total
mol# is just the number of the molecule, read in from the docking db2 files.
id_num is the ZINC code or other identifier for the molecule.
flexiblecode is the combination of flexible receptor parts this molecule was docked to.
matched is the number of matched orientations actually found by the matching algorithm.
nscored is the number of atoms that were scored.
time is the time in seconds for this molecule.
hac is the heavy atom count for this ligand.
setnum is the conformation number this ligand represents.
matnum is the match number this ligand represents.
rank is the rank of the score for this ligand within the ligand (if you want the top 10 poses, this number will increase from 1 to 10).
cloud is the cloud number, for an experimental matching scheme still under development.
elect is the electrostatics score.
vdW is the van der Waals score (both attractive and repulsive together).
psol is the ligand polar desolvation.
asol is the ligand apolar desolvation.
inter is the internal energy.
rec_e is the receptor energy (used in flexible docking).
rec_d is the receptor desolvation, not yet supported.
r_hyd is the receptor hydrophobic effect, not yet supported.
Total is the total score for this ligand pose.
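As a rough illustration of how the header above maps onto a result line, the following Python sketch turns one whitespace-separated row into a dictionary keyed by column name. It assumes that data rows contain one value per column and do not repeat the '+' and '=' decorations from the header; check this against your own OUTDOCK files. The names parse_row and term_sum are hypothetical.

```python
HEADER = (
    "mol# id_num flexiblecode matched nscored time hac setnum matnum "
    "rank cloud elect + vdW + psol + asol + inter + rec_e + rec_d + "
    "r_hyd = Total"
)

# Drop the "+" and "=" decorations so that each column has one name.
COLUMNS = [name for name in HEADER.split() if name not in ("+", "=")]

# The score terms that the header shows adding up to Total.
ENERGY_TERMS = COLUMNS[COLUMNS.index("elect"):COLUMNS.index("Total")]


def parse_row(line):
    """Split one result line into a {column name: string value} dict.

    Assumption: the row has exactly one whitespace-separated value
    per header column.
    """
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(
            "expected %d fields, got %d" % (len(COLUMNS), len(values))
        )
    return dict(zip(COLUMNS, values))


def term_sum(row):
    """Sum the individual score terms of a parsed row."""
    return sum(float(row[name]) for name in ENERGY_TERMS)
```

With a made-up row such as "1 MOL0001 1 100 50 0.5 20 3 7 1 0 -10.0 -25.0 5.0 -2.0 0.0 0.0 0.0 0.0 -32.0", parse_row gives access to fields by name, for example float(row["Total"]), which can then be compared against a score cutoff such as -20.0.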
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/. This page is adapted from "DOCK3.7 Documentation" by Ryan G. Coleman. Based on a work at https://sites.google.com/site/dock37wiki/.