Rescoring with DOCK 3.7
We often want to get the score for a molecule without doing any docking.
DOCK 3.7 now can do this internally. In DOCK 3.6 this was done in an exteranl program scoreopt, which is no longer used.
needed files
To rescore you need 3 files:
poses.mol2.gz amsol.txt.gz vdw.txt.gz
how to generate need files
Currently, the format of the mol2 file is very rigid. It must be in the same format as mol2s produced by DOCK 3.7. The script convert_anyMol2_to_dockMol2.py should convert mol2 files into the right format.
Here is a tarball with all the scripts you will need for rescoring (this will likely be provided in a future release of the code):
rescoring.tar.gz
Here is what is in the tarball:
drwxr-xr-x tbalius/bks 0 2018-09-26 08:28 rescoring/ -rw-r--r-- tbalius/bks 429 2018-09-26 08:28 rescoring/1.run.rescore_prep.csh -rw-r--r-- tbalius/bks 575 2018-09-26 08:28 rescoring/mol2toDOCK37type.py -rw-r--r-- tbalius/bks 1757 2018-09-26 08:28 rescoring/2.rescore_get_parms_rerun_mod.csh -rw-r--r-- tbalius/bks 1725 2018-09-26 08:24 rescoring/convert_anyMol2_to_dockMol2.py -rw-r--r-- tbalius/bks 32030 2018-09-26 08:21 rescoring/mol2.py -rw-r--r-- tbalius/bks 3074 2018-09-26 08:19 rescoring/separate_mol2_more10000.py
The following script will process a mol2 file produced by dock for rescoring.
1.run.rescore_prep.csh
Here is the script:
#rm poses.mol2.gz vdw.txt.gz amsol.txt.gz # #zcat test.mol2.gz >! poses.mol2 set ligs_mol2 = $1 #if $ligs_mol2:e == 'gz' then # echo $ligs_mol2 $ligs_mol2:r $ligs_mol2:e cp $ligs_mol2 poses.mol2 #csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 noamsol csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 amsol gzip -f poses.mol2 gzip -f vdw.txt gzip -f amsol.txt
Here is a script that will generate the amsol and vdw files from a mol2 file:
2.rescore_get_parms_rerun_mod.csh
Here is the script:
set mol2file = $1 set ifamsol = $2 set list = `awk '/ Name:/{print $3}' $mol2file` rm vdw.txt amsol.txt touch vdw.txt amsol.txt # (1) braekup mol2 file. # python /nfs/home/tbalius/zzz.scripts/separate_mol2_more10000.py $mol2file mol # foreach molecule foreach mol2 (`ls mol*.mol2`) set name = $mol2:r echo $mol2 rm -r $name mkdir $name cd $name cp ../$mol2 . # (2) mape vdw parms on to the atomtypes python /nfs/home/tbalius/zzz.scripts/mol2toDOCK37type.py $mol2 vdw.txt #ls -lt | head # (3) run amsol if ($ifamsol == 'amsol') then csh /nfs/home/tbalius/zzz.github/DOCK/ligand/amsol/calc_solvation.csh $mol2 awk 'BEGIN{count=0}{if(count>0){printf"%s %s %s %s\n", $2, $4, $5, $3}; count=count+1}' output.solv >! output.solv2 else if ($ifamsol == 'noamsol') then echo "amsol is not calculated." else echo "ERROR. . . " exit endif cd ../ echo "########$name########" >> vdw.txt cat $name/vdw.txt >> vdw.txt #paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %-6s %5s %5s %5s %5s\n", $1, $2, $3, $5, $6, $7, $8}' >> amsol.txt if ($ifamsol == 'amsol') then echo "########$name########" >> amsol.txt paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt else cat vdw.txt | awk '{if(NF==1){print $0} else if(NF==4){printf ("%2d %3s %5.2f %5.2f %5.2f %5.2f\n", $1, $2, 0.0,0.0,0.0,0.0)}}' >! amsol.txt endif # end
It will generate the amsol file by reruning amsol using the docked poses.
You could download the amsol file for the promoter of interest from zinc15. for example:
curl http://files.docking.org/protomers/08/06/14/455080614.solv > output.solv2
process it for dock:
echo "########$name########" >> amsol.txt paste vdw.txt output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt
It is also possible to get the amsol parameters from the db2 files:
/mnt/nfs/work/tbalius/Water_Project_newgrid_mod_heme_charge/0008.rescore_get_parms_from_db_mod.csh
This is a bit messy and slow.
Here is the script:
set mol2file = $1 ## dock3.7 output file #set ZINCID = $1 #set db2file = $2 set dbpath = $2 #echo $ZINCID #echo $db2file set list = `awk '/ Name:/{print $3}' $mol2file` rm vdw.txt amsol.txt touch vdw.txt amsol.txt foreach ZINCID ($list) echo $ZINCID # get the number of atoms awk 'BEGIN{flag=0}{if (flag == 1){print "atomnum="$1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}' $mol2file # print the number of atoms # line after zinc id set atomnum = `awk 'BEGIN{flag=0}{if (flag == 1){print $1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}' $mol2file` # print the number of atoms # line after zinc id set db2file = `grep -a20 $ZINCID $mol2file | grep "Ligand Source File:" | awk '{print $5}' | sort | uniq ` echo $db2file echo $dbpath/$db2file #zcat $db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; print "atomnum="$4 "::" $0; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' #exit zcat $dbpath/$db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' > ! $ZINCID.parms.txt #zcat $db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && flag=="False"){flag="True"; count=count+1; print "found '$ZINCID'"};if(($1 == "A") && (flag=="True") ){print count":"$0}}' # this will only return the first ZINC ID incountered. echo "## $ZINCID parms" >> vdw.txt echo "## $ZINCID parms" >> amsol.txt # make vdw file grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' >> vdw.txt #awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' $ZINCID.parms.txt >> vdw.txt # amsol file grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s %6.3f %6.3f %6.3f %6.3f\n", $2, $3, $8, $9, $10, $11}' >> amsol.txt #awk '{printf "%2d %3s %6.3f %6.3f %6.3f %6.3f\n", $2, $3, $8, $9, $10, $11}' $ZINCID.parms.txt >> amsol.txt end #ZINCID
INDOCK Parameters
Here is the parameters in the INDOCK file:
DOCK 3.7 parameter ##################################################### ### NOTE: split_database_index is reserved to specify a list of files search_type 2 mol2file poses.mol2.gz ligsolfile amsol.txt.gz ligvdwfile vdw.txt.gz ##################################################### # NOTE: split_database_index is reserved to specify a list of files ligand_atom_file split_database_index
note that the split_database_index file is not used it is just a place holder.