Rescoring with DOCK 3.7
We often want to get the score for a molecule without doing any docking.
DOCK 3.7 now can do this internally. In DOCK 3.6 this was done in an exteranl program scoreopt, which is no longer used.
needed files
To rescore you need 3 files:
poses.mol2.gz amsol.txt.gz vdw.txt.gz
how to generate need files
Currently, the format of the mol2 file is very rigid. It must be in the same format as mol2s produced by DOCK 3.7. The script convert_anyMol2_to_dockMol2.py should convert mol2 files into the right format.
Here is a tarball with all the scripts you will need for rescoring (this will likely be provided in a future release of the code):
rescoring.tar.gz
Here is what is in the tarball:
drwxr-xr-x tbalius/bks 0 2018-09-26 08:28 rescoring/ -rw-r--r-- tbalius/bks 429 2018-09-26 08:28 rescoring/1.run.rescore_prep.csh -rw-r--r-- tbalius/bks 575 2018-09-26 08:28 rescoring/mol2toDOCK37type.py -rw-r--r-- tbalius/bks 1757 2018-09-26 08:28 rescoring/2.rescore_get_parms_rerun_mod.csh -rw-r--r-- tbalius/bks 1725 2018-09-26 08:24 rescoring/convert_anyMol2_to_dockMol2.py -rw-r--r-- tbalius/bks 32030 2018-09-26 08:21 rescoring/mol2.py -rw-r--r-- tbalius/bks 3074 2018-09-26 08:19 rescoring/separate_mol2_more10000.py
The following script will process a mol2 file produced by dock for rescoring.
1.run.rescore_prep.csh
Here is the script:
#rm poses.mol2.gz vdw.txt.gz amsol.txt.gz # #zcat test.mol2.gz >! poses.mol2 set ligs_mol2 = $1 #if $ligs_mol2:e == 'gz' then # echo $ligs_mol2 $ligs_mol2:r $ligs_mol2:e cp $ligs_mol2 poses.mol2 #csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 noamsol csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 amsol gzip -f poses.mol2 gzip -f vdw.txt gzip -f amsol.txt
Here is a script that will generate the amsol and vdw files from a mol2 file:
2.rescore_get_parms_rerun_mod.csh
Here is the script:
set mol2file = $1
set ifamsol = $2
set list = `awk '/ Name:/{print $3}' $mol2file`
rm vdw.txt amsol.txt
touch vdw.txt amsol.txt
# (1) braekup mol2 file.
#
python /nfs/home/tbalius/zzz.scripts/separate_mol2_more10000.py $mol2file mol
# foreach molecule
foreach mol2 (`ls mol*.mol2`)
set name = $mol2:r
echo $mol2
rm -r $name
mkdir $name
cd $name
cp ../$mol2 .
# (2) mape vdw parms on to the atomtypes
python /nfs/home/tbalius/zzz.scripts/mol2toDOCK37type.py $mol2 vdw.txt
#ls -lt | head
# (3) run amsol
if ($ifamsol == 'amsol') then
csh /nfs/home/tbalius/zzz.github/DOCK/ligand/amsol/calc_solvation.csh $mol2
awk 'BEGIN{count=0}{if(count>0){printf"%s %s %s %s\n", $2, $4, $5, $3}; count=count+1}' output.solv >! output.solv2
else if ($ifamsol == 'noamsol') then
echo "amsol is not calculated."
else
echo "ERROR. . . "
exit
endif
cd ../
echo "########$name########" >> vdw.txt
cat $name/vdw.txt >> vdw.txt
#paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %-6s %5s %5s %5s %5s\n", $1, $2, $3, $5, $6, $7, $8}' >> amsol.txt
if ($ifamsol == 'amsol') then
echo "########$name########" >> amsol.txt
paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt
else
cat vdw.txt | awk '{if(NF==1){print $0} else if(NF==4){printf ("%2d %3s %5.2f %5.2f %5.2f %5.2f\n", $1, $2, 0.0,0.0,0.0,0.0)}}' >! amsol.txt
endif
#
end
It will generate the amsol file by reruning amsol using the docked poses.
You could download the amsol file for the promoter of interest from zinc15. for example:
curl http://files.docking.org/protomers/08/06/14/455080614.solv > output.solv2
process it for dock:
echo "########$name########" >> amsol.txt
paste vdw.txt output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt
It is also possible to get the amsol parameters from the db2 files:
/mnt/nfs/work/tbalius/Water_Project_newgrid_mod_heme_charge/0008.rescore_get_parms_from_db_mod.csh
This is a bit messy and slow.
Here is the script:
set mol2file = $1 ## dock3.7 output file
#set ZINCID = $1
#set db2file = $2
set dbpath = $2
#echo $ZINCID
#echo $db2file
set list = `awk '/ Name:/{print $3}' $mol2file`
rm vdw.txt amsol.txt
touch vdw.txt amsol.txt
foreach ZINCID ($list)
echo $ZINCID
# get the number of atoms
awk 'BEGIN{flag=0}{if (flag == 1){print "atomnum="$1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}' $mol2file # print the number of atoms # line after zinc id
set atomnum = `awk 'BEGIN{flag=0}{if (flag == 1){print $1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}' $mol2file` # print the number of atoms # line after zinc id
set db2file = `grep -a20 $ZINCID $mol2file | grep "Ligand Source File:" | awk '{print $5}' | sort | uniq `
echo $db2file
echo $dbpath/$db2file
#zcat $db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; print "atomnum="$4 "::" $0; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}'
#exit
zcat $dbpath/$db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' > ! $ZINCID.parms.txt
#zcat $db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && flag=="False"){flag="True"; count=count+1; print "found '$ZINCID'"};if(($1 == "A") && (flag=="True") ){print count":"$0}}'
# this will only return the first ZINC ID incountered.
echo "## $ZINCID parms" >> vdw.txt
echo "## $ZINCID parms" >> amsol.txt
# make vdw file
grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' >> vdw.txt
#awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' $ZINCID.parms.txt >> vdw.txt
# amsol file
grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s %6.3f %6.3f %6.3f %6.3f\n", $2, $3, $8, $9, $10, $11}' >> amsol.txt
#awk '{printf "%2d %3s %6.3f %6.3f %6.3f %6.3f\n", $2, $3, $8, $9, $10, $11}' $ZINCID.parms.txt >> amsol.txt
end #ZINCID
INDOCK Parameters
Here is the parameters in the INDOCK file:
DOCK 3.7 parameter ##################################################### ### NOTE: split_database_index is reserved to specify a list of files search_type 2 mol2file poses.mol2.gz ligsolfile amsol.txt.gz ligvdwfile vdw.txt.gz ##################################################### # NOTE: split_database_index is reserved to specify a list of files ligand_atom_file split_database_index
note that the split_database_index file is not used it is just a place holder.