Rescoring with DOCK 3.7: Difference between revisions
Revision as of 18:36, 29 November 2017
We often want to get the score for a molecule without doing any docking.
DOCK 3.7 can now do this internally. In DOCK 3.6 this was done by an external program, scoreopt.
Needed files
To rescore you need 3 files:
poses.mol2.gz amsol.txt.gz vdw.txt.gz
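Before starting a rescoring run it can help to confirm that all three inputs are in place. The following is a minimal sketch, not part of DOCK; the helper name `missing_rescore_files` is hypothetical, and only the three file names are taken from the list above.

```python
from pathlib import Path

# File names as listed above; everything else here is illustrative.
REQUIRED_FILES = ("poses.mol2.gz", "amsol.txt.gz", "vdw.txt.gz")


def missing_rescore_files(workdir="."):
    """Return the required rescoring inputs that are absent from workdir."""
    base = Path(workdir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Calling `missing_rescore_files()` in the run directory returns an empty list when everything is present.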
How to generate the needed files
Currently, the format of the mol2 file is very rigid. It must be in the same format as mol2s produced by DOCK 3.7.
The following script will process a mol2 file produced by DOCK for rescoring.
1.run.rescore_prep.csh
Here is the script:
#rm poses.mol2.gz vdw.txt.gz amsol.txt.gz
#
#zcat test.mol2.gz >! poses.mol2
set ligs_mol2 = $1
#if $ligs_mol2:e == 'gz' then
# echo $ligs_mol2 $ligs_mol2:r $ligs_mol2:e
cp $ligs_mol2 poses.mol2
#csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 noamsol
csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 amsol
gzip -f poses.mol2
gzip -f vdw.txt
gzip -f amsol.txt
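The script copies its input to poses.mol2 as-is, and the check for a gzipped input is commented out. As a sketch of how that staging step could handle both cases, assuming a hypothetical helper `stage_poses` that is not part of the scripts on this page:

```python
import gzip
import shutil


def stage_poses(ligs_mol2, dest="poses.mol2"):
    """Copy a mol2 file to dest, decompressing it first if its name ends in .gz."""
    if ligs_mol2.endswith(".gz"):
        # Stream the decompressed bytes so large pose files are not held in memory.
        with gzip.open(ligs_mol2, "rb") as src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out)
    else:
        shutil.copyfile(ligs_mol2, dest)
    return dest
```

The result is an uncompressed poses.mol2 either way, which is what the parameter-generation step expects before the final gzip.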
Here is a script that will generate the amsol and vdw files from a mol2 file:
2.rescore_get_parms_rerun_mod.csh
Here is the script:
set mol2file = $1
set ifamsol = $2
set list = `awk '/ Name:/{print $3}' $mol2file`
rm vdw.txt amsol.txt
touch vdw.txt amsol.txt
# (1) break up the mol2 file into one file per molecule.
#
python /nfs/home/tbalius/zzz.scripts/separate_mol2_more10000.py $mol2file mol
# foreach molecule
foreach mol2 (`ls mol*.mol2`)
set name = $mol2:r
echo $mol2
rm -r $name
mkdir $name
cd $name
cp ../$mol2 .
# (2) map vdw parms onto the atom types
python /nfs/home/tbalius/zzz.scripts/mol2toDOCK37type.py $mol2 vdw.txt
#ls -lt | head
# (3) run amsol
if ($ifamsol == 'amsol') then
csh /nfs/home/tbalius/zzz.github/DOCK/ligand/amsol/calc_solvation.csh $mol2
awk 'BEGIN{count=0}{if(count>0){printf"%s %s %s %s\n", $2, $4, $5, $3}; count=count+1}' output.solv >! output.solv2
else if ($ifamsol == 'noamsol') then
echo "amsol is not calculated."
else
echo "ERROR: second argument must be 'amsol' or 'noamsol'."
exit 1
endif
cd ../
echo "########$name########" >> vdw.txt
cat $name/vdw.txt >> vdw.txt
#paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %-6s %5s %5s %5s %5s\n", $1, $2, $3, $5, $6, $7, $8}' >> amsol.txt
if ($ifamsol == 'amsol') then
echo "########$name########" >> amsol.txt
paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt
else
cat vdw.txt | awk '{if(NF==1){print $0} else if(NF==4){printf ("%2d %3s %5.2f %5.2f %5.2f %5.2f\n", $1, $2, 0.0,0.0,0.0,0.0)}}' >! amsol.txt
endif
#
end
It will generate the amsol file by rerunning amsol on the docked poses.
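Two of the awk one-liners in that script are easy to misread, so here is a rough Python rendering of them: the molecule-name extraction (`awk '/ Name:/{print $3}'`) and the zero-filled amsol.txt written in the noamsol case. This is a sketch only. The function names are hypothetical, and it assumes the first field of each four-field vdw line is an integer atom number.

```python
def molecule_names(mol2_lines):
    """Mimic awk '/ Name:/{print $3}': third whitespace field of matching lines."""
    names = []
    for line in mol2_lines:
        if " Name:" in line:
            fields = line.split()
            if len(fields) >= 3:
                names.append(fields[2])
    return names


def zero_amsol(vdw_lines):
    """Mimic the noamsol branch: pass 1-field separator lines through unchanged
    and replace each 4-field atom line with four zero solvation terms."""
    out = []
    for line in vdw_lines:
        fields = line.split()
        if len(fields) == 1:
            out.append(line.rstrip("\n"))
        elif len(fields) == 4:
            # Assumes fields[0] is an integer atom number, as the awk %2d implies.
            out.append("%2d %3s %5.2f %5.2f %5.2f %5.2f"
                       % (int(fields[0]), fields[1], 0.0, 0.0, 0.0, 0.0))
    return out
```

Lines with any other field count are dropped, which matches the awk condition (`NF==1` or `NF==4` only).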
It is also possible to get the amsol parameters from the db2 files:
/mnt/nfs/work/tbalius/Water_Project_newgrid_mod_heme_charge/0008.rescore_get_parms_from_db_mod.csh
This is a bit messy and slow.
Here is the script:
set mol2file = $1 ## dock3.7 output file
#set ZINCID = $1
#set db2file = $2
set dbpath = $2
#echo $ZINCID
#echo $db2file
set list = `awk '/ Name:/{print $3}' $mol2file`
rm vdw.txt amsol.txt
touch vdw.txt amsol.txt
foreach ZINCID ($list)
echo $ZINCID
# get the number of atoms
awk 'BEGIN{flag=0}{if (flag == 1){print "atomnum="$1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}' $mol2file # print the number of atoms # line after zinc id
set atomnum = `awk 'BEGIN{flag=0}{if (flag == 1){print $1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}' $mol2file` # print the number of atoms # line after zinc id
set db2file = `grep -a20 $ZINCID $mol2file | grep "Ligand Source File:" | awk '{print $5}' | sort | uniq `
echo $db2file
echo $dbpath/$db2file
#zcat $db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; print "atomnum="$4 "::" $0; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}'
#exit
zcat $dbpath/$db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' > ! $ZINCID.parms.txt
#zcat $db2file | awk 'BEGIN{count=0} /M /{flag="False"};{if($2 =="'$ZINCID'" && flag=="False"){flag="True"; count=count+1; print "found '$ZINCID'"};if(($1 == "A") && (flag=="True") ){print count":"$0}}'
# this will only return the first ZINC ID encountered.
echo "## $ZINCID parms" >> vdw.txt
echo "## $ZINCID parms" >> amsol.txt
# make vdw file
grep "^1:" $ZINCID.parms.txt | sed 's/^1://' | awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' >> vdw.txt
#awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' $ZINCID.parms.txt >> vdw.txt
# amsol file
grep "^1:" $ZINCID.parms.txt | sed 's/^1://' | awk '{printf "%2d %3s %6.3f %6.3f %6.3f %6.3f\n", $2, $3, $8, $9, $10, $11}' >> amsol.txt
#awk '{printf "%2d %3s %6.3f %6.3f %6.3f %6.3f\n", $2, $3, $8, $9, $10, $11}' $ZINCID.parms.txt >> amsol.txt
end #ZINCID
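The last two awk commands pick fixed columns out of each db2 atom ("A") line: fields 2 to 5 become the vdw.txt row and fields 2, 3 and 8 to 11 become the amsol.txt row. A minimal Python sketch of that mapping follows. The function name is hypothetical and the field positions are taken only from the awk commands above, not from a db2 format specification.

```python
def db2_atom_rows(atom_line):
    """Split one db2 'A' line into (vdw_row, amsol_row), using the same
    1-based awk field positions as the script: $2-$5 and $2,$3,$8-$11."""
    f = atom_line.split()
    if len(f) < 11 or f[0] != "A":
        raise ValueError("not a db2 atom line with at least 11 fields: %r" % atom_line)
    vdw_row = "%2d %3s %-5s %2d" % (int(f[1]), f[2], f[3], int(f[4]))
    amsol_row = "%2d %3s %6.3f %6.3f %6.3f %6.3f" % (
        int(f[1]), f[2], float(f[7]), float(f[8]), float(f[9]), float(f[10]))
    return vdw_row, amsol_row
```

For example, passing the made-up line `A 1 C1 C.3 5 7 -0.15 -1.23 0.45 -0.78 12.34` yields one row for each file, with the last four numbers going to the amsol row.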
INDOCK Parameters
Here are the parameters in the INDOCK file:
DOCK 3.7 parameter
#####################################################
### NOTE: split_database_index is reserved to specify a list of files
search_type 2
mol2file poses.mol2.gz
ligsolfile amsol.txt.gz
ligvdwfile vdw.txt.gz
#####################################################
# NOTE: split_database_index is reserved to specify a list of files
ligand_atom_file split_database_index
Note that the split_database_index file is not used; it is just a placeholder.
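A quick way to catch a mistyped rescoring setting is to compare the INDOCK file against the four values shown above. The sketch below assumes INDOCK lines are whitespace-separated "keyword value" pairs with `#` starting a comment line. The function names are hypothetical and this is not a DOCK utility.

```python
# Expected values copied from the INDOCK fragment above.
RESCORE_KEYS = {
    "search_type": "2",
    "mol2file": "poses.mol2.gz",
    "ligsolfile": "amsol.txt.gz",
    "ligvdwfile": "vdw.txt.gz",
}


def parse_indock(text):
    """Collect 'keyword value' pairs, skipping blank lines and '#' comment lines."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        params[fields[0]] = fields[1]
    return params


def rescore_mismatches(text):
    """Return {keyword: value found (or None)} for settings that differ from RESCORE_KEYS."""
    params = parse_indock(text)
    return {key: params.get(key)
            for key, expected in RESCORE_KEYS.items()
            if params.get(key) != expected}
```

An empty dictionary from `rescore_mismatches(open("INDOCK").read())` means the four rescoring settings match the ones listed here.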