Rescoring with DOCK 3.7

From DISI
Revision as of 11:27, 29 November 2017 by TBalius (Talk | contribs)

We often want to get the score for a molecule without doing any docking.

DOCK 3.7 can now do this internally. In DOCK 3.6 this was done with an external program, scoreopt.

Needed files

To rescore you need 3 files:

   poses.mol2.gz
   amsol.txt.gz
   vdw.txt.gz
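
Before starting a rescoring run it can help to confirm that all three inputs are present. The following is a minimal Python sketch, assuming the files sit together in one working directory; the function name `missing_rescore_inputs` is hypothetical and not part of DOCK.

```python
import os

# The three inputs listed above, named as in the INDOCK fragment on this page.
REQUIRED_FILES = ("poses.mol2.gz", "amsol.txt.gz", "vdw.txt.gz")


def missing_rescore_inputs(workdir="."):
    """Return the required rescoring inputs that are absent from workdir."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(workdir, name))]
```

Calling `missing_rescore_inputs(".")` returns an empty list when everything is in place, otherwise the names that still have to be generated.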

How to generate the needed files

  1.run.rescore_prep.csh

Here is the script:

#rm poses.mol2.gz vdw.txt.gz amsol.txt.gz
#
#zcat test.mol2.gz >! poses.mol2

set ligs_mol2 = $1


#if $ligs_mol2:e == 'gz' then
#   echo $ligs_mol2 $ligs_mol2:r $ligs_mol2:e 

# the input is copied as-is, so it must be an uncompressed mol2 file
cp $ligs_mol2 poses.mol2

#csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 noamsol
csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 amsol
gzip -f poses.mol2
gzip -f vdw.txt
gzip -f amsol.txt
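
The prep script copies its input unchanged, so a gzipped input would end up still compressed under the name poses.mol2; the commented-out `:e == 'gz'` test suggests an extension check was planned. Below is a minimal Python sketch of that staging step, assuming the compression state can be judged from the file extension alone. The function name `stage_poses` is hypothetical.

```python
import gzip
import shutil


def stage_poses(ligs_mol2, dest="poses.mol2"):
    """Copy ligs_mol2 to dest, decompressing first if the name ends in .gz."""
    if ligs_mol2.endswith(".gz"):
        # Stream the decompressed bytes so large pose files are not held in memory.
        with gzip.open(ligs_mol2, "rb") as src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out)
    else:
        shutil.copyfile(ligs_mol2, dest)
    return dest
```

With this, either `stage_poses("test.mol2.gz")` or `stage_poses("test.mol2")` leaves an uncompressed poses.mol2 for the parameter script to read.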

Here is a script that will generate the amsol and vdw files from a mol2 file:

  2.rescore_get_parms_rerun_mod.csh

Here is the script:


set mol2file = $1 
set ifamsol  = $2

set list = `awk '/  Name:/{print $3}' $mol2file`
rm -f vdw.txt amsol.txt
touch vdw.txt amsol.txt

# (1) break up the mol2 file into one file per molecule.
# 
  python /nfs/home/tbalius/zzz.scripts/separate_mol2_more10000.py $mol2file mol 
# foreach molecule
  foreach mol2 (`ls mol*.mol2`)
    set name = $mol2:r
    echo $mol2
    rm -rf $name 
    mkdir $name
    cd $name
    cp ../$mol2 .

# (2) map vdw parameters onto the atom types
    python /nfs/home/tbalius/zzz.scripts/mol2toDOCK37type.py $mol2 vdw.txt
    #ls -lt | head

# (3) run amsol
    if ($ifamsol == 'amsol') then 
       csh /nfs/home/tbalius/zzz.github/DOCK/ligand/amsol/calc_solvation.csh $mol2
       awk 'BEGIN{count=0}{if(count>0){printf"%s %s %s %s\n", $2, $4, $5, $3}; count=count+1}' output.solv >! output.solv2
    else if ($ifamsol == 'noamsol') then
       echo "amsol is not calculated."
    else 
       echo "ERROR: second argument must be amsol or noamsol."
       exit 1
    endif  
    cd ../
    echo "########$name########" >> vdw.txt
    cat $name/vdw.txt >> vdw.txt 

    #paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %-6s %5s %5s %5s %5s\n", $1, $2, $3, $5, $6, $7, $8}' >> amsol.txt
    if ($ifamsol == 'amsol') then
       echo "########$name########" >> amsol.txt
       paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt
    else
       cat vdw.txt | awk '{if(NF==1){print $0} else if(NF==4){printf ("%2d %3s %5.2f %5.2f %5.2f %5.2f\n", $1, $2, 0.0,0.0,0.0,0.0)}}' >! amsol.txt 
    endif 
#
  end

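With the amsol option, the script generates the amsol file by rerunning AMSOL on the docked poses. With noamsol, it writes an amsol file in which all solvation terms are zero.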
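
The noamsol branch of the script can be sketched in Python as follows. This mirrors the awk one-liner that rebuilds amsol.txt from vdw.txt: single-field separator lines are copied through, and four-field atom lines keep their first two fields while the four solvation terms are set to zero. The function name `zero_amsol_lines` is hypothetical, and the sample atom lines in the usage note are illustrative only.

```python
def zero_amsol_lines(vdw_lines):
    """Build amsol.txt lines with zero solvation terms from vdw.txt lines."""
    out = []
    for line in vdw_lines:
        fields = line.split()
        if len(fields) == 1:
            # Separator line such as ########mol1########: copy it unchanged.
            out.append(line.rstrip("\n"))
        elif len(fields) == 4:
            # Atom line: keep the first two fields, zero the four terms.
            out.append("%2d %3s %5.2f %5.2f %5.2f %5.2f"
                       % (int(fields[0]), fields[1], 0.0, 0.0, 0.0, 0.0))
        # Lines with any other field count are dropped, as in the awk command.
    return out
```

For example, `zero_amsol_lines(["########mol1########\n", " 1  C1 C.3    5\n"])` yields the separator line followed by one atom line whose four trailing columns are all 0.00.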

It is also possible to get the amsol parameters from the db2 files:

  /mnt/nfs/work/tbalius/Water_Project_newgrid_mod_heme_charge/0008.rescore_get_parms_from_db_mod.csh

This is a bit messy and slow.

Here is the script:

set mol2file = $1 ## dock3.7 output file
#set ZINCID = $1
#set db2file = $2
set dbpath = $2

#echo $ZINCID
#echo $db2file

set list = `awk '/  Name:/{print $3}' $mol2file`
rm -f vdw.txt amsol.txt
touch vdw.txt amsol.txt

foreach ZINCID ($list)

  echo $ZINCID
  # get the number of atoms 
  awk 'BEGIN{flag=0}{if (flag == 1){print "atomnum="$1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}'  $mol2file # print the number of atoms # line after zinc id
  set atomnum = `awk 'BEGIN{flag=0}{if (flag == 1){print $1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}'  $mol2file` # print the number of atoms # line after zinc id

  set db2file = `grep -a20 $ZINCID  $mol2file | grep "Ligand Source File:" | awk '{print $5}' | sort | uniq `
  echo $db2file
  echo $dbpath/$db2file
  #zcat $db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; print "atomnum="$4 "::" $0; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' 
  #exit
  zcat $dbpath/$db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' > ! $ZINCID.parms.txt
  #zcat $db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && flag=="False"){flag="True"; count=count+1; print "found '$ZINCID'"};if(($1 == "A") && (flag=="True") ){print count":"$0}}' 
   # this will only return the first ZINC ID encountered.

  echo "## $ZINCID parms" >> vdw.txt
  echo "## $ZINCID parms" >> amsol.txt

  # make vdw file
  grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' >> vdw.txt
  #awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' $ZINCID.parms.txt >> vdw.txt
  # amsol file
  grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s   %6.3f     %6.3f     %6.3f    %6.3f\n", $2, $3, $8, $9, $10, $11}' >> amsol.txt
  #awk '{printf "%2d %3s   %6.3f     %6.3f     %6.3f    %6.3f\n", $2, $3, $8, $9, $10, $11}' $ZINCID.parms.txt >> amsol.txt
end #ZINCID
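
The last two awk commands in the loop turn each db2 atom ("A") record into one vdw.txt line and one amsol.txt line. A minimal Python sketch of that formatting step follows. The field positions and printf formats are taken directly from the awk commands above; what each db2 column means is not stated on this page, so the sketch refers to fields by position only. The function name `format_db2_atom` is hypothetical.

```python
def format_db2_atom(line):
    """Return (vdw_line, amsol_line) for one db2 atom ("A") record."""
    f = line.split()
    if len(f) < 11 or f[0] != "A":
        raise ValueError("not a complete atom record: %r" % line)
    # awk fields $2..$5 go to vdw.txt (Python indices 1..4).
    vdw = "%2d %3s %-5s %2d" % (int(f[1]), f[2], f[3], int(f[4]))
    # awk fields $2, $3 and $8..$11 go to amsol.txt (indices 1, 2, 7..10).
    amsol = "%2d %3s   %6.3f     %6.3f     %6.3f    %6.3f" % (
        int(f[1]), f[2], float(f[7]), float(f[8]), float(f[9]), float(f[10]))
    return vdw, amsol
```

The shell script keeps only records prefixed with "1:", that is, atom lines from the first matching entry in the db2 file; the same filter would have to be applied before calling this function.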

INDOCK Parameters

Here are the parameters in the INDOCK file:

DOCK 3.7 parameter
#####################################################
### NOTE: split_database_index is reserved to specify a list of files
search_type                   2
mol2file                      poses.mol2.gz
ligsolfile                    amsol.txt.gz
ligvdwfile                    vdw.txt.gz
#####################################################
# NOTE: split_database_index is reserved to specify a list of files
ligand_atom_file               split_database_index

Note that the split_database_index file is not used; it is just a placeholder.
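
A quick way to confirm that an INDOCK file carries the rescoring settings shown above is sketched below in Python. This is a simplified reader that takes the first two whitespace-separated fields of each non-comment line as key and value; it is an assumption made for illustration and not the parser DOCK itself uses. The names `read_indock_params` and `rescore_param_problems` are hypothetical.

```python
# Rescoring settings as they appear in the INDOCK fragment on this page.
EXPECTED = {
    "search_type": "2",
    "mol2file": "poses.mol2.gz",
    "ligsolfile": "amsol.txt.gz",
    "ligvdwfile": "vdw.txt.gz",
}


def read_indock_params(text):
    """Parse 'key value' lines of INDOCK text, skipping # comment lines."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        params[fields[0]] = fields[1]
    return params


def rescore_param_problems(text):
    """List the expected rescoring parameters that are missing or different."""
    params = read_indock_params(text)
    return ["%s: expected %s, found %s" % (key, value, params.get(key))
            for key, value in EXPECTED.items()
            if params.get(key) != value]
```

An empty list from `rescore_param_problems(open("INDOCK").read())` means the four rescoring parameters match the values given on this page.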