Difference between revisions of "Rescoring with DOCK 3.7"

From DISI
Jump to: navigation, search
(Created page with " We often want to get the score for a molecule without doing any docking. DOCK3.7 now can do this internally. In DOCK 3.6 this was done in an exteranl program scoreopt. =...")
 
(how to generate need files)
Line 11: Line 11:
  
 
== how to generate need files ==
 
== how to generate need files ==
 
+
   1.run.rescore_prep.csh
   run.rescore_prep.csh
+
Here is the script:
 +
 
 +
#rm poses.mol2.gz vdw.txt.gz amsol.txt.gz
 +
#
 +
#zcat test.mol2.gz >! poses.mol2
 +
 +
set ligs_mol2 = $1
 +
 +
 +
#if $ligs_mol2:e == 'gz' then
 +
#  echo $ligs_mol2 $ligs_mol2:r $ligs_mol2:e
 +
 +
cp $ligs_mol2 poses.mol2
 +
 +
#csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 noamsol
 +
csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 amsol
 +
gzip -f poses.mol2
 +
gzip -f vdw.txt
 +
gzip -f amsol.txt
 +
  
 
Here is a script that will generate the amsol and vdw files from a mol2 file:  
 
Here is a script that will generate the amsol and vdw files from a mol2 file:  
  
   0008.rescore_get_parms_rerun_mod.csh
+
   2.rescore_get_parms_rerun_mod.csh
 +
 
 +
Here is the script:
 +
 
 +
 +
set mol2file = $1
 +
set ifamsol  = $2
 +
 +
set list = `awk '/  Name:/{print $3}' $mol2file`
 +
rm vdw.txt amsol.txt
 +
touch vdw.txt amsol.txt
 +
 +
# (1) braekup mol2 file. 
 +
#
 +
  python /nfs/home/tbalius/zzz.scripts/separate_mol2_more10000.py $mol2file mol
 +
# foreach molecule
 +
  foreach mol2 (`ls mol*.mol2`)
 +
    set name = $mol2:r
 +
    echo $mol2
 +
    rm -r $name
 +
    mkdir $name
 +
    cd $name
 +
    cp ../$mol2 .
 +
 +
# (2) mape vdw parms on to the atomtypes
 +
    python /nfs/home/tbalius/zzz.scripts/mol2toDOCK37type.py $mol2 vdw.txt
 +
    #ls -lt | head
 +
 +
# (3) run amsol
 +
    if ($ifamsol == 'amsol') then
 +
        csh /nfs/home/tbalius/zzz.github/DOCK/ligand/amsol/calc_solvation.csh $mol2
 +
        awk 'BEGIN{count=0}{if(count>0){printf"%s %s %s %s\n", $2, $4, $5, $3}; count=count+1}' output.solv >! output.solv2
 +
    else if ($ifamsol == 'noamsol') then
 +
        echo "amsol is not calculated."
 +
    else
 +
        echo "ERROR. . . "
 +
        exit
 +
    endif 
 +
    cd ../
 +
    echo "########$name########" >> vdw.txt
 +
    cat $name/vdw.txt >> vdw.txt
 +
 +
    #paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %-6s %5s %5s %5s %5s\n", $1, $2, $3, $5, $6, $7, $8}' >> amsol.txt
 +
    if ($ifamsol == 'amsol') then
 +
        echo "########$name########" >> amsol.txt
 +
        paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt
 +
    else
 +
        cat vdw.txt | awk '{if(NF==1){print $0} else if(NF==4){printf ("%2d %3s %5.2f %5.2f %5.2f %5.2f\n", $1, $2, 0.0,0.0,0.0,0.0)}}' >! amsol.txt
 +
    endif
 +
#
 +
  end
  
 
It will generate the amsol file by reruning amsol using the docked poses.  
 
It will generate the amsol file by reruning amsol using the docked poses.  
  
 
It is also possible to get the amsol parameters from the db2 files:
 
It is also possible to get the amsol parameters from the db2 files:
 +
 
   /mnt/nfs/work/tbalius/Water_Project_newgrid_mod_heme_charge/0008.rescore_get_parms_from_db_mod.csh
 
   /mnt/nfs/work/tbalius/Water_Project_newgrid_mod_heme_charge/0008.rescore_get_parms_from_db_mod.csh
 +
 +
This is a bit messy and slow. 
 +
 +
Here is the script:
 +
 +
set mol2file = $1 ## dock3.7 output file
 +
#set ZINCID = $1
 +
#set db2file = $2
 +
set dbpath = $2
 +
 +
#echo $ZINCID
 +
#echo $db2file
 +
 +
set list = `awk '/  Name:/{print $3}' $mol2file`
 +
rm vdw.txt amsol.txt
 +
touch vdw.txt amsol.txt
 +
 +
foreach ZINCID ($list)
 +
 +
  echo $ZINCID
 +
  # get the number of atoms
 +
  awk 'BEGIN{flag=0}{if (flag == 1){print "atomnum="$1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}'  $mol2file # print the number of atoms # line after zinc id
 +
  set atomnum = `awk 'BEGIN{flag=0}{if (flag == 1){print $1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}'  $mol2file` # print the number of atoms # line after zinc id
 +
 +
  set db2file = `grep -a20 $ZINCID  $mol2file | grep "Ligand Source File:" | awk '{print $5}' | sort | uniq `
 +
  echo $db2file
 +
  echo $dbpath/$db2file
 +
  #zcat $db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; print "atomnum="$4 "::" $0; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}'
 +
  #exit
 +
  zcat $dbpath/$db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' > ! $ZINCID.parms.txt
 +
  #zcat $db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && flag=="False"){flag="True"; count=count+1; print "found '$ZINCID'"};if(($1 == "A") && (flag=="True") ){print count":"$0}}'
 +
    # this will only return the first ZINC ID incountered.
 +
 +
  echo "## $ZINCID parms" >> vdw.txt
 +
  echo "## $ZINCID parms" >> amsol.txt
 +
 +
  # make vdw file
 +
  grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' >> vdw.txt
 +
  #awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' $ZINCID.parms.txt >> vdw.txt
 +
  # amsol file
 +
  grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s  %6.3f    %6.3f    %6.3f    %6.3f\n", $2, $3, $8, $9, $10, $11}' >> amsol.txt
 +
  #awk '{printf "%2d %3s  %6.3f    %6.3f    %6.3f    %6.3f\n", $2, $3, $8, $9, $10, $11}' $ZINCID.parms.txt >> amsol.txt
 +
end #ZINCID
  
 
== INDOCK Parameters ==
 
== INDOCK Parameters ==

Revision as of 11:27, 29 November 2017

We often want to get the score for a molecule without doing any docking.

DOCK3.7 now can do this internally. In DOCK 3.6 this was done in an exteranl program scoreopt.

need files

To rescore you need 3 files:

   poses.mol2.gz
   amsol.txt.gz
   vdw.txt.gz

how to generate need files

  1.run.rescore_prep.csh

Here is the script:

#rm poses.mol2.gz vdw.txt.gz amsol.txt.gz
#
#zcat test.mol2.gz >! poses.mol2

set ligs_mol2 = $1


#if $ligs_mol2:e == 'gz' then
#   echo $ligs_mol2 $ligs_mol2:r $ligs_mol2:e 

cp $ligs_mol2 poses.mol2

#csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 noamsol
csh 2.rescore_get_parms_rerun_mod.csh poses.mol2 amsol
gzip -f poses.mol2
gzip -f vdw.txt
gzip -f amsol.txt

Here is a script that will generate the amsol and vdw files from a mol2 file:

  2.rescore_get_parms_rerun_mod.csh

Here is the script:


set mol2file = $1 
set ifamsol  = $2

set list = `awk '/  Name:/{print $3}' $mol2file`
rm vdw.txt amsol.txt
touch vdw.txt amsol.txt

# (1) braekup mol2 file.  
# 
  python /nfs/home/tbalius/zzz.scripts/separate_mol2_more10000.py $mol2file mol 
# foreach molecule
  foreach mol2 (`ls mol*.mol2`)
    set name = $mol2:r
    echo $mol2
    rm -r $name 
    mkdir $name
    cd $name
    cp ../$mol2 .

# (2) mape vdw parms on to the atomtypes
    python /nfs/home/tbalius/zzz.scripts/mol2toDOCK37type.py $mol2 vdw.txt
    #ls -lt | head

# (3) run amsol
    if ($ifamsol == 'amsol') then 
       csh /nfs/home/tbalius/zzz.github/DOCK/ligand/amsol/calc_solvation.csh $mol2
       awk 'BEGIN{count=0}{if(count>0){printf"%s %s %s %s\n", $2, $4, $5, $3}; count=count+1}' output.solv >! output.solv2
    else if ($ifamsol == 'noamsol') then
       echo "amsol is not calculated."
    else 
       echo "ERROR. . . "
       exit
    endif  
    cd ../
    echo "########$name########" >> vdw.txt
    cat $name/vdw.txt >> vdw.txt 

    #paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %-6s %5s %5s %5s %5s\n", $1, $2, $3, $5, $6, $7, $8}' >> amsol.txt
    if ($ifamsol == 'amsol') then
       echo "########$name########" >> amsol.txt
       paste $name/vdw.txt $name/output.solv2 | awk '{printf"%2s %3s %5s %5s %5s %5s\n", $1, $2, $5, $6, $7, $8}' >> amsol.txt
    else
       cat vdw.txt | awk '{if(NF==1){print $0} else if(NF==4){printf ("%2d %3s %5.2f %5.2f %5.2f %5.2f\n", $1, $2, 0.0,0.0,0.0,0.0)}}' >! amsol.txt 
    endif 
#
  end

It will generate the amsol file by reruning amsol using the docked poses.

It is also possible to get the amsol parameters from the db2 files:

  /mnt/nfs/work/tbalius/Water_Project_newgrid_mod_heme_charge/0008.rescore_get_parms_from_db_mod.csh

This is a bit messy and slow.

Here is the script:

set mol2file = $1 ## dock3.7 output file
#set ZINCID = $1
#set db2file = $2
set dbpath = $2

#echo $ZINCID
#echo $db2file

set list = `awk '/  Name:/{print $3}' $mol2file`
rm vdw.txt amsol.txt
touch vdw.txt amsol.txt

foreach ZINCID ($list)

  echo $ZINCID
  # get the number of atoms 
  awk 'BEGIN{flag=0}{if (flag == 1){print "atomnum="$1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}'  $mol2file # print the number of atoms # line after zinc id
  set atomnum = `awk 'BEGIN{flag=0}{if (flag == 1){print $1;flag=0} if ($1 == "'$ZINCID'"){flag = 1}}'  $mol2file` # print the number of atoms # line after zinc id

  set db2file = `grep -a20 $ZINCID  $mol2file | grep "Ligand Source File:" | awk '{print $5}' | sort | uniq `
  echo $db2file
  echo $dbpath/$db2file
  #zcat $db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; print "atomnum="$4 "::" $0; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' 
  #exit
  zcat $dbpath/$db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && $4 == "'$atomnum'" && flag=="False"){flag="True"; count=count+1};if (($1 == "A") && flag=="True"){print count":"$0}}' > ! $ZINCID.parms.txt
  #zcat $db2file | awk 'BEGIN{count=0} /M    /{flag="False"};{if($2 =="'$ZINCID'" && flag=="False"){flag="True"; count=count+1; print "found '$ZINCID'"};if(($1 == "A") && (flag=="True") ){print count":"$0}}' 
   # this will only return the first ZINC ID incountered.

  echo "## $ZINCID parms" >> vdw.txt
  echo "## $ZINCID parms" >> amsol.txt

  # make vdw file
  grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' >> vdw.txt
  #awk '{printf "%2d %3s %-5s %2d\n", $2, $3, $4, $5}' $ZINCID.parms.txt >> vdw.txt
  # amsol file
  grep "^1:" $ZINCID.parms.txt | sed 's/1://g' | awk '{printf "%2d %3s   %6.3f     %6.3f     %6.3f    %6.3f\n", $2, $3, $8, $9, $10, $11}' >> amsol.txt
  #awk '{printf "%2d %3s   %6.3f     %6.3f     %6.3f    %6.3f\n", $2, $3, $8, $9, $10, $11}' $ZINCID.parms.txt >> amsol.txt
end #ZINCID

INDOCK Parameters

Here is the parameters in the INDOCK file:

DOCK 3.7 parameter
#####################################################
### NOTE: split_database_index is reserved to specify a list of files
search_type                   2
mol2file                      poses.mol2.gz
ligsolfile                    amsol.txt.gz
ligvdwfile                    vdw.txt.gz
#####################################################
# NOTE: split_database_index is reserved to specify a list of files
ligand_atom_file               split_database_index

note that the split_database_index file is not used it is just a place holder.