DOCK 3.7 2016/09/16 Tutorial for Enrichment Calculations (Trent & Jiankun)

This tutorial uses DOCK-3.7-beta3. It supplements the DOCK 3.7 2015/04/15 abl1 Tutorial, which works well and is simple and clear. This page describes some updates to job submission.

This tutorial is for a Linux environment, and the scripts assume that you are running on an SGE queueing system.

Submit an enrichment calculation via new 0003.lig-decoy_enrichment.csh

In this section we present a modified version of the script found here [[1]].

Here we modify the script to use an array job rather than submitting every chunk individually, as was done previously.
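The numbering that the array job relies on can be sketched in a few lines of Python. This is only an illustration of the bookkeeping done by the csh script in this section: database files are numbered from 1, file i goes into directory chunk<i>, and SGE task i (via SGE_TASK_ID) works in that same directory. The function names are hypothetical, and Python's sorted() is used as a stand-in for the order in which ls lists the files, which may differ depending on locale.

```python
def plan_array_job(db_files):
    """Map chunk directory names to database files and build the qsub -t range.

    Hypothetical helper: mirrors the counter logic of the csh script, where
    the first file goes to chunk1, the second to chunk2, and so on.
    """
    ordered = sorted(db_files)  # assumption: approximates the order given by `ls`
    if not ordered:
        raise ValueError("no database files found, nothing to submit")
    chunks = {f"chunk{i}": path for i, path in enumerate(ordered, start=1)}
    task_range = f"1-{len(ordered)}"  # the value passed to `qsub -t`
    return chunks, task_range


def chunk_for_task(task_id):
    """Directory used by the array task with the given SGE_TASK_ID."""
    return f"chunk{task_id}"


if __name__ == "__main__":
    # Made-up file names, for illustration only.
    files = ["decoys2.db2.gz", "decoys1.db2.gz", "decoys3.db2.gz"]
    chunks, task_range = plan_array_job(files)
    print("qsub -t", task_range)
    for name, path in chunks.items():
        print(name, "<-", path)
```

Because the task ID alone determines the directory, a single submission script is enough for all chunks.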

  • Go to the working directory of your account
cd /nfs/home/$USERNAME/work/DOCK_tutorial

Note: $USERNAME is the username you use to log on to the cluster.

The hierarchy of the working directory:

DOCK_tutorial ----- 2HYY ------ dockfiles
              |          |
              |          |----- working
              |          |
              |          ------ other files generated by 0001.be_balsti_py.csh
              |
              ----- databases ------ decoys
                              |
                              |----- ligands
                              |
                              ------ ligands.ism and decoys.ism files
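Before submitting anything, it can help to confirm that this layout is actually in place. The following is a minimal sketch, assuming the directory names shown in the hierarchy above; the function name and the list of expected directories are illustrative and not part of DOCK.

```python
import tempfile
from pathlib import Path

# Directories taken from the hierarchy shown above (relative to DOCK_tutorial).
EXPECTED_DIRS = [
    "2HYY/dockfiles",
    "2HYY/working",
    "databases/decoys",
    "databases/ligands",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Demonstration on a throwaway directory; point this at your own
    # DOCK_tutorial directory instead.
    with tempfile.TemporaryDirectory() as tmp:
        for rel in EXPECTED_DIRS[:-1]:
            (Path(tmp) / rel).mkdir(parents=True)
        print("missing:", missing_dirs(tmp))
```

An empty list means all four directories were found.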
  • Write a file called 0003_new.lig-decoy_enrichment.csh
#!/bin/csh

#This script docks a DUD-E-like ligand-decoy database to evaluate the enrichment of actives over decoys
#It assumes that ligands and decoys have been pre-prepared (see script blablabla_ToDo), which needs to be run in SF.

# filedir is where your rec.pdb and xtal-lig.pdb and dockfiles directory live 
set filedir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial"  #CHANGE THIS
# this is where the work is done:
set mountdir = $filedir                         # Might CHANGE THIS
set dude_dir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial/databases"  # should contain decoy.smi and ligand.smi for ROC script 00005...csh
  ## TO DO - rename this outside in the dir structure and call in blbalbalbabla script
if (-s $dude_dir) then
  echo " $dude_dir exists"
else
  # this is something to be modified in the future. 
  # probably better to exit if it is not there.
  echo "databases do not exist. "
  echo "consider making a symbolic link to the database files"
  #echo "making a symbolic link:"
  #echo "ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir"
  #ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir
endif

# change if you want to use a different or consistent dock version
set dock = ${DOCKBASE}/docking/DOCK/bin/dock64
#set dock = ${DOCKBASE}/docking/submit/submit.csh

set list = "2HYY"
#set list = `cat $1`
#set list = `cat file`
                               # CHANGE THIS (pdbname)
foreach pdbname ( $list )

# creates "ligands" and "decoys" and has the aim to dock all of the subsets for those two
foreach db_type ( "ligands" "decoys" )

set workdir1 = "${mountdir}/${pdbname}/ligands-decoys/${db_type}"

echo $mountdir
echo $workdir1

#exit

mkdir -p  ${workdir1}
cd  ${workdir1}
# puts dockfiles in the right relative-path that INDOCK file expects
ln -s $filedir/${pdbname}/dockfiles .

set count = '1'

# loop over database files to put each into a separate chunk
foreach dbfile (`ls $dude_dir/${db_type}/${db_type}*.db2.gz`)

echo $dbfile

set chunk = "chunk$count"

set workdir2 = ${workdir1}/$chunk

## so you don't blow away stuff
if ( -s $workdir2 ) then
   echo "$workdir2 exists"
   continue
endif

#rm -rf ${workdir}
mkdir -p ${workdir2}
cd ${workdir2}

# copy INDOCK file of choice in right location
#cp $filedir/zzz.dock3_input/INDOCK . 
#cp $filedir/INDOCK_match20K INDOCK
#cp $filedir/INDOCK_5k_TolerantClash INDOCK     # CHANGE THIS

cp $filedir/${pdbname}/INDOCK .
 # modify the INDOCK file using sed. Here we change some key sampling parameters; sed -i edits the input file in place (overwrites it), while without -i sed leaves the file untouched and writes the result to stdout (the screen, or a file if redirected)
#sed -i "s/bump_maximum                  50.0/bump_maximum                  500.0/g" INDOCK 
#sed -i "s/bump_rigid                    50.0/bump_rigid                    500.0/g" INDOCK 
#sed -i "s/check_clashes                 yes/check_clashes                 no/g" INDOCK 

ln -s $dbfile .

set dbf = `ls *.gz`

echo "./$dbf"

# says what to dock and where it sits
echo "./$dbf" > split_database_index

@ count = ${count} + 1
# count is the number of the chunk directory

end # dbfile

echo ${workdir1}
cd  ${workdir1}
@ count = ${count} - 1

# writes the submission script that runs dock on the all.q queue (see the -q line below)
cat <<EOF > DOCKING_${db_type}.csh
#\$ -S /bin/csh
#\$ -cwd
#\$ -q all.q
#\$ -o stdout
#\$ -e stderr

set chunk = "chunk\$SGE_TASK_ID"
set workdir2 = ${workdir1}/\$chunk
cd \${workdir2}
echo "starting . . ." >! stderr
date >> stderr
echo $dock >> stderr
$dock >>& stderr
date >> stderr
echo "finished . . ." >> stderr

EOF


qsub -t 1-$count DOCKING_${db_type}.csh
# This command submits an array job. 

end # db_type
end # pdbname
  • Run the above script.
csh 0003_new.lig-decoy_enrichment.csh

The 0003_new.lig-decoy_enrichment.csh script above submits ONLY one array job per database type (one for ligands, one for decoys), and each array job runs one subjob per chunk.
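With many subjobs in one array job, it is useful to see which chunks have run to the end. The submission script above writes "finished . . ." as the last line of the stderr file in each chunk directory, so a small sketch like the following can list the chunks where that marker is absent. The function name is hypothetical. Note also that the marker is written whether or not dock itself succeeded, so this only shows that the subjob reached the end of the script.

```python
import tempfile
from pathlib import Path

MARKER = "finished . . ."  # last line echoed by the submission script


def unfinished_chunks(workdir):
    """Return names of chunk directories whose stderr file lacks the marker."""
    pending = []
    for chunk in sorted(Path(workdir).glob("chunk*")):
        if not chunk.is_dir():
            continue
        log = chunk / "stderr"
        text = log.read_text(errors="replace") if log.is_file() else ""
        if MARKER not in text:
            pending.append(chunk.name)
    return pending


if __name__ == "__main__":
    # Demonstration on a throwaway directory; in practice pass something like
    # <mountdir>/2HYY/ligands-decoys/ligands instead.
    with tempfile.TemporaryDirectory() as tmp:
        done = Path(tmp) / "chunk1"
        done.mkdir()
        (done / "stderr").write_text("starting . . .\nfinished . . .\n")
        (Path(tmp) / "chunk2").mkdir()  # no stderr file yet
        print("unfinished:", unfinished_chunks(tmp))
```

Chunks reported here are either still queued, still running, or were killed before the script completed; qstat can tell these cases apart.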

Submit an enrichment calculation via 0003.lig-decoy_enrichment_submit.csh

We recommend using this method, rather than the earlier scripts, as it uses the DOCK submission infrastructure.

  • Write a file called 0003.lig-decoy_enrichment_submit.csh
#!/bin/csh

#This script provides an alternative way to dock a DUD-E-like ligand-decoy database for the enrichment evaluation of actives over decoys
#It assumes that ligands and decoys have been pre-prepared (see script blablabla_ToDo), which needs to be run in SF.

set filedir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial"  #CHANGE THIS
# this is where the work is done:
set mountdir = $filedir                         # Might CHANGE THIS
set dude_dir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial/databases"  # should contain decoy.smi and ligand.smi for ROC script 00005...csh
  ## TO DO - rename this outside in the dir structure and call in blbalbalbabla script
if (-s $dude_dir) then
 echo " $dude_dir exists"
else
 # this is something to be modified in the future. 
 # probably better to exit if it is not there.
 echo "databases do not exist. "
 echo "consider making a symbolic link to the database files"
endif

set list = "2HYY"  # CHANGE THIS (pdbname)
foreach pdbname ( $list )
# creates "ligands" and "decoys" and has the aim to dock all of the subsets for those two
foreach db_type ( "ligands" "decoys" )
set workdir1 = "${mountdir}/${pdbname}/${db_type}"
set workdir2 = "${mountdir}/${pdbname}"
#
echo $mountdir
echo $workdir1
echo $workdir2
#
mkdir -p  ${workdir1}
cd  ${workdir1}
#create a list of the *.db2.gz files prepared for docking
ls ${dude_dir}/${db_type}/*.db2.gz > ${db_type}_files.txt
#copy the files needed for dock
cp ${workdir2}/INDOCK ${workdir1}
ln -s ${workdir2}/dockfiles/ ${workdir1}
#use the file list to create chunks for job submission
python /nfs/home/tbalius/zzz.github/DOCK/docking/setup/setup_db2_zinc15_file_number.py ./ chunk ./${db_type}_files.txt 500  count
#
csh $DOCKBASE/docking/submit/submit.csh

end # db_type
end # pdbname
  • Run the above script
csh 0003.lig-decoy_enrichment_submit.csh
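The setup script called above is given the file list and the number 500. Judging from those arguments, it appears to group the listed db2 files into chunks of up to 500 files each. That reading is an assumption, and the sketch below is not the code of setup_db2_zinc15_file_number.py. It only illustrates the idea of splitting a file list into fixed-size groups, using a hypothetical function name.

```python
def split_into_chunks(paths, per_chunk=500):
    """Split a list of file paths into consecutive groups of at most per_chunk.

    Illustrative only: the real setup script may order, name, or store the
    chunks differently.
    """
    if per_chunk < 1:
        raise ValueError("per_chunk must be at least 1")
    return [paths[i:i + per_chunk] for i in range(0, len(paths), per_chunk)]


if __name__ == "__main__":
    # Made-up file names standing in for the contents of ligands_files.txt.
    files = [f"ligands{i}.db2.gz" for i in range(1, 1201)]
    for number, group in enumerate(split_into_chunks(files), start=1):
        print(f"chunk {number}: {len(group)} files")
```

With 500 files per chunk, a typical ligand set of a few hundred db2 files ends up in a single chunk, while the larger decoy set is spread over several.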