DOCK 3.7 2016/09/16 Tutorial for Enrichment Calculations (Trent & Jiankun)

From DISI

This tutorial uses DOCK-3.7-beta3. It supplements the DOCK 3.7 2015/04/15 abl1 Tutorial, which still works well and is simple and clear. This page describes some updates to job submission.

This tutorial is for a Linux environment, and the scripts assume that you are running on the SGE queueing system.

Submit an enrichment calculation via new 0003.lig-decoy_enrichment.csh

In this section we present a modified version of the script found here [[1]].

Here we modify the script to use an array job rather than submitting every chunk individually, as was done previously.

  • Go to the working directory of your account
cd /nfs/home/$USERNAME/work/DOCK_tutorial

PS: $USERNAME is the username you use to log on to the cluster.

The hierarchy of the working directory:

DOCK_tutorial ----- 2HYY ------ dockfiles
              |          |
              |          |----- working
              |          |
              |          ------ other files generated by 0001.be_balsti_py.csh
              |
              ----- databases ------ decoys
                              |
                              |----- ligands
                              |
                              ------ ligands.ism and decoys.ism files
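
Because the scripts below only print a message when a directory is missing, it can help to check the layout first. The following is a minimal sketch, not part of the DOCK distribution: the function name `missing_dirs` and its arguments are hypothetical, and the list of directories is simply read off the hierarchy shown above.

```python
from pathlib import Path


def missing_dirs(root, pdbname="2HYY"):
    """Return the expected subdirectories that do not exist under root.

    The expected layout is copied from the hierarchy in this tutorial;
    it is an assumption that your own setup uses the same names.
    """
    expected = (
        f"{pdbname}/dockfiles",
        f"{pdbname}/working",
        "databases/decoys",
        "databases/ligands",
    )
    root = Path(root)
    # Keep the order of `expected` so the report is easy to read.
    return [rel for rel in expected if not (root / rel).is_dir()]
```

For example, `missing_dirs("/nfs/home/$USERNAME/work/DOCK_tutorial")` (with your own username filled in) returns an empty list when every directory is in place.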
  • Write a file called 0003_new.lig-decoy_enrichment.csh
#!/bin/csh

#This script docks a DUD-E-like ligand-decoy database to evaluate the enrichment of actives over decoys
#It assumes that the ligands and decoys have already been prepared (see script blablabla_ToDo), which needs to be run in SF.

# filedir is where your rec.pdb and xtal-lig.pdb and dockfiles directory live 
set filedir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial"  #CHANGE THIS
# this is where the work is done:
set mountdir = $filedir                         # Might CHANGE THIS
set dude_dir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial/databases"  # should contain decoy.smi and ligand.smi for ROC script 00005...csh
  ## TO DO - rename this outside in the dir structure and call in blbalbalbabla script
if (-s $dude_dir) then
  echo " $dude_dir exists"
else
  # this is something to be modified in the future. 
  # probably better to exit if it is not there.
  echo "databases do not exist. "
  echo "consider making a symbolic link to the database files"
  #echo "making a symbolic link:"
  #echo "ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir"
  #ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir
endif

# change if you want to use a different or consistent dock version
set dock = ${DOCKBASE}/docking/DOCK/bin/dock64
#set dock = ${DOCKBASE}/docking/submit/submit.csh

set list = "2HYY"
#set list = `cat $1`
#set list = `cat file`
                               # CHANGE THIS (pdbname)
foreach pdbname ( $list )

# creates "ligands" and "decoys" and has the aim to dock all of the subsets for those two
foreach db_type ( "ligands" "decoys" )

set workdir1 = "${mountdir}/${pdbname}/ligands-decoys/${db_type}"

echo $mountdir
echo $workdir1

#exit

mkdir -p  ${workdir1}
cd  ${workdir1}
# puts dockfiles in the right relative-path that INDOCK file expects
ln -s $filedir/${pdbname}/dockfiles .

set count = '1'

# loop over database files to put each into a separate chunk
foreach dbfile (`ls $dude_dir/${db_type}/${db_type}*.db2.gz`)

echo $dbfile

set chunk = "chunk$count"

set workdir2 = ${workdir1}/$chunk

## so you don't blow away stuff
if ( -s $workdir2 ) then
   echo "$workdir2 exists"
   continue
endif

#rm -rf ${workdir}
mkdir -p ${workdir2}
cd ${workdir2}

# copy INDOCK file of choice in right location
#cp $filedir/zzz.dock3_input/INDOCK . 
#cp $filedir/INDOCK_match20K INDOCK
#cp $filedir/INDOCK_5k_TolerantClash INDOCK     # CHANGE THIS

cp $filedir/${pdbname}/INDOCK .
 # modify the INDOCK file using sed. here we change some key sampling parameters; sed -i edits the input file in place (overwrites it); without -i, sed prints the result to the screen (or into a file if redirected)
#sed -i "s/bump_maximum                  50.0/bump_maximum                  500.0/g" INDOCK 
#sed -i "s/bump_rigid                    50.0/bump_rigid                    500.0/g" INDOCK 
#sed -i "s/check_clashes                 yes/check_clashes                 no/g" INDOCK 

ln -s $dbfile .

set dbf = `ls *.gz`

echo "./$dbf"

# says what to dock and where it sits
echo "./$dbf" > split_database_index

@ count = ${count} + 1
# the counter is the chunk directory number

end # dbfile

echo ${workdir1}
cd  ${workdir1}
@ count = ${count} - 1

# writes submission script that runs dock on the all.q queue
cat <<EOF > DOCKING_${db_type}.csh
#\$ -S /bin/csh
#\$ -cwd
#\$ -q all.q
#\$ -o stdout
#\$ -e stderr

set chunk = "chunk\$SGE_TASK_ID"
set workdir2 = ${workdir1}/\$chunk
cd \${workdir2}
echo "starting . . ." >! stderr
date >> stderr
echo $dock >> stderr
$dock >>& stderr
date >> stderr
echo "finished . . ." >> stderr

EOF


qsub -t 1-$count DOCKING_${db_type}.csh
# This command submits an array job. 

end # db_type
end # pdbname
  • Run the above script.
csh 0003_new.lig-decoy_enrichment.csh

The 0003_new.lig-decoy_enrichment.csh script above submits ONLY one array job per database type (ligands and decoys), with one subjob per chunk.
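
Since each subjob of the array works in its own chunk directory, one way to see which chunks produced no docking output is to look for the output file in each of them. The sketch below is only an illustration. The helper names are hypothetical, and it assumes that DOCK writes a file called OUTDOCK into the chunk directory; adjust `marker` if your setup differs. Note that the presence of the file shows that DOCK started there, not that it finished cleanly.

```python
from pathlib import Path


def chunk_number(path):
    """Return N for a directory named chunkN, or None for any other name."""
    suffix = path.name[len("chunk"):]
    return int(suffix) if suffix.isdecimal() else None


def chunks_without_output(workdir, marker="OUTDOCK"):
    """List chunkN directories under workdir that contain no `marker` file.

    `marker` is an assumed output file name; the result is sorted by N,
    so chunk10 comes after chunk9.
    """
    numbered = [
        (chunk_number(p), p)
        for p in Path(workdir).glob("chunk*")
        if p.is_dir() and chunk_number(p) is not None
    ]
    numbered.sort(key=lambda pair: pair[0])
    return [p.name for _, p in numbered if not (p / marker).is_file()]
```

With the script above, `workdir` would be the ligands or decoys directory under `2HYY/ligands-decoys`. The numbers of the returned chunks are the task IDs you might pass to `qsub -t` again.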

Submit an enrichment calculation via 0003.lig-decoy_enrichment_submit.csh

We recommend using this method, rather than the earlier scripts, as it uses the DOCK submission infrastructure.

  • Write a file called 0003.lig-decoy_enrichment_submit.csh
#!/bin/csh

#This script provides an alternative way to dock a DUD-E-like ligand-decoy database for the enrichment evaluation of actives over decoys
#It assumes that the ligands and decoys have already been prepared (see script blablabla_ToDo), which needs to be run in SF.

set filedir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial"  #CHANGE THIS
# this is where the work is done:
set mountdir = $filedir                         # Might CHANGE THIS
set dude_dir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial/databases"  # should contain decoy.smi and ligand.smi for ROC script 00005...csh
  ## TO DO - rename this outside in the dir structure and call in blbalbalbabla script
if (-s $dude_dir) then
 echo " $dude_dir exists"
else
 # this is something to be modified in the future. 
 # probably better to exit if it is not there.
 echo "databases do not exist. "
 echo "consider making a symbolic link to the database files"
endif

set list = "2HYY"  # CHANGE THIS (pdbname)
foreach pdbname ( $list )
# creates "ligands" and "decoys" and has the aim to dock all of the subsets for those two
foreach db_type ( "ligands" "decoys" )
set workdir1 = "${mountdir}/${pdbname}/${db_type}"
set workdir2 = "${mountdir}/${pdbname}"
#
echo $mountdir
echo $workdir1
echo $workdir2
#
mkdir -p  ${workdir1}
cd  ${workdir1}
#create a list of the *.db2.gz files prepared for docking
ls ${dude_dir}/${db_type}/*.db2.gz > ${db_type}_files.txt
#copy the files needed for dock
cp ${workdir2}/INDOCK ${workdir1}
ln -s ${workdir2}/dockfiles/ ${workdir1}
#use the file list to create chunks for job submission
python /nfs/home/tbalius/zzz.github/DOCK/docking/setup/setup_db2_zinc15_file_number.py ./ chunk ./${db_type}_files.txt 500  count
#
csh $DOCKBASE/docking/submit/submit.csh

end # db_type
end # pdbname
  • Run the above script
csh 0003.lig-decoy_enrichment_submit.csh
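
To estimate how many chunk directories to expect from the setup step, the grouping can be sketched as follows. This is not the code of setup_db2_zinc15_file_number.py. It assumes that the argument 500 is the maximum number of db2 files placed in each chunk, which you should confirm against the setup script itself. The function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(paths, per_chunk):
    """Group paths into consecutive lists of at most per_chunk entries.

    Illustrates fixed-size chunking only; the real setup script may
    order or name its chunks differently.
    """
    if per_chunk < 1:
        raise ValueError("per_chunk must be at least 1")
    return [paths[i:i + per_chunk] for i in range(0, len(paths), per_chunk)]
```

Under that assumption, a list of 1200 db2 files with 500 files per chunk gives three chunks holding 500, 500, and 200 files.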