DOCK 3.7 2016/09/16 Tutorial for Enrichment Calculations (Trent & Jiankun)
This tutorial uses DOCK-3.7-beta3 and supplements the DOCK 3.7 2015/04/15 abl1 Tutorial. The earlier tutorial still works well and is simple and clear; this one describes some updates to job submission.
This tutorial is written for a Linux environment, and the scripts assume that you are running on an SGE queueing system.
Submit an enrichment calculation via new 0003.lig-decoy_enrichment.csh
In this section we present a modified version of the script found here [[1]].
The modified script submits a single array job rather than submitting every chunk individually, as was done previously.
- Go to the working directory of your account
cd /nfs/home/$USERNAME/work/DOCK_tutorial
Note: $USERNAME is the username you use to log on to the cluster.
The hierarchy of the working directory:
DOCK_tutorial ----- 2HYY ------ dockfiles
              |          |
              |          |----- working
              |          |
              |          ------ other files generated by 0001.be_balsti_py.csh
              |
              ----- databases ------ decoys
                              |
                              |----- ligands
                              |
                              ------ ligands.ism and decoys.ism files
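Before writing the submission script, it can help to confirm that your working directory matches the layout above. The following is a minimal Python sketch and is not part of DOCK. The function name missing_paths and the EXPECTED list are assumptions made for illustration; the sub-paths are simply read off the tree shown here, so adjust them if your layout differs.

```python
import os

# Sub-directories read off the hierarchy shown above (an assumption of this
# sketch; edit the list if your own layout is different).
EXPECTED = [
    os.path.join("2HYY", "dockfiles"),
    os.path.join("2HYY", "working"),
    os.path.join("databases", "decoys"),
    os.path.join("databases", "ligands"),
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected sub-directories that do not exist under root."""
    return [p for p in expected if not os.path.isdir(os.path.join(root, p))]
```

Calling missing_paths on your DOCK_tutorial directory returns an empty list when all four directories are present, and otherwise lists the ones that still need to be created or linked.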
- Write a file called 0003_new.lig-decoy_enrichment.csh
#!/bin/csh
#This script docks a DUD-E-like ligand-decoy database to evaluate the enrichment of actives over decoys
#It assumes that the ligands and decoys have already been prepared (see script blablabla_ToDo), which needs to be run in SF.
# filedir is where your rec.pdb and xtal-lig.pdb and dockfiles directory live
set filedir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial" #CHANGE THIS
# this is where the work is done:
set mountdir = $filedir # Might CHANGE THIS
set dude_dir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial/databases" # should contain decoy.smi and ligand.smi for ROC script 00005...csh
## TO DO - rename this outside in the dir structure and call in blbalbalbabla script
if (-s $dude_dir) then
echo " $dude_dir exists"
else
# this is something to be modified in the future.
# probably better to exit if it is not there.
echo "databases do not exist. "
echo "consider making a symbolic link to the database files"
#echo "making a symbolic link:"
#echo "ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir"
#ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir
endif
# change if you want to use a different or consistent dock version
set dock = ${DOCKBASE}/docking/DOCK/bin/dock64
#set dock = ${DOCKBASE}/docking/submit/submit.csh
set list = "2HYY"
#set list = `cat $1`
#set list = `cat file`
# CHANGE THIS (pdbname)
foreach pdbname ( $list )
# creates "ligands" and "decoys" and has the aim to dock all of the subsets for those two
foreach db_type ( "ligands" "decoys" )
set workdir1 = "${mountdir}/${pdbname}/ligands-decoys/${db_type}"
echo $mountdir
echo $workdir1
#exit
mkdir -p ${workdir1}
cd ${workdir1}
# puts dockfiles in the right relative-path that INDOCK file expects
ln -s $filedir/${pdbname}/dockfiles .
set count = '1'
# loop over database files to put each into a separate chunk
foreach dbfile (`ls $dude_dir/${db_type}/${db_type}*.db2.gz`)
echo $dbfile
set chunk = "chunk$count"
set workdir2 = ${workdir1}/$chunk
## so you don't blow away stuff
if ( -s $workdir2 ) then
echo "$workdir2 exists"
continue
endif
#rm -rf ${workdir}
mkdir -p ${workdir2}
cd ${workdir2}
# copy INDOCK file of choice in right location
#cp $filedir/zzz.dock3_input/INDOCK .
#cp $filedir/INDOCK_match20K INDOCK
#cp $filedir/INDOCK_5k_TolerantClash INDOCK # CHANGE THIS
cp $filedir/${pdbname}/INDOCK .
# modify the INDOCK file using sed. here we change some key sampling parameters; sed -i edits the input file in place (overwrites it), whereas without -i sed writes the result to standard output (the screen, or a file if redirected)
#sed -i "s/bump_maximum 50.0/bump_maximum 500.0/g" INDOCK
#sed -i "s/bump_rigid 50.0/bump_rigid 500.0/g" INDOCK
#sed -i "s/check_clashes yes/check_clashes no/g" INDOCK
ln -s $dbfile .
set dbf = `ls *.gz`
echo "./$dbf"
# says what to dock and where it sits
echo "./$dbf" > split_database_index
@ count = ${count} + 1
# the counter gives the chunk directory number
end # dbfile
echo ${workdir1}
cd ${workdir1}
@ count = ${count} - 1
# writes the submission script that runs dock as an SGE array job (queue all.q)
cat <<EOF > DOCKING_${db_type}.csh
#\$ -S /bin/csh
#\$ -cwd
#\$ -q all.q
#\$ -o stdout
#\$ -e stderr
set chunk = "chunk\$SGE_TASK_ID"
set workdir2 = ${workdir1}/\$chunk
cd \${workdir2}
echo "starting . . ." >! stderr
date >> stderr
echo $dock >> stderr
$dock >>& stderr
date >> stderr
echo "finished . . ." >> stderr
EOF
qsub -t 1-$count DOCKING_${db_type}.csh
# This command submits an array job.
end # db_type
end # pdbname
- Run the above script.
csh 0003_new.lig-decoy_enrichment.csh
The 0003_new.lig-decoy_enrichment.csh script above submits only one array job per database type (one for ligands, one for decoys); each array job contains one subjob (task) per chunk.
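The numbering that ties array tasks to chunk directories can be sketched in Python as follows. This only illustrates the bookkeeping done by the csh script above and is not code shipped with DOCK; the function names plan_chunks, qsub_command and chunk_for_task are hypothetical.

```python
import os


def plan_chunks(db_files, workdir):
    """Pair each database file with a chunk directory, numbered from 1.

    Mirrors the csh loop above: the first file goes to chunk1, the next to
    chunk2, and so on, so that SGE task N finds its input in chunkN.
    """
    return [(os.path.join(workdir, "chunk%d" % i), f)
            for i, f in enumerate(db_files, start=1)]


def qsub_command(n_chunks, script):
    """Build the array-job submission command as an argument list."""
    if n_chunks < 1:
        raise ValueError("no chunks to submit")
    return ["qsub", "-t", "1-%d" % n_chunks, script]


def chunk_for_task(workdir, task_id):
    """Directory used by the task whose SGE_TASK_ID is task_id."""
    return os.path.join(workdir, "chunk%s" % task_id)
```

With two database files, plan_chunks yields chunk1 and chunk2, and qsub_command(2, "DOCKING_ligands.csh") gives the same "qsub -t 1-2 DOCKING_ligands.csh" call that the script issues. The ValueError guard is an addition of this sketch: it makes the empty case explicit instead of submitting a range such as 1-0.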
Submit an enrichment calculation via 0003.lig-decoy_enrichment_submit.csh
We recommend using this method, rather than the earlier scripts, as it uses the DOCK submission infrastructure.
- Write a file called 0003.lig-decoy_enrichment_submit.csh
#!/bin/csh
#This script provides an alternative way to dock a DUD-E-like ligand-decoy database to evaluate the enrichment of actives over decoys
#It assumes that the ligands and decoys have already been prepared (see script blablabla_ToDo), which needs to be run in SF.
set filedir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial" #CHANGE THIS
# this is where the work is done:
set mountdir = $filedir # Might CHANGE THIS
set dude_dir = "/mnt/nfs/home/jklyu/work/DOCK_tutorial/databases" # should contain decoy.smi and ligand.smi for ROC script 00005...csh
## TO DO - rename this outside in the dir structure and call in blbalbalbabla script
if (-s $dude_dir) then
echo " $dude_dir exists"
else
# this is something to be modified in the future.
# probably better to exit if it is not there.
echo "databases do not exist. "
echo "consider making a symbolic link to the database files"
endif
set list = "2HYY" # CHANGE THIS (pdbname)
foreach pdbname ( $list )
# creates "ligands" and "decoys" and has the aim to dock all of the subsets for those two
foreach db_type ( "ligands" "decoys" )
set workdir1 = "${mountdir}/${pdbname}/${db_type}"
set workdir2 = "${mountdir}/${pdbname}"
#
echo $mountdir
echo $workdir1
echo $workdir2
#
mkdir -p ${workdir1}
cd ${workdir1}
#create a list of the *.db2.gz files prepared for docking
ls ${dude_dir}/${db_type}/*.db2.gz > ${db_type}_files.txt
#copy INDOCK and link the dockfiles directory needed by dock
cp ${workdir2}/INDOCK ${workdir1}
ln -s ${workdir2}/dockfiles/ ${workdir1}
#use the file list to create chunks for job submission
python /nfs/home/tbalius/zzz.github/DOCK/docking/setup/setup_db2_zinc15_file_number.py ./ chunk ./${db_type}_files.txt 500 count
#
csh $DOCKBASE/docking/submit/submit.csh
end # db_type
end # pdbname
- Run the above script
csh 0003.lig-decoy_enrichment_submit.csh
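The behaviour of setup_db2_zinc15_file_number.py is not described in this tutorial, so the following is only a generic Python sketch of the idea of dividing a list of .db2.gz paths among chunks. It assumes, purely for illustration, that each chunk receives at most a fixed number of entries; the real script and the meaning of its "500 count" arguments may differ. The function names read_file_list and split_into_chunks are hypothetical.

```python
def read_file_list(text):
    """Parse the contents of a *_files.txt listing, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_into_chunks(paths, max_per_chunk):
    """Split paths into consecutive groups of at most max_per_chunk entries.

    Assumption of this sketch: chunks are filled in order and only the last
    one may be shorter. The DOCK setup script may use a different rule.
    """
    if max_per_chunk < 1:
        raise ValueError("max_per_chunk must be at least 1")
    return [paths[i:i + max_per_chunk]
            for i in range(0, len(paths), max_per_chunk)]
```

For example, three listed files split with max_per_chunk set to 2 give one chunk of two files and one chunk of one file. Each group would correspond to the contents of one chunk's split_database_index file.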