DOCK 3.7 2015/04/15 abl1 Tutorial

From DISI
Revision as of 14:41, 15 April 2015 by TBalius (Talk | contribs)


This tutorial uses the 3.7.2 beta version of DOCK, released on XXX.

This tutorial is for a Linux environment, and the scripts assume that you are running on an SGE queueing system.

set up directories and get databases

Create a directory called "RotationProject".

Create a Python file called "autodude_db_download.py":

# this gets the database from the autodude webpage

import sys, os
import urllib

system = 'abl1'
url = 'http://autodude.docking.org/dude_e_db2/'

print "url = " + url

#page=requests.get(url)

webfile = urllib.urlopen(url)
page    = webfile.read()
webfile.close()

splitpage=page.split('\n')

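for line in splitpage:
   if system in line: 
      # replace quotes with spaces so the link target becomes its own token;
      # this assumes the file name is the third token on the line
      file = line.replace('"',' ').split()[2]
      print url+file
      # save the file into the current directory under the same name
      urllib.urlretrieve(url+file,file)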

This python script will download the dockable db2 databases from the autodude webpage.
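
The link-extraction step of the script can be looked at in isolation. The following is a minimal Python 3 sketch (the script above is Python 2), not the tutorial's own code. It assumes the page is an HTML index in which every file appears as href="name"; the function name find_db_files and the file names in the usage comment are hypothetical.

```python
import re


def find_db_files(page, system):
    """Return the link targets in an HTML index page that contain `system`.

    Assumption: each downloadable file is listed as href="name".
    """
    hrefs = re.findall(r'href="([^"]+)"', page)
    return [name for name in hrefs if system in name]


# Usage with a made-up page fragment:
#   page = '<a href="ligands_abl1.db2.gz">ligands_abl1.db2.gz</a>'
#   find_db_files(page, "abl1")
```

Matching on the href attribute rather than on token position makes the parse less sensitive to how many words precede the link on each line.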

python /mnt/nfs/home/rstein/RotationProject/autodude_db_download.py 

Make a subdirectory called databases:

mkdir databases

Go inside:

cd databases

Make directories for ligands and decoys, and move the corresponding files into those directories:

mkdir decoys 
mv decoys*db2.gz decoys
mkdir ligands 
mv ligands*db2.gz ligands
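
The same sorting step can be sketched in Python 3. This is an illustration only, under the assumption that the downloaded files follow the ligands*db2.gz and decoys*db2.gz naming used by the mv commands above; the function name sort_db_files and its arguments are hypothetical. Because the download script saves files into the directory it is run from, the sketch takes the source and destination directories separately.

```python
import glob
import os
import shutil


def sort_db_files(src_dir, dest_dir, kinds=("ligands", "decoys")):
    """Move <kind>*db2.gz files from src_dir into dest_dir/<kind>/.

    Returns a dict with the number of files moved for each kind.
    """
    moved = {}
    for kind in kinds:
        target = os.path.join(dest_dir, kind)
        os.makedirs(target, exist_ok=True)
        files = sorted(glob.glob(os.path.join(src_dir, kind + "*db2.gz")))
        for path in files:
            shutil.move(path, os.path.join(target, os.path.basename(path)))
        moved[kind] = len(files)
    return moved
```

A count of zero for either kind is a sign that the files were downloaded into a different directory than expected.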

Download the ligand and decoy isomeric SMILES files:

wget http://autodude.docking.org/abl1/decoys_final.ism
mv decoys_final.ism decoys.ism

Note that the scripts expect the decoy file to be named decoys.ism, so we rename it.

wget http://autodude.docking.org/abl1/actives_final.ism
mv actives_final.ism ligands.ism

run be_blasti.py

Create the following C shell script, 0001.be_balsti_py.csh.

#!/bin/csh 

# this script calls be_blasti.py which creates a receptor and ligand file from a (list of) pdbcode(s).

# msms is a molecular surface generation program that be_blasti.py needs in order to run;
# the next line puts it in your path
set path = ( /nfs/home/tbalius/zzz.programs/msms $path )
# you will need to have msms on your system.   

set list = "2HYY" # or use `cat filename` to list your pdb codes here from a text file like pdblist_rat, to loop over each variable (pdb code) later
#set list = `cat $1`
#set list = `cat /nfs/work/users/tbalius/VDR/Enrichment/pdblist_rat `

# CHANGE THIS, according to where the magic is going to happen
#set mountdir = "/mnt/nfs/work/users/tbalius/VDR/"
set mountdir = `pwd` 

# loop over pdbnames e.g. 1DB1 or list
foreach pdbname ( $list )

echo " ${pdbname} "

# for each pdb makes a directory with its name
set workdir = ${mountdir}/${pdbname}

## so you don't blow away stuff; continue means STOP here and continue with next pdb from list
if ( -s $workdir ) then
   echo "$workdir exists"
   continue
endif

  mkdir -p ${workdir}
  cd ${workdir}

# the atom type definition is needed for msms which is sym-linked into the cwd
  ln -s /nfs/home/tbalius/zzz.programs/msms/atmtypenumbers .
# nocarbohydrate: carbohydrates are disregarded as ligands; pass carbohydrate instead to change this
# original_numbers keeps the residue numbers from the pdb file; pass renumber instead to renumber the residues
  python $DOCKBASE/proteins/pdb_breaker/be_blasti.py --pdbcode $pdbname nocarbohydrate original_numbers | tee -a pdbinfo_using_biopython.log

# error checking looks for the receptor file, which should be produced by be_blasti.py
  if !(-s rec.pdb) then
      echo "rec.pdb is not found"
      # nothing to post-process for this pdb code, so move on to the next one
      continue
  endif

  mv rec.pdb temp.pdb
  grep -v TER temp.pdb | grep -v END  > rec.pdb

  rm temp.pdb

# be_blasti.py produces peptide which may be used as a ligand if no other ligand is produced
  if (-s lig.pdb) then
     sed -e "s/HETATM/ATOM  /g" lig.pdb > xtal-lig.pdb
  else if (-s pep.pdb) then ## if there is no ligand but there is a peptide
     sed -e "s/HETATM/ATOM  /g" pep.pdb > xtal-lig.pdb
  else
     echo "Warning: No ligand or peptide."
  endif

end # system


Running 0001.be_balsti_py.csh calls be_blasti.py, a script that comes with DOCK. It does the following:

  1. downloads the pdb file from the web,
  2. breaks the file into receptor and ligand components.
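
Besides calling be_blasti.py, the script above post-processes its output with grep and sed: it drops TER and END records from rec.pdb and rewrites HETATM records as ATOM in the ligand file. A minimal Python 3 sketch of those two text operations follows; the function names are hypothetical and the code is an illustration, not part of DOCK.

```python
def clean_receptor(lines):
    """Drop lines containing TER or END, like `grep -v TER | grep -v END`.

    As with grep, this is a substring match on the whole line.
    """
    return [ln for ln in lines if "TER" not in ln and "END" not in ln]


def hetatm_to_atom(lines):
    """Rewrite HETATM as 'ATOM  ', like `sed -e "s/HETATM/ATOM  /g"`.

    The replacement is padded to six characters so that the fixed-width
    PDB columns after the record name stay aligned.
    """
    return [ln.replace("HETATM", "ATOM  ") for ln in lines]
```

The two trailing spaces in "ATOM  " matter: PDB is a fixed-column format, so a shorter record name would shift every later field.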

Note that you will need to have msms on your system (get msms).

Check that the right ligand was selected and that the residue is not missing anything important. If this automatic procedure has not prepared these files correctly, then modify them.
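
A quick text-level check can complement the visual inspection. The Python 3 sketch below counts atom records per residue name in a PDB file, which makes it easy to see which residue ended up in the ligand file and how many atoms it has. It relies on the standard PDB layout, where the residue name occupies columns 18 to 20; the function name residue_counts is hypothetical.

```python
from collections import Counter


def residue_counts(pdb_lines):
    """Count ATOM/HETATM records per residue name (PDB columns 18-20)."""
    counts = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            counts[line[17:20].strip()] += 1
    return counts


# Usage, assuming lig.pdb exists in the current directory:
#   with open("lig.pdb") as handle:
#       print(residue_counts(handle))
```

An unexpected residue name, or an atom count that is lower than the ligand should have, suggests that the files need to be edited by hand.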

Visualize them with Chimera or an alternative visualization program such as PyMOL.

cd 2HYY
chimera rec.pdb lig.pdb
Figure: 1FJS, the receptor and ligand generated from be_blasti.py.

run blastermaster.py

run enrichments

Write a file called 0003.lig-decoy_enrichment.csh

#!/bin/csh

#This script docks a DUD-E-like ligand-decoy database to evaluate the enrichment performance of actives over decoys
#It assumes that ligands and decoys have been prepared beforehand (see script blablabla_ToDo), which needs to be run in SF.

# filedir is where your rec.pdb and xtal-lig.pdb and dockfiles directory live 
set filedir = "/mnt/nfs/home/rstein/RotationProject"	#CHANGE THIS
# this is where the work is done:
set mountdir = $filedir				# Might CHANGE THIS
set dude_dir = "/mnt/nfs/home/rstein/RotationProject/databases"  # should contain decoy.smi and ligand.smi for ROC script 00005...csh
  ## TO DO - rename this outside in the dir structure and call in blbalbalbabla script
if (-s $dude_dir) then 
  echo " $dude_dir exists"
else
  # this is something to be modified in the future. 
  # probably better to exit if it is not there.
  echo "databases do not exist. "
  echo "consider making a symbolic link to the database files"
  #echo "making a symbolic link:"
  #echo "ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir"
  #ln -s /mnt/nfs/work/users/fischer/VDR/27Jan2014_learningDOCKrgc/databases_all_xtal-ligand_decoy $dude_dir
endif

# change if you want to use a different or consistent dock version
set dock = ${DOCK_BASE}/bin/Linux/dock3.7_flex/dock.csh

set list = "1IEP" 
#set list = `cat $1`
#set list = `cat file`
        			# CHANGE THIS (pdbname)
foreach pdbname ( $list )

# creates "ligands" and "decoys" and has the aim to dock all of the subsets for those two
foreach db_type ( "ligands" "decoys" )

set workdir1 = "${mountdir}/${pdbname}/ligands-decoys/${db_type}"

mkdir -p  ${workdir1}
cd  ${workdir1} 
# puts dockfiles in the right relative-path that INDOCK file expects
ln -s $filedir/${pdbname}/dockfiles .

set count = '1'

# loop over database files to put each into a separate chunk
foreach dbfile (`ls $dude_dir/${db_type}/${db_type}*.db2.gz`)

echo $dbfile

set chunk = "chunk$count"

set workdir2 = ${workdir1}/$chunk

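## so you don't blow away stuff
if ( -s $workdir2 ) then
   echo "$workdir2 exists"
   # advance the counter before skipping, so the next database file gets the next chunk number
   @ count = ${count} + 1
   continue
endif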

#rm -rf ${workdir}
mkdir -p ${workdir2}
cd ${workdir2}

# copy INDOCK file of choice in right location
#cp $filedir/zzz.dock3_input/INDOCK . 
#cp $filedir/INDOCK_match20K INDOCK
#cp $filedir/INDOCK_5k_TolerantClash INDOCK	# CHANGE THIS
cp $filedir/${pdbname}/INDOCK .
 # modify the INDOCK file using sed. here we change some key sampling parameters; sed -i edits the file in place (overwrites it); without -i, sed writes the modified text to standard output, which can be redirected into a new file
#sed -i "s/bump_maximum                  50.0/bump_maximum                  500.0/g" INDOCK 
#sed -i "s/bump_rigid                    50.0/bump_rigid                    500.0/g" INDOCK 
#sed -i "s/check_clashes                 yes/check_clashes                 no/g" INDOCK 

ln -s $dbfile . 

set dbf = `ls *.gz`

echo "./$dbf"

# says what to dock and where it sits
echo "./$dbf" > split_database_index

# writes submission script that runs dock on the sgehead queue
cat <<EOF > DOCKING_${db_type}.csh
#\$ -S /bin/csh
#\$ -cwd
#\$ -q all.q
#\$ -o stdout
#\$ -e stderr

cd ${workdir2}
echo "starting . . ."
date
echo $dock 
$dock
date
echo "finished . . ."

EOF

qsub DOCKING_${db_type}.csh
# alternatively, if you don't want to run it on the queue but locally, uncomment this instead:
#csh DOCKING_${db_type}.csh &

@ count = ${count} + 1 
# count numbers the chunk directories

end # dbfile
end # db_type
end # pdbname
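
The chunk layout that the script builds can be sketched on its own. In the Python 3 illustration below, each database file gets its own chunkN directory, numbered from 1 in sorted order, together with the "./name" entry that the script writes to split_database_index. The function name plan_chunks is hypothetical, and sorted() only approximates the ordering that ls produces in the script.

```python
import os


def plan_chunks(db_files, workdir):
    """Map each database file to its own chunk directory.

    Returns a list of (chunk_dir, db_file, index_entry) tuples, where
    index_entry is the line written to split_database_index.
    """
    plan = []
    for count, db_file in enumerate(sorted(db_files), start=1):
        chunk_dir = os.path.join(workdir, "chunk%d" % count)
        index_entry = "./" + os.path.basename(db_file)
        plan.append((chunk_dir, db_file, index_entry))
    return plan
```

Because the chunk number depends only on a file's position in the sorted list, rerunning the plan on the same set of files gives the same directory for each file.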

run enrichment calculations