DOCK 3.7 2014/09/25 FXa Tutorial

From DISI
Jump to: navigation, search

Written by Trent E Balius and Crystal Nguyen on 2014/09/25.

Note that this script is writen in hope that it will help others run dock 3.7.

Note that DOCK 3.7 is primarily a virtual screening tool is is meant to be run on a Linux cluster. The focus of this tutorial is to perform a retrospective enrichment calculations and a test virtual screen.

This part of the tutorial uses the release version of DOCK3.7-beta.1.0.1.

see the following for requirements: Install_DOCK_3.7

You must use a version of python (v2.7) that has the follow:

  • numpy
  • scipy
  • matplotlib
  • mysql-python
  • biopython

If you are using a virtual enviorment for python do something like this:

 source /home/tbalius/zzz.virtualenv/virtualenv-1.9.1/myVE/bin/activate.csh

to leave use the deactivate command:

 deactivate

making ligand databases.

Get smiles:

for factor Xa it is part of the DUDE database:

  wget http://dude.docking.org/targets/fa10/actives_final.ism
  wget http://dude.docking.org/targets/fa10/decoys_final.ism

We can download the isomorphic smiles from the dude webpage.

here is a webserver to gerenate decoys.

 http://dude.docking.org/generate

MORE TO COME HERE ON DATABASE PREP...

Here is what I think is what needs to be done.

mkdir ligands
bash $DOCKBASE/ligand/generate/build_smiles_ligand.sh ../actives_final.ism
python $DOCKBASE/ligand/finish/db2end-makedata.py ligands
mkdir decoys
cd decoys
bash $DOCKBASE/ligand/generate/build_smiles_ligand.sh ../decoys_final.ism
python $DOCKBASE/ligand/finish/db2end-makedata.py decoys

DOCK Distribution structure

Here we discuss the directory structure of the distribution, what the different directory contain, and information about the scripts.

Here is the structure:

ls -l ~/DOCK-3.7-beta1.0.1/
total 27
-rw-r--r--  1 user group 2737 Mar 28 08:43 README.md
drwxr-xr-x  2 user group   31 Mar 28 08:43 analysis
drwxr-xr-x  2 user group   28 Mar 28 08:43 common
drwxr-xr-x  6 user group    7 Mar 28 08:43 docking
drwxr-xr-x  3 user group    3 Mar 28 08:43 install
drwxr-xr-x  8 user group    9 Mar 28 08:43 ligand
drwxr-xr-x 19 user group   19 Mar 28 08:43 proteins
drwxr-xr-x  3 user group   15 Mar 28 08:43 test

docking/DOCK contains the dock source code and the the dock binary. docking/submit contains submission scripts to submit jobs to the queue. ligand contains the ligand preparation steps. (this has dependencies not include in the distribution) proteins contains scripts and programs for protein preparations. test contains test to make sure the package is working as intended. Running the tests can help dignose errors and tell you what might be missing.


Receptor Preparation

1. Splite your pdb into a rec.pdb and xtal-lig.pdb.

1.a you can do this by running this script:

cat ~/FactorXa/dock_tutorial/0001.be_balsti_py.csh 
#!/bin/csh
#
# Written by Trent Balius, 2014.
# this script calls be_blasti.py which creates a receptor and ligand file from a pdb code or file.
#
# msms is a molecular surface generation program needed for be_blasti.py to run
# which is put in your path
set path = ( /home/tbalius/zzz.programs/msms $path )
#
set list = "FXa_reference_XLD.pdb" 
# or use `cat filename` to list your pdb codes here from a text file like pdblist_rat,
# to loop over each variable (pdb code) later
#set list = `cat $1`
#
# CHANGE THIS, according to where the magic is going to happen
 set mountdir = `pwd`
#
# loop over pdbnames e.g. 1DB1 or list
foreach pdbname ( $list )
#
echo " ${pdbname} "
#
# for each pdb makes a directory with its name
#set workdir = ${mountdir}/${pdbname}
set workdir = "${mountdir}/FXa"
#
## so you don't blow away stuff; continue means STOP here and continue with next pdb from list
if ( -s $workdir ) then
   echo "$workdir exits"
   continue
endif
#
mkdir -p ${workdir}
cd ${workdir}
#
cp ../$pdbname .
# the atom type definition is needed for msms which is sym-linked into the cwd
# this should be the path to msms for your computer system.  
  ln -s /home/tbalius/zzz.programs/msms/atmtypenumbers .
# carbs are disregarded as ligands! if it is: carbohydrate instead of nocarbohydrate
# renumber renumbers the residue number
# if file.
  python ${DOCKBASE}/proteins/pdb_breaker/be_blasti.py --pdbfile $pdbname  \
  nocarbohydrate original_numbers | tee  -a pdbinfo_using_biopython.log
# if code
#  python ${DOCKBASE}/proteins/pdb_breaker/be_blasti.py --pdbcode $pdbname \
#  nocarbohydrate original_numbers | tee -a pdbinfo_using_biopython.log
#
# error checking looks for receptor and ligand file which should be produced by be_blasti.py
  if !(-s rec.pdb) then
      echo "rec.pdb is not found"
  endif
#
  mv rec.pdb temp.pdb
  grep -v TER temp.pdb | grep -v END  > rec.pdb
#
  rm temp.pdb
# 
# be_blasti.py produces peptide which may be used as a ligand if no other ligand is produced
  if (-s lig.pdb) then
     sed -e "s/HETATM/ATOM  /g" lig.pdb > xtal-lig.pdb
  else if (-s pep.pdb) then ## if no ligand and peptide
     sed -e "s/HETATM/ATOM  /g" pep.pdb > xtal-lig.pdb
  else
     echo "Warning: No ligand or peptid."
  endif
end # system

1.b. You and run this for a pdbcode and the script will download it from the pdb and split it into a receptor (rec.pdb) and ligand (xtal-lig.pdb).

2. Generate spheres and grids:

2.a. this script will run blastermaster.py:

cat ~/FactorXa/dock_tutorial/0002.blastermaster.csh 
#!/bin/csh
# Written by Trent Balius, 2014.
# This script runs Ryan's blastermaster python masterscript for generating everything that dock needs, i.e. grids, spheres
# Run on sgehead as jobs are submitted to the queue

# list is same as in 001... script
set list = "FXa"
#set list = `cat $1`
#set list = `cat /nfs/work/users/tbalius/VDR/Enrichment/pdblist_all `

set mountdir = `pwd`
#set mountdir = "/nfs/work/users/tbalius/VDR/"

# loop over all pdb(s)
foreach pdbname ( $list )

echo "${pdbname}"

set workdir = ${mountdir}/${pdbname}

# checks that 001 ran successfully and produced the directory structure as expected
# if not stops with current pdb code and continues with next one in list
  if ! ( -s $workdir ) then
     echo "$workdir does not exit"
     continue
  endif

cd $workdir

#cat xtal-lig_ori.pdb | awk '{if ($1 == "ATOM" || $1 == "HETATM"){print $0}}' | sed -e "s/HETATM/ATOM  /g"  >  xtal-lig.pdb

# the following lines create a qsub script which submits blastermaster to the queue
# modify to your queuing system this is for PBS. 
cat <<EOF > qsub.csh
#!/bin/csh
#PBS -o stderr
#PBS -q glean
cd $workdir
$DOCKBASE/proteins/blastermaster/blastermaster.py --addhOptions=" -HIS -FLIPs "  -v
EOF

qsub qsub.csh

end # pdbname
# going to the next pdb
#
# this will produce two directories:
# 1) working - contains all input and output files that are generated; not needed afterwards but as a reference
# 2) dockfiles - contains everything that is needed to run dock (copied from working)
#    grids
#       trim.electrostatics.phi
#       vdw.vdw
#       vdw.bmp
#       ligand.desolv.heavy
#       ligand.desolv.hydrogen
#    spheres
#       matching_spheres.sph

2.b. the blastermaster script will generate spheres by calling sphgen, converting ligand into sphere, and processing them with other scripts.


The program sphgen will produce spheres that fill the nooks and crannies on the protein surface.

1FJS Receptor surface with all spheres from sphgen

The matching spheres are used during docking to orient the ligand into the binding pocket. The matching spheres are produced by converting the crystallographic ligand atoms into sphere and also using the spheres from sphgen.

1FJS Receptor surface with matching spheres

This image was made with chimera. To visualize the sphere there should be no sphere with a radius of zero. the recommendation is to change "0.00" to "0.70". Also you should remove the header from the file.

A final important set of spheres are the low dielectric spheres. These spheres are used during the PB calculation to force the binding site to have low dielectric.

1FJS Receptor surface with low dielectric spheres.

This image read shows the spheres in the rec.crg.pdb. This is the file on which qnifft runs the PB calculation.

2.c. The Script blastermaster.py will also produces grids.

#    grids
#       trim.electrostatics.phi
#       vdw.vdw
#       vdw.bmp
#       ligand.desolv.heavy
#       ligand.desolv.hydroge

3. Submit enrichment calculation 3.1 First you need a database. see the database generation section DOCK_3.7_2014/09/25_FXa_Tutorial#making_ligand_databases..

4. Process enrichment caluclations: Genereate Log-ajusted ROC curves and caluclate Log-ajusted AUC values.

5. Submit test virtural Screen

6. How to optmize protocol to get better results.

change sphere placement. modify receptor changes. modify sidechain rotomers.

pose visualization

Use chimera Veiwdock modual:

Tools/

Chimera Viewdock session for top 200 Frags.

This part of the tutorial uses the GIST development version of DOCK

Put your rec.pdb and xtal-lig.pdb in the same frame as you simulation.

This chimera script might help.

Rescoring with GIST dev version.

Three ligand files are needed for rescoring:

1. poses.mol2.gz

2. amsol_desov.txt.gz

3. vdw.txt.gz

rescoring with DOCK 3.7