DOCK 3.7 2014/09/25 FXa Tutorial
Written by Trent E Balius and Crystal Nguyen on 2014/09/25.
Note that this script is writen in hope that it will help others run dock 3.7.
Note that DOCK 3.7 is primarily a virtual screening tool is is meant to be run on a Linux cluster. The focus of this tutorial is to perform a retrospective enrichment calculations and a test virtual screen.
This part of the tutorial uses the release version of DOCK3.7-beta.1.0.1.
see the following for requirements: Install_DOCK_3.7
You must use a version of python (v2.7) that has the follow:
- numpy
- scipy
- matplotlib
- mysql-python
- biopython
If you are using a virtual enviorment for python do something like this:
source /home/tbalius/zzz.virtualenv/virtualenv-1.9.1/myVE/bin/activate.csh
to leave use the deactivate command:
deactivate
making ligand databases.
Get smiles:
for factor Xa it is part of the DUDE database:
wget http://dude.docking.org/targets/fa10/actives_final.ism wget http://dude.docking.org/targets/fa10/decoys_final.ism
We can download the isomorphic smiles from the dude webpage.
here is a webserver to gerenate decoys.
http://dude.docking.org/generate
MORE TO COME HERE ON DATABASE PREP...
Here is what I think is what needs to be done.
mkdir ligands bash $DOCKBASE/ligand/generate/build_smiles_ligand.sh ../actives_final.ism python $DOCKBASE/ligand/finish/db2end-makedata.py ligands
mkdir decoys cd decoys bash $DOCKBASE/ligand/generate/build_smiles_ligand.sh ../decoys_final.ism python $DOCKBASE/ligand/finish/db2end-makedata.py decoys
DOCK Distribution structure
Here we discuss the directory structure of the distribution, what the different directory contain, and information about the scripts.
Here is the structure:
ls -l ~/DOCK-3.7-beta1.0.1/ total 27 -rw-r--r-- 1 user group 2737 Mar 28 08:43 README.md drwxr-xr-x 2 user group 31 Mar 28 08:43 analysis drwxr-xr-x 2 user group 28 Mar 28 08:43 common drwxr-xr-x 6 user group 7 Mar 28 08:43 docking drwxr-xr-x 3 user group 3 Mar 28 08:43 install drwxr-xr-x 8 user group 9 Mar 28 08:43 ligand drwxr-xr-x 19 user group 19 Mar 28 08:43 proteins drwxr-xr-x 3 user group 15 Mar 28 08:43 test
docking/DOCK contains the dock source code and the the dock binary. docking/submit contains submission scripts to submit jobs to the queue. ligand contains the ligand preparation steps. (this has dependencies not include in the distribution) proteins contains scripts and programs for protein preparations. test contains test to make sure the package is working as intended. Running the tests can help dignose errors and tell you what might be missing.
Receptor Preparation
1. Splite your pdb into a rec.pdb and xtal-lig.pdb.
1.a you can do this by running this script:
cat ~/FactorXa/dock_tutorial/0001.be_balsti_py.csh #!/bin/csh # # Written by Trent Balius, 2014. # this script calls be_blasti.py which creates a receptor and ligand file from a pdb code or file. # # msms is a molecular surface generation program needed for be_blasti.py to run # which is put in your path set path = ( /home/tbalius/zzz.programs/msms $path ) # set list = "FXa_reference_XLD.pdb" # or use `cat filename` to list your pdb codes here from a text file like pdblist_rat, # to loop over each variable (pdb code) later #set list = `cat $1` # # CHANGE THIS, according to where the magic is going to happen set mountdir = `pwd` # # loop over pdbnames e.g. 1DB1 or list foreach pdbname ( $list ) # echo " ${pdbname} " # # for each pdb makes a directory with its name #set workdir = ${mountdir}/${pdbname} set workdir = "${mountdir}/FXa" # ## so you don't blow away stuff; continue means STOP here and continue with next pdb from list if ( -s $workdir ) then echo "$workdir exits" continue endif # mkdir -p ${workdir} cd ${workdir} # cp ../$pdbname . # the atom type definition is needed for msms which is sym-linked into the cwd # this should be the path to msms for your computer system. ln -s /home/tbalius/zzz.programs/msms/atmtypenumbers . # carbs are disregarded as ligands! if it is: carbohydrate instead of nocarbohydrate # renumber renumbers the residue number # if file. python ${DOCKBASE}/proteins/pdb_breaker/be_blasti.py --pdbfile $pdbname \ nocarbohydrate original_numbers | tee -a pdbinfo_using_biopython.log # if code # python ${DOCKBASE}/proteins/pdb_breaker/be_blasti.py --pdbcode $pdbname \ # nocarbohydrate original_numbers | tee -a pdbinfo_using_biopython.log # # error checking looks for receptor and ligand file which should be produced by be_blasti.py if !(-s rec.pdb) then echo "rec.pdb is not found" endif # mv rec.pdb temp.pdb grep -v TER temp.pdb | grep -v END > rec.pdb # rm temp.pdb # # be_blasti.py produces peptide which may be used as a ligand if no other ligand is produced if (-s lig.pdb) then sed -e "s/HETATM/ATOM /g" lig.pdb > xtal-lig.pdb else if (-s pep.pdb) then ## if no ligand and peptide sed -e "s/HETATM/ATOM /g" pep.pdb > xtal-lig.pdb else echo "Warning: No ligand or peptid." endif end # system
1.b. You and run this for a pdbcode and the script will download it from the pdb and split it into a receptor (rec.pdb) and ligand (xtal-lig.pdb).
2. Generate spheres and grids:
2.a. this script will run blastermaster.py:
cat ~/FactorXa/dock_tutorial/0002.blastermaster.csh #!/bin/csh # Written by Trent Balius, 2014. # This script runs Ryan's blastermaster python masterscript for generating everything that dock needs, i.e. grids, spheres # Run on sgehead as jobs are submitted to the queue # list is same as in 001... script set list = "FXa" #set list = `cat $1` #set list = `cat /nfs/work/users/tbalius/VDR/Enrichment/pdblist_all ` set mountdir = `pwd` #set mountdir = "/nfs/work/users/tbalius/VDR/" # loop over all pdb(s) foreach pdbname ( $list ) echo "${pdbname}" set workdir = ${mountdir}/${pdbname} # checks that 001 ran successfully and produced the directory structure as expected # if not stops with current pdb code and continues with next one in list if ! ( -s $workdir ) then echo "$workdir does not exit" continue endif cd $workdir #cat xtal-lig_ori.pdb | awk '{if ($1 == "ATOM" || $1 == "HETATM"){print $0}}' | sed -e "s/HETATM/ATOM /g" > xtal-lig.pdb # the following lines create a qsub script which submits blastermaster to the queue # modify to your queuing system this is for PBS. cat <<EOF > qsub.csh #!/bin/csh #PBS -o stderr #PBS -q glean cd $workdir $DOCKBASE/proteins/blastermaster/blastermaster.py --addhOptions=" -HIS -FLIPs " -v EOF qsub qsub.csh end # pdbname # going to the next pdb # # this will produce two directories: # 1) working - contains all input and output files that are generated; not needed afterwards but as a reference # 2) dockfiles - contains everything that is needed to run dock (copied from working) # grids # trim.electrostatics.phi # vdw.vdw # vdw.bmp # ligand.desolv.heavy # ligand.desolv.hydrogen # spheres # matching_spheres.sph
2.b. the blastermaster script will generate spheres by calling sphgen, converting ligand into sphere, and processing them with other scripts.
The program sphgen will produce spheres that fill the nooks and crannies on the protein surface.
The matching spheres are used during docking to orient the ligand into the binding pocket. The matching spheres are produced by converting the crystallographic ligand atoms into sphere and also using the spheres from sphgen.
This image was made with chimera. To visualize the sphere there should be no sphere with a radius of zero. the recommendation is to change "0.00" to "0.70". Also you should remove the header from the file.
A final important set of spheres are the low dielectric spheres. These spheres are used during the PB calculation to force the binding site to have low dielectric.
This image read shows the spheres in the rec.crg.pdb. This is the file on which qnifft runs the PB calculation.
2.c. The Script blastermaster.py will also produces grids.
# grids # trim.electrostatics.phi # vdw.vdw # vdw.bmp # ligand.desolv.heavy # ligand.desolv.hydroge
3. Submit enrichment calculation 3.1 First you need a database. see the database generation section DOCK_3.7_2014/09/25_FXa_Tutorial#making_ligand_databases..
4. Process enrichment caluclations: Genereate Log-ajusted ROC curves and caluclate Log-ajusted AUC values.
5. Submit test virtural Screen
6. How to optmize protocol to get better results.
change sphere placement. modify receptor changes. modify sidechain rotomers.
pose visualization
Use chimera Veiwdock modual:
Tools/
This part of the tutorial uses the GIST development version of DOCK
Put your rec.pdb and xtal-lig.pdb in the same frame as you simulation.
This chimera script might help.
Rescoring with GIST dev version.
Three ligand files are needed for rescoring:
1. poses.mol2.gz
2. amsol_desov.txt.gz
3. vdw.txt.gz