DOCKovalent 3.7

Latest revision as of 14:36, 29 July 2019

DOCKovalent is the covalent docking version of DOCK. Originally implemented within DOCK 3.6, it was ported to DOCK 3.7 in March 2015. The most recent publication to cite for DOCKovalent is:

"Covalent docking of large libraries for the discovery of chemical probes" Nir London, Rand M Miller, Shyam Krishnan, Kenji Uchida, John J Irwin, Oliv Eidam, Lucie Gibold, Peter Cimermančič, Richard Bonnet, Brian K Shoichet & Jack Taunton. Nature Chemical Biology 10, 1066–1072 (2014) doi:10.1038/nchembio.1666

Below are instructions on various aspects of using this version of DOCKovalent. The running example is the docking of cyanoacrylamides (doubly activated Michael acceptors) to an active-site cysteine in RSK2 kinase (PDB ID: 4d9t).

General setup

  • Apply for a DOCK3.7 license:
    • go here: dock Academic License (http://dock.compbio.ucsf.edu/Online_Licensing/dock_license_application.html)
    • select version DOCK3.7
    • fill in the required information
    • be patient: a real person approves the license and also reads your e-mails.
    • download the tar.gz file and copy it to your desired location on the cluster.
    • tar -zxvf [file name].tar.gz
  • Once DOCK3.7.x is unpacked (or checked out) under /path/to/DOCK, set DOCKBASE:
    • setenv DOCKBASE /path/to/DOCK

The rest of the tutorial will assume DOCKBASE is set.
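
The setenv line above is csh syntax. For bash users, a minimal sketch of the equivalent setting plus a sanity check is given here. The helper name check_dockbase is hypothetical and is not part of the DOCK distribution; it only tests that the variable is set and points at a directory.

```shell
#!/usr/bin/env bash
# bash equivalent of the csh line "setenv DOCKBASE /path/to/DOCK":
#   export DOCKBASE=/path/to/DOCK

# check_dockbase (hypothetical helper, not shipped with DOCK):
# succeeds only if DOCKBASE is set and points at an existing directory.
check_dockbase() {
    if [ -z "${DOCKBASE:-}" ]; then
        echo "DOCKBASE is not set" >&2
        return 1
    fi
    if [ ! -d "$DOCKBASE" ]; then
        echo "DOCKBASE does not point to a directory: $DOCKBASE" >&2
        return 1
    fi
    echo "DOCKBASE=$DOCKBASE"
}
```

Calling check_dockbase at the top of a wrapper script gives an early, readable failure instead of a "command not found" error further down.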

  • You need an amsol binary. You can get amsol 7.1[1] and apply a patch (see amsol 7 patch) to make it compatible with our docking methods. Or, if you are in the lab, make a symbolic link to this amsol version:
 ln -s /nfs/home/londonir/code/git_trunk/DOCK/ligand/amsol/amsol7.1 .

Custom Ligand / Library Generation

To generate a single / few ligands:

  • make a directory to generate the db2 files:

mkdir lig_db2

  • make a smiles file containing the ligands you wish to dock. The covalent attachment point should be marked with a [SiH3] group, which is removed during ligand preparation. You can read more on how to generate such smiles for a specific subset and reaction scheme using the new ZINC api [link required].
  • For our example docking, this is the smiles of the crystal ligand with the covalent attachment point marked (lig.smi):

COC(=O)C(C#N)C([SiH3])c3c(c1ccc(C)cc1)c2c(N)ncnc2n3CCCO xtal-lig

cp lig.smi lig_db2

  • run the ligand generation script with the covalent flag (the flag must be the second argument)

cd lig_db2

$DOCKBASE/ligand/generate/build_smiles_ligand.sh lig.smi --covalent

The output of the script should be a lig.db2.gz file. You can step out of the db2 generation dir.

To generate a large library of ligands for covalent docking:

  • First get the smiles file of the library, where the covalent attachment point should be marked by an [SiH3] dummy atom.

Important - make sure the smiles explicitly specifies [SiH3] and not just [Si]. Eventually the [SiH3] is removed and replaced by the covalently attached atom from the protein side.
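
A quick pre-flight check for the [SiH3] requirement can be sketched as follows. The helper name check_sih3 is hypothetical; it relies only on standard grep and assumes one compound per line, so blank lines are reported as well.

```shell
#!/usr/bin/env bash
# check_sih3 (hypothetical helper): print, with line numbers, every line of a
# SMILES file that lacks a literal [SiH3] marker.
# Exit status is 0 only if every line carries the marker.
check_sih3() {
    # -F fixed string, -v invert the match, -n prefix line numbers.
    # grep succeeds when it finds at least one line WITHOUT the marker.
    if grep -vnF '[SiH3]' "$1"; then
        return 1
    fi
    return 0
}
```

For example, check_sih3 example.ism prints nothing when the library is ready for build_covalent_lib.csh, and lists the offending lines otherwise.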

  • This command sends the library preparation scripts to the cluster (run it on gimel). The first argument is the smiles file, the second is how many directories (chunks) you wish to make, and the third is a prefix name for the dirs.

$DOCKBASE/ligand/generate/build_covalent_lib.csh example.ism 100 prefix

  • After all compounds are done, combine the db2 gz files into one directory. If everything went well, you can later delete all intermediate files and keep just the gz_files/ dir.

$DOCKBASE/ligand/generate/combine_gz_files.sh

Currently available covalent libraries (At the Shoichet lab / cluster 2) [to be added]

Protein preparation

For covalent docking we need the structure (or a model) of a protein, a crystal ligand to define the binding site, and the identity of the residue to which we want to covalently dock.

  • Download 4d9t.pdb
  • Extract the protein (e.g. grep ^ATOM 4d9t.pdb > rec.pdb)
  • Extract the crystal ligand (e.g. grep "0JG A" 4d9t.pdb > xtal-lig.pdb)
  • Execute the protein preparation script while indicating the covalent attachment point, in this case CYS 436

$DOCKBASE/proteins/blastermaster/blastermaster.py --covalentResNum 436 --covalentResName CYS --covalentResAtoms HG
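
The two extraction steps above can be wrapped in one small function, sketched here. The name split_pdb is hypothetical. Unlike the plain grep shown in the list, this version also restricts the ligand match to ATOM/HETATM records, which is an added assumption meant to keep header lines that mention the ligand out of xtal-lig.pdb.

```shell
#!/usr/bin/env bash
# split_pdb (hypothetical helper): write rec.pdb and xtal-lig.pdb into the
# current directory, using the default file names expected by blastermaster.py.
#   $1 = PDB file
#   $2 = ligand residue name and chain exactly as they appear in the file,
#        e.g. "0JG A"
split_pdb() {
    # Protein: ATOM records only (this drops waters and other HETATM records).
    grep '^ATOM' "$1" > rec.pdb
    # Ligand: lines mentioning the ligand, limited to coordinate records.
    grep -F "$2" "$1" | grep -E '^(ATOM|HETATM)' > xtal-lig.pdb
}
```

For the running example this would be called as split_pdb 4d9t.pdb "0JG A" before running blastermaster.py.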

Note 1: the default receptor and ligand file names are rec.pdb and xtal-lig.pdb, but you can specify other names using -r and -l.

Note 2: if you want to dock to a Serine residue you likewise need to remove the HG proton, but for Lysine, for instance, you should remove all three protons: --covalentResNum 123 --covalentResName LYS --covalentResAtoms HZ1,HZ2,HZ3

The output of this script is an INDOCK file (see below), a "working" dir containing temporary files generated during the preparation, and a "dockfiles" dir that is required for the docking run (containing e.g. the scoring grids). One of the automatically generated files is dockfiles/matching_spheres.sph, which for a covalent run contains only three spheres, corresponding to the three atoms in the protein preceding the covalent attachment point (e.g. for CYS these are CA, CB, SG). Currently this file is generated automatically for CYS/SER/LYS. If you wish to dock to a different residue, edit this file and input the coordinates of the corresponding atoms.

INDOCK parameters

By default covalent docking is turned off (weird right?)

  • Edit the INDOCK file to turn it on by changing dockovalent no -> yes
  • chemical_matching should be set to "no" as a) there is no need for matching when only sampling around the covalent bond and b) the covalent code overrides the coloring code.

Description of covalent docking related parameters:

bond_len - the 'ideal' covalent bond length

len_range - what range around the ideal length should be sampled.

len_step - at what increments should the bond length be sampled.

So, e.g., for bond_len=1.8, len_range=0.1, len_step=0.05, the following lengths will be sampled: 1.7, 1.75, 1.8, 1.85, 1.9

Note that, due to steric hindrance of 1-3 interactions, DOCK will most of the time prefer the longest bond length possible. For this reason it is currently advised to use len_range=0, len_step=0.1 (to avoid division by zero) and just set the ideal length correctly. With better modeling of the covalent energy this might be fixed.
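
The sampling rule described above (ideal value, plus and minus the range, in increments of the step) can be sketched with a few lines of awk. This is only an illustration of the arithmetic, not DOCK's own code; the function name sample_values is hypothetical.

```shell
#!/usr/bin/env bash
# sample_values (hypothetical helper): list the values from ideal - range to
# ideal + range in increments of step, on a single line.
#   $1 = ideal value, $2 = range, $3 = step (must be non-zero)
sample_values() {
    awk -v ideal="$1" -v range="$2" -v step="$3" 'BEGIN {
        n = int(range / step + 0.5)   # number of steps on each side of the ideal
        out = ""
        for (i = -n; i <= n; i++)
            out = out (i == -n ? "" : " ") sprintf("%.2f", ideal + i * step)
        print out
    }'
}

sample_values 1.8 0.1 0.05   # prints: 1.70 1.75 1.80 1.85 1.90
sample_values 1.8 0 0.1      # prints: 1.80
```

The second call corresponds to the recommended setting: with len_range=0 only the ideal length is sampled, and the non-zero step merely keeps the division well defined.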

bond_ang1 - the 'ideal' bond angle for CB-SG-Lig1 (see the bond length explanation above for ang1_range and ang1_step)

bond_ang2 - the 'ideal' bond angle for SG-Lig1-Lig2 (see the bond length explanation above for ang2_range and ang2_step)

The default sampling parameters (bond length and two bond angles) are suitable for thioethers (e.g. when docking acrylamides to cysteines); they should be changed for different electrophile/nucleophile combinations. Some parameters you might want to use are listed below, but more work is being done in this vein as you read these lines.

  • It is also (highly) recommended to allow bumping in a covalent run, especially of the rigid part.

Set bump_rigid to 100.0 and bump_maximum to 100.0 - one can play around with these to balance speed and sampling.

  • Change check_clashes to no.
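
The INDOCK edits listed in this section can also be scripted. The sketch below assumes that each INDOCK parameter sits on its own line as a keyword followed by whitespace and a value; that layout is an assumption, so check it against your own INDOCK file first. The helper name set_indock_param is hypothetical.

```shell
#!/usr/bin/env bash
# set_indock_param (hypothetical helper): replace the value on the line whose
# first field equals the given keyword, keeping the original spacing.
#   $1 = INDOCK file, $2 = keyword, $3 = new value
# Returns 1 and leaves the file untouched if the keyword is not found.
set_indock_param() {
    local file="$1" key="$2" value="$3" tmp
    tmp="$(mktemp)"
    awk -v k="$key" -v v="$value" '
        $1 == k {
            # keep the keyword and the whitespace that follows it
            if (match($0, /^[ \t]*[^ \t]+[ \t]+/))
                print substr($0, 1, RLENGTH) v
            else
                print k " " v
            found = 1
            next
        }
        { print }
        END { exit(found ? 0 : 1) }
    ' "$file" > "$tmp" || {
        rm -f "$tmp"
        echo "parameter not found: $key" >&2
        return 1
    }
    cat "$tmp" > "$file"   # overwrite in place, keeping the file permissions
    rm -f "$tmp"
}
```

With this in place, the settings recommended above would be applied as, for example, set_indock_param INDOCK dockovalent yes, set_indock_param INDOCK chemical_matching no, set_indock_param INDOCK bump_rigid 100.0, set_indock_param INDOCK bump_maximum 100.0 and set_indock_param INDOCK check_clashes no.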

Sending a run

  • First set up the directory structure for docking:
 $DOCKBASE/docking/setup/setup_db2.csh /path/to/ligand/lig_db2/ 

tip: use FULL path to the db files (and not ../../etc)

For each .db2.gz file this will create a separate running dir
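
Since the setup script should be given a full path, a relative directory can be resolved first. This is a minimal sketch using only cd and pwd; the helper name abs_path is hypothetical.

```shell
#!/usr/bin/env bash
# abs_path (hypothetical helper): print the absolute path of an existing
# directory; fails if the directory does not exist.
abs_path() {
    # The subshell keeps the caller's working directory unchanged.
    (cd "$1" && pwd)
}
```

For example, the setup call could then be written as $DOCKBASE/docking/setup/setup_db2.csh "$(abs_path ../lig_db2)/" from inside a run directory.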

  • Setting up directory structure for docking of pre-computed ligand libraries (here we use alkylhalides as an example):
 mkdir run.alkylhalides
 cd run.alkylhalides/
 cp ../INDOCK .
 ln -s ../dockfiles/ .
 $DOCKBASE/docking/setup/setup_db2.csh path/to/ligand/gz_files/ 

Paths to commercially available pre-computed ligand libraries:

Alkyl-halides:
/mnt/nfs/work/londonir/CovalentLibs/alkyl-halides/fragments/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/alkyl-halides/lead-like/gz_files/
Unsaturated carbonyls:
/mnt/nfs/work/londonir/CovalentLibs/alpha-sub-acrylate-esters/fragments/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/alpha-sub-acrylate-esters/lead-like/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/beta-sub-acrylate-esters/fragments/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/beta-sub-acrylate-esters/lead-like/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/ketone-based-enones/fragments/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/ketone-based-enones/lead-like/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/unsubstituted-acrylamides/fragments/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/unsubstituted-acrylamides/lead-like/gz_files/
Heterocyclic nitriles:
/mnt/nfs/work/londonir/CovalentLibs/heterocyclic-nitriles/fragments/gz_files/
/mnt/nfs/work/londonir/CovalentLibs/heterocyclic-nitriles/lead-like/gz_files/


  • Submit the docking job:

- locally:

 cd lig 
 $DOCKBASE/docking/DOCK/bin/dock64 

- on cluster2:

 ssh to cluster2
 cd run.alkylhalides
 $DOCKBASE/docking/submit/submit.csh 

Analysis and interpretation of the results

cd ..

  • Combine the docking results and make a scores file:

$DOCKBASE/analysis/extract_all.py

  • Extract the docked poses:

$DOCKBASE/analysis/getposes.py

Congratulations, you've now successfully docked covalently. A couple of pointers on selecting compounds for testing:

  • Because the scoring function ignores the covalent bond, scores tend to be higher than non-covalent docking scores, and are even positive at times. For example, covalently docking just a methyl onto a cysteine, without it clashing with anything, gives a VDW score of ~+10. So positive VDW scores should not deter you from choosing what looks by eye like a good pose.
  • Different electrophiles have different inherent reactivity. This is not taken into account in any way during the docking. The docked library should be matched to the application you are interested in. If you are looking for a non-toxic compound that may be active in cells, you might consider unsubstituted acrylamides, which are considered mild. If, on the other hand, you are looking for something that can label your protein in vitro for crystallization studies, bromoacetamide is very reactive. Most of the covalent docking libraries were designed with fairly mild electrophiles, but keep this consideration in mind when selecting which library to dock and which compounds to test.

Notes

[1] Get amsol 7.1: http://comp.chem.umn.edu/amsol/
