DOCKovalent cysteine inhibitor design tutorial

Revision as of 01:02, 19 July 2018

This was written on April 4, 2018.

This tutorial is for designing linkers for a covalent inhibitor and supplements the work in preparation (Wan et al. 2018).


These files are in /mnt/nfs/home/xiaobo/UCSF_scripts/2018-4-3-covlanet_lysine_wiki-tutorial


Step 1. Custom Ligand and Library Generation

1/Custom Ligand / Library Generation

cd 1-Custom-Ligand-Library-Generation

library 1: search for acrylamides in the ZINC15 database (ask John to put your built library in ZINC15)
log in to http://zinc15.docking.org/patterns/home/ and search for acrylamide in the patterns
found 1 pattern: Acrylamide-Terminal, [CD1]=[CD2]-C(=O)-[NX3] (84576 purchasable)
File:    /nfs/home/xiaobo/UCSF_scripts/2018-7-17-covalent_cys_wiki-tutorial/1-Custom-Ligand-Library-Generation/acrylamide/library1-70587-ZINC15-acrylamide-library.smi
db2 files in  /mnt/nfs/ex9/work/xiaobo/new_covalent_lib/acrylamides/lib1

library 2: aldehyde-based-cyanoacrylamides
Search for aldehydes in ZINC15; the input file is only-single-aldehyde-aromatic-for-sale+bb.smi (145960)
one-step synthesis
       python aldehyde-to-cyanoacryl.py only-single-aldehyde-aromatic-for-sale+bb.smi
       python step3-remove_doubles.py reaction_nocorina_out.ism
File:
       /nfs/home/xiaobo/UCSF_scripts/2018-7-17-covalent_cys_wiki-tutorial/1-Custom-Ligand-Library-Generation/aldehyde-based-cyanoacrylamides/145956-aldehyde-based-cyanoacrylamides.smi
db2 files in /mnt/nfs/ex7/work/xiaobo/new_covalent_lib/2017-6-8-cyanoacrylamide

library 3: ~184,900 Enamine acids + Boc-diamine + acrylic acid library
filter3-selected_acids_2150.smi: the 2150 most common acids from Enamine
83-Boc_diamines.smi: the 83 most common Boc-diamines from Enamine
two-step synthesis
python step1-SN1-diamines-CO2H.py filter3-selected_acids_2150.smi 83-Boc_diamines.smi
python step2-reaction-acrylic-acid.py in.smi
python step3-remove_doubles.py in2.smi
final file: /nfs/home/xiaobo/UCSF_scripts/2018-7-17-covalent_cys_wiki-tutorial/1-Custom-Ligand-Library-Generation/acids-Boc-acrylic-acid/final-acids-Boc-acrylic-acid.smi    (184900)
db2 files in /mnt/nfs/ex7/work/xiaobo/2017-6-30-acids-Boc-acrylic-acid/acd1
             /mnt/nfs/ex7/work/xiaobo/2017-6-30-acids-Boc-acrylic-acid/acd2

library 4: ~145,508 sulfonyl chloride + Boc-diamine + acrylic acid library
filter4-1677.sulfonyl_chlorides.smi: the 1677 most common sulfonyl chlorides from Enamine
83-Boc_diamines.smi: the 83 most common Boc-diamines from Enamine
two-step synthesis
python SN1-diamines-CO2H.py filter4-1677.sulfonyl_chlorides.smi 83-Boc_diamines.smi
python step2-reaction-acrylic-acid.py in.smi
python step3-remove_doubles.py in2.smi
final file: /nfs/home/xiaobo/UCSF_scripts/2018-7-17-covalent_cys_wiki-tutorial/1-Custom-Ligand-Library-Generation/sulfonyl_chloride_Boc-acrylic-acid/final-sulfonyl_chloride_Boc-acrylic-acid.smi
db2 files in /mnt/nfs/ex7/work/xiaobo/2017-6-30-sulfonyl_chloride_Boc-acrylic-acid/suc1
             /mnt/nfs/ex7/work/xiaobo/2017-6-30-sulfonyl_chloride_Boc-acrylic-acid/suc2

For a single SMILES (single-smile-generation): addSiH3-to-dimethylamino-acrylamide
python addSiH3-to-dimethylamino-acrylamide.py input-ligand.smi
python step3-remove_doubles.py addSiH3.smi
final file: /nfs/home/xiaobo/UCSF_scripts/2018-7-17-covalent_cys_wiki-tutorial/1-Custom-Ligand-Library-Generation/single-smile-generation/no_doubles_out.ism
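
The source of step3-remove_doubles.py is not shown on this page, so the following is only a minimal sketch of what a duplicate-removal step could look like. It assumes each record is one line with the SMILES in the first column, and it compares SMILES as plain strings (the real script may canonicalize structures first). The function name and the example records are hypothetical.

```python
def remove_duplicates(lines):
    """Keep the first record for each SMILES string and drop later repeats."""
    seen = set()
    unique = []
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        smiles = fields[0]  # assumption: the SMILES is the first column
        if smiles not in seen:
            seen.add(smiles)
            unique.append(line.rstrip("\n"))
    return unique


if __name__ == "__main__":
    # Hypothetical records: SMILES followed by a ligand name.
    example = [
        "C=CC(=O)NC lig1",
        "C=CC(=O)NCC lig2",
        "C=CC(=O)NC lig3",  # same SMILES string as lig1
    ]
    for record in remove_duplicates(example):
        print(record)
```

String comparison only catches identical SMILES text; two different SMILES spellings of the same molecule would both be kept.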


The no_doubles_out.ism file is used to generate the db2 files for covalent docking

Log in to gimel:
setenv DOCKBASE /mnt/nfs/home/xiaobo/combine_docknormal_dock_covalent_3.7_and_tart/DOCK_from_githup_2016_5_27
/nfs/soft/tools/utils/qsub-slice/qsub-mr-meta -tc 50 --map-instance-script "/nfs/soft/tools/utils/qsub-slice/qsub-mr-map.sh" -s $BUILD_ENVIRONMENT -l 1 no_doubles_out.ism $DOCKBASE/ligand/generate/build_database_ligand.sh --no-db --no-solv --no-mol2 --single --covalent

Step 2 Protein preparation (different cysteine rotamers)

2/Protein preparation (different cysteine rotamers)

cd 2-Protein-preparation-different-cys-rotamers

Find the residue number of the modified cysteine in the PDB file:

echo "4iqy-A-AR6      A       104">>cys.list
bash step0_prepare_build_system.sh  4iqy-A-AR6

In the Chimera window, select all 3 cysteine rotamers and click OK. Reselect all the cysteine rotamers in the PDB structure and save them in PDB format as CYS-4iqy-A-AR6.pdb.

Then generate all 3 structure folders. The script below automatically calculates the steric clashes with nearby residues and selects the rotamers with no steric clashes. It also finds the atom in the compound that is nearest to the cysteine SG atom.

bash ../step1_run_build_system.sh 4iqy-A-AR6
results
4iqA-A-AR6     0 contacts
4iqB-A-AR6     0 contacts
4iqC-A-AR6     1 contacts
4iqA-A-AR6     O3'     16.951
4iqB-A-AR6     O3'     16.951
Each folder contains rec.pdb and xtal-lig.pdb
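
As a rough illustration of how the results above could be screened, the sketch below picks the folders reported with 0 contacts. It assumes the contact lines have the form "folder count contacts" as printed above; the function name is hypothetical and this is not part of step1_run_build_system.sh.

```python
def clash_free_rotamers(result_lines):
    """Return the folder names whose result line reports '0 contacts'."""
    selected = []
    for line in result_lines:
        fields = line.split()
        # Contact lines look like: <folder> <count> contacts
        if len(fields) == 3 and fields[2] == "contacts" and fields[1] == "0":
            selected.append(fields[0])
    return selected


results = """\
4iqA-A-AR6     0 contacts
4iqB-A-AR6     0 contacts
4iqC-A-AR6     1 contacts
4iqA-A-AR6     O3'     16.951
4iqB-A-AR6     O3'     16.951
"""

print(clash_free_rotamers(results.splitlines()))
# prints ['4iqA-A-AR6', '4iqB-A-AR6']
```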

For each folder:

bash step1_DOCKINV.blastermaster.sh 4iqA-A-AR6 6 1

The second argument is box_margin (6 here), which is defined from the center of the xtal-lig.pdb file. The third argument (1) selects covalent docking.
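
To make the box_margin idea concrete, here is a small sketch that computes a box around the ligand center. It assumes the center is the unweighted mean of the ATOM/HETATM coordinates in xtal-lig.pdb and that the box extends box_margin angstroms from the center along each axis; how the blastermaster script actually builds its box is not shown here. The helper names and the two example atoms are hypothetical.

```python
def ligand_box(pdb_lines, margin):
    """Return (center, lower_corner, upper_corner) of a cubic box.

    The center is the unweighted mean of all ATOM/HETATM coordinates;
    the box extends `margin` angstroms from the center along each axis.
    """
    coords = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Fixed-width PDB columns 31-54 hold x, y and z.
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    if not coords:
        raise ValueError("no ATOM/HETATM records found")
    center = tuple(sum(c[i] for c in coords) / len(coords) for i in range(3))
    lower = tuple(v - margin for v in center)
    upper = tuple(v + margin for v in center)
    return center, lower, upper


def hetatm(serial, name, x, y, z):
    """Format a minimal HETATM record (residue AR6, chain A) for the demo."""
    return f"HETATM{serial:>5} {name:<4} AR6 A   1    {x:8.3f}{y:8.3f}{z:8.3f}"


if __name__ == "__main__":
    # Hypothetical two-atom ligand, box margin of 6 as in the command above.
    lines = [hetatm(1, "C1", 0.0, 0.0, 0.0), hetatm(2, "O1", 2.0, 4.0, 6.0)]
    center, lower, upper = ligand_box(lines, 6)
    print("center:", center)
    print("box from", lower, "to", upper)
```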

Step 3 Modify the INDOCK parameters for saving multiple poses

cd 3-modify-the-INDOCK-parameters

Change the default parameters for covalent docking to the following values:

 bump_rigid                   1000000000000.0
 number_save                   100
 number_write                  100
 molecules_maximum             100000
 electrostatic_scale           1.0
 vdw_scale                     1.0
 bond_len                      1.77
 bond_ang1                     124.18
 bond_ang2                     120.84
 len_range                     0.0
 len_step                      0.1
 ang1_range                    20.0
 ang2_range                    20.0
 ang1_step                     5
 ang2_step                     5
 check_clashes                 no
 per_atom_scores               yes
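
Editing these values by hand for several structure folders is error-prone, so a small helper such as the sketch below may be convenient. It assumes each INDOCK parameter sits on its own line as "key, whitespace, value", as in the listing above; the function name and the starting values in the example are hypothetical.

```python
import re


def set_indock_params(text, updates):
    """Return INDOCK text with the values of the given keys replaced.

    Raises KeyError if a requested key does not occur in the text.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        match = re.match(r"^(\s*)(\S+)(\s+)(\S.*)$", line)
        if match and match.group(2) in remaining:
            indent, key, gap, _old_value = match.groups()
            # Keep the original indentation and column spacing.
            line = f"{indent}{key}{gap}{remaining.pop(key)}"
        out.append(line)
    if remaining:
        raise KeyError(f"keys not found in INDOCK: {sorted(remaining)}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical starting values; only the keys come from the list above.
    original = (
        " bump_rigid                   10.0\n"
        " number_save                   1\n"
        " check_clashes                 yes\n"
    )
    print(set_indock_params(original, {
        "bump_rigid": "1000000000000.0",
        "number_save": "100",
        "check_clashes": "no",
    }), end="")
```

Raising an error for a missing key is a deliberate choice in this sketch: a silently skipped parameter would otherwise go unnoticed until the docking run.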


Step 4 Run the covalent docking in gimel

cd 4-run-the-covalent-docking

The pipeline contains a pharmacophore filter (exclusion criterion: the ligand should form at least one hydrogen bond with the protein).
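
The exact criteria used by the pipeline's filter are not given on this page. As a hedged illustration only, the sketch below treats a pose as passing when any ligand N or O atom lies within 3.5 Å of any protein N or O atom; the distance-only test, the 3.5 Å cutoff, the function name, and the example coordinates are all assumptions.

```python
import math


def has_hydrogen_bond(ligand_atoms, protein_atoms, cutoff=3.5):
    """Return True if any ligand N/O is within `cutoff` angstroms of a protein N/O.

    Each atom is given as (element, (x, y, z)).
    """
    polar = {"N", "O"}
    for lig_element, lig_xyz in ligand_atoms:
        if lig_element not in polar:
            continue
        for prot_element, prot_xyz in protein_atoms:
            if prot_element in polar and math.dist(lig_xyz, prot_xyz) <= cutoff:
                return True
    return False


if __name__ == "__main__":
    # Hypothetical coordinates for one ligand oxygen and one backbone nitrogen.
    ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
    protein = [("N", (4.0, 0.0, 0.0))]
    print(has_hydrogen_bond(ligand, protein))
```

A stricter filter would also check the donor-hydrogen-acceptor angle; this sketch leaves that out.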

Prepare:

1) the modified INDOCK file INDOCK.bump1000000000000.pose1000.20.5.5
2) the gate residue file (define the covalently modified cysteine in this file)

Define the following variables in the file qsub_fix-pipeline-for-dock-and-filter.sh:

 scriptsdir=
 ligdir=

Input:

1) the list of the different structure folders (run.list, containing for example 4iqA-A-AR6)
2) the ligand library folder name (lib1)

 bash qsub_fix-pipeline-for-dock-and-filter.sh run.list lib1

Step 5 Analysis and combine the top1 pose from different structures

 cd 5-Analysis-and-combine-the-top1-poses-from-different-structures

After the covalent docking, analyze the docking results:

 bash step2-1-combine-check-job.sh lib1
      Input:
      1) the ligand library folder name (lib1)

Extract the docking poses (you can also use your own scripts to process your data):

       bash step2-3-rank-poses.sh lib1
       Input:
       1) the ligand library folder name (lib1)
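
If you prefer your own script for combining the top pose of each ligand across structures, the logic can be sketched as follows. This is not the code of step2-3-rank-poses.sh; it assumes the docking results have already been reduced to (structure, ligand, score) records and that a lower (more negative) DOCK score is better. The ligand names and scores are hypothetical.

```python
def best_pose_per_ligand(records):
    """Pick the best-scoring structure for each ligand and rank the ligands.

    `records` is an iterable of (structure, ligand, score); lower is better.
    Returns a list of (ligand, structure, score) sorted by score.
    """
    best = {}
    for structure, ligand, score in records:
        if ligand not in best or score < best[ligand][1]:
            best[ligand] = (structure, score)
    ranked = [(ligand, structure, score)
              for ligand, (structure, score) in best.items()]
    ranked.sort(key=lambda row: row[2])
    return ranked


if __name__ == "__main__":
    # Hypothetical ligand names and scores for two rotamer structures.
    records = [
        ("4iqA-A-AR6", "ligX", -35.2),
        ("4iqB-A-AR6", "ligX", -41.7),
        ("4iqA-A-AR6", "ligY", -38.0),
        ("4iqB-A-AR6", "ligY", -30.1),
    ]
    for ligand, structure, score in best_pose_per_ligand(records):
        print(ligand, structure, score)
```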

Step 6 Run the minimization and MM/GBSA rescoring

  cd 6-Run-the-minimization-and-MMGBSA-rescoring

First, extract each pose:

perl step1-split-poses.pl uniq.analysis.hqVA-M-ASF.dat.pdb


Check the protonation state of each linker after adding hydrogens with Chimera, then prepare the list containing the charge of each linker (default: 0).

 bash qsub_run_automatic_pipeline_for_amber_minimization.sh INDOCK.bump1000000000000.pose1000.20.5.5-C000032628502-1-5 0
 INDOCK.bump1000000000000.pose1000.20.5.5-C000032628502-1-5 is the folder for running the minimization; the last argument (0) is the charge.

After the minimization, run the AMBER MMGBSA rescoring:

 bash step10_fix_prolem_resubmit_MMPBSA_score.sh INDOCK.bump1000000000000.pose1000.20.5.5-xo4E-A-X44-X44-meta-xaaa-1-mini_end_GB

Extract the score for each linker:

 bash step6_resubmit.extract_GBscore.sh list

The list contains the folder names (for example INDOCK.bump1000000000000.pose1000.20.5.5-xo4E-A-X44-X44-meta-xaaa-1-mini_end_GB).

Step 7 Analyze the final pose by chimera

 cd 7-analyze-the-final-pose-by-chimera

First, sort the linkers according to the MMGBSA score:

 cat MMGBSA.list | sort -nk 2 >sort.MMGBSA.list
 1) extract the poses without the protein
 perl fix-step3_extract_best_score_combinepdb_after_minimize.pl sort.MMGBSA.list
 2) extract the poses with the protein
 perl fix-step4_extract_best_score_combinepdb_after_minimize_with_rec.pl sort.MMGBSA.list
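
For reference, the sorting step can also be done in Python. The sketch below assumes MMGBSA.list has two whitespace-separated columns, a name and a numeric score, and sorts ascending so the most negative score comes first, as sort -nk 2 does for such a file. The function name and the example entries are hypothetical.

```python
def sort_by_score(lines):
    """Sort 'name score' lines by the numeric second column, ascending."""
    def score(line):
        return float(line.split()[1])

    records = [line.rstrip("\n") for line in lines if line.strip()]
    return sorted(records, key=score)


if __name__ == "__main__":
    # Hypothetical linker names and MMGBSA scores.
    example = [
        "linkerA -25.3",
        "linkerB -41.8",
        "linkerC -9.6",
    ]
    for record in sort_by_score(example):
        print(record)
```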

Use Chimera to visualize these poses and select the final linkers (save to a PDB file):

 save the linker ViewDock state: P
 perl step0-filter_by_the_chimera.pl pdb  to extract the final poses

Step 8 Pose benchmark systems (8-pose-benchmark-systems)

From paper 1 (https://code.google.com/archive/p/covalentdock/downloads): 76 systems (not tested yet)

From the Schrödinger covalent datasets: 38 systems