Protein Target Preparation Updated

Running Blastermaster

For a default blastermaster run, you need a directory containing these files:

  rec.pdb
  xtal-lig.pdb

Then in that directory, run the command:

  $DOCKBASE/proteins/blastermaster/blastermaster.py --addhOptions=" -HIS -FLIPs " -v
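
Before launching the command above, it can help to confirm that both input files are present. The following is a minimal sketch; the function name `missing_inputs` is hypothetical and not part of blastermaster.

```python
from pathlib import Path

# The two files the default blastermaster run expects in the working directory.
REQUIRED_INPUTS = ("rec.pdb", "xtal-lig.pdb")


def missing_inputs(directory="."):
    """Return the required input files that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_INPUTS if not (base / name).is_file()]
```

Calling `missing_inputs(".")` in the preparation directory returns an empty list when both files are in place.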

For using thin spheres, see this wiki:

  http://wiki.docking.org/index.php/Using_thin_spheres_in_DOCK3.7

For tarting residues (increasing or decreasing the charges on specific atoms), see this wiki:

  http://wiki.docking.org/index.php/DOCK_3.7_tart

Checking Your Protein Preparation

written by Reed Stein, 4/3/2019

Electrostatics

The electrostatics grid used in docking is called "trim.electrostatics.phi". It contains the electrostatic potentials (in kT/e) in and around your protein structure, calculated by solving the Poisson-Boltzmann equation with the program QNIFFT. The grid is trimmed to fit the DOCK box (the file called "box" in the working directory), which is overlaid onto your binding site. The input file for QNIFFT is "qnifft.parm". It reads in "receptor.crg.lowdielectric.pdb", which contains your protein and the low dielectric spheres, together with the charge file "amb.crg.oxt" and the radius file "vdw.siz". The spheres in your "receptor.crg.lowdielectric.pdb" file should look like this:

  ATOM   9008  C   SPH  9008      87.491 136.887 124.980
  TER
  ATOM   9009  C   SPH  9009      87.900 138.214 123.837
  TER
  ATOM   9010  C   SPH  9010      88.222 138.764 124.234
  TER
  ATOM   9011  C   SPH  9011      88.080 138.630 124.390
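
To confirm that the spheres made it into the file, the SPH records can be pulled out with a few lines of Python. This is a sketch that assumes the whitespace-separated layout of the example lines above (ATOM, serial, atom name, residue name, residue number, x, y, z); the function name `read_spheres` is hypothetical.

```python
def read_spheres(pdb_lines):
    """Return (atom_name, x, y, z) for every SPH pseudo-atom in the given lines."""
    spheres = []
    for line in pdb_lines:
        fields = line.split()
        # Assumed layout: ATOM serial name resname resnum x y z
        if len(fields) >= 8 and fields[0] == "ATOM" and fields[3] == "SPH":
            x, y, z = (float(v) for v in fields[5:8])
            spheres.append((fields[2], x, y, z))
    return spheres
```

For example, `len(read_spheres(open("receptor.crg.lowdielectric.pdb")))` gives the number of spheres in the file.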

For the QNIFFT calculation, the dielectric of the protein is set to 2, while the dielectric of anything outside the protein is 80, representing water. To check whether the atoms in your protein have been assigned the correct radii/charges after running QNIFFT, open the "qnifft.atm" output file. An example line from this file looks like this:

  ATOM      1  N   MET     1      84.419 139.350 124.664  1.65 -0.5200         N

where you have "ATOM", atom number, atom name, residue name, residue number, x coordinate, y coordinate, z coordinate, atomic radius, and atomic charge. The radius and charge values are taken from the "vdw.siz" and "amb.crg.oxt" files.
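
The check described above can be partly automated. The sketch below parses "qnifft.atm" lines using the whitespace-separated field order of the example line, then reports the net charge and any atoms with a radius of zero. Treating a zero radius as a sign of a missing "vdw.siz" entry is an assumption, and the function names are hypothetical.

```python
def parse_atm_line(line):
    """Split one ATOM line of qnifft.atm into named fields (layout assumed from the example)."""
    fields = line.split()
    return {
        "serial": int(fields[1]),
        "atom": fields[2],
        "residue": fields[3],
        "resnum": int(fields[4]),
        "xyz": tuple(float(v) for v in fields[5:8]),
        "radius": float(fields[8]),
        "charge": float(fields[9]),
    }


def summarize_atm(lines):
    """Return (net charge, atoms with zero radius) for the ATOM lines given."""
    atoms = [parse_atm_line(l) for l in lines if l.startswith("ATOM")]
    net_charge = sum(a["charge"] for a in atoms)
    # Assumption: a radius of 0.0 suggests the atom was not matched in vdw.siz.
    zero_radius = [a for a in atoms if a["radius"] == 0.0]
    return net_charge, zero_radius
```

A net charge far from an integer, or a non-empty zero-radius list, is a hint to look more closely at the charge and radius files.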

If you would like to manually run QNIFFT, run the following commands:

  $DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm
  $DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi

The first command generates the electrostatic potentials for the full protein and writes them to "qnifft.electrostatics.phi". The second command uses the "box" file to trim "qnifft.electrostatics.phi" down to the binding site box. The trimmed output is written to "trim.electrostatics.phi".

To visualize your low dielectric sphere setup, open "receptor.crg.lowdielectric.pdb" in Chimera. Select all "SPH" residues and display/represent as spheres. Change the van der Waals radii of these spheres to the van der Waals "SPH" radius found in the "vdw.siz" file in your working directory. This line in the "vdw.siz" file should look like this:

   c     sph   1.90

To do this on the command line in Chimera, run the following commands:

   sel #0:SPH
   display sel
   represent sph sel
   vdwdefine 1.9 sel

The default radius is 1.90, but it can be changed when scanning low dielectric sphere radii. To do this, change the radius in the "vdw.siz" file and then run QNIFFT again. See the tutorial on Parameter Scanning:

   http://wiki.docking.org/index.php/How_to_do_parameter_scanning
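
For a radius scan, the edit to "vdw.siz" can be scripted. The sketch below assumes the sphere entry is a line with three whitespace-separated fields (atom name "c", residue name "sph", radius), as in the example line shown earlier. The function name `set_sph_radius` is hypothetical.

```python
def set_sph_radius(siz_text, new_radius):
    """Return the vdw.siz text with the radius on the 'c sph' line replaced."""
    out = []
    changed = False
    for line in siz_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0].lower() == "c" and fields[1].lower() == "sph":
            # Keep the original spacing and swap only the radius field.
            prefix = line[: line.rfind(fields[2])]
            line = f"{prefix}{new_radius:.2f}"
            changed = True
        out.append(line)
    if not changed:
        raise ValueError("no 'c sph' line found in vdw.siz text")
    return "\n".join(out) + "\n"
```

Write the returned text back to "vdw.siz" (keeping a copy of the original), then rerun QNIFFT as described above.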

The actual electrostatic grids can be visualized by converting the "qnifft.electrostatics.phi" or "trim.electrostatics.phi" files into DX files for opening in Chimera. See the following tutorial for visualizing electrostatics grids:

   http://wiki.docking.org/index.php/Visualize_docking_grids

If you are happy with the way your spheres look, you can continue on to docking with them.
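
To build intuition for why the low dielectric region matters, compare the potential of a single point charge in a uniform medium with dielectric 2 and with dielectric 80. This is only a Coulomb's law illustration in kT/e; QNIFFT solves the Poisson-Boltzmann equation on a grid with a spatially varying dielectric, so its values will differ. The function name and the temperature of 298.15 K are assumptions for this sketch.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K


def coulomb_potential_kT_per_e(charge_e, distance_angstrom, dielectric, temperature=298.15):
    """Potential of a point charge in a uniform dielectric, in units of kT/e."""
    r_m = distance_angstrom * 1e-10
    phi_volts = charge_e * E_CHARGE / (4 * math.pi * EPS0 * dielectric * r_m)
    # Dividing by kT/e (in volts) converts to the kT/e units used by the grid.
    return phi_volts * E_CHARGE / (K_B * temperature)


inside = coulomb_potential_kT_per_e(1.0, 5.0, dielectric=2.0)
outside = coulomb_potential_kT_per_e(1.0, 5.0, dielectric=80.0)
print(f"ratio of potentials (dielectric 2 vs 80): {inside / outside:.1f}")
```

In this simplified picture the same charge produces a potential 40 times larger in the low dielectric medium, which is why the placement and radius of the low dielectric spheres can noticeably change the electrostatics grid.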

Ligand Desolvation

Heavy (radius = 1.8) and hydrogen (radius = 1.0) ligand desolvation grids are generated in your working directory in "heavy/" and "hydrogen/", respectively. The input file for these two separate calculations is "INSEV", which looks like this:

   rec.crg.lds.pdb  ### receptor input file
   ligand.desolv.heavy ### grid you want to generate
   1.60,1.65,1.90,1.90,1.90,1.00  ## radii for O, N, C, S, P, X (other atom type)
   1.4 ### probe radius
   2 ### grid resolution
   box ### box file - determines the extent of grids to be calculated
   1.8 ### Born radius of atom - 1.8 for heavy, 1.0 for hydrogen

By default, the "rec.crg.lds.pdb" file does not contain any ligand desolvation spheres. If you do include them, e.g. when parameter scanning, they are added with the atom name "X" (rather than "C", as in the low dielectric spheres), as shown below:

   ATOM   9008  X   SPH  9008      87.491 136.887 124.980
   TER
   ATOM   9009  X   SPH  9009      87.900 138.214 123.837
   TER
   ATOM   9010  X   SPH  9010      88.222 138.764 124.234
   TER
   ATOM   9011  X   SPH  9011      88.080 138.630 124.390
   

The "X" radius in the "INSEV" file can be changed so that different ligand desolvation grids with different sphere radii can be generated. To do this, change the radius in the "INSEV" files for both hydrogen and heavy ligand desolvation grids, then run Solvmap for both. These spheres can be visualized (same as above with low dielectric spheres) by opening the "rec.crg.lds.pdb" file in Chimera, selecting all "SPH" residues, representing them as spheres and setting the vdW radius to the value that corresponds to "X" in the INSEV file.
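
Changing the "X" radius can be scripted when scanning several values. The sketch below assumes the radii sit on the third line of "INSEV" as six comma-separated values in the order O, N, C, S, P, X, optionally followed by a comment as in the listing above. The function name `set_x_radius` is hypothetical.

```python
def set_x_radius(insev_text, new_radius):
    """Return INSEV text with the sixth ("X") radius on line 3 replaced."""
    lines = insev_text.splitlines()
    # Separate the comma-separated values from any trailing annotation.
    parts = lines[2].split(None, 1)
    values = parts[0].split(",")
    if len(values) != 6:
        raise ValueError("expected six radii (O, N, C, S, P, X) on line 3")
    values[5] = f"{new_radius:.2f}"
    new_line = ",".join(values)
    if len(parts) > 1:
        new_line += "  " + parts[1]
    lines[2] = new_line
    return "\n".join(lines) + "\n"
```

Apply the same change to the "INSEV" files for both the heavy and the hydrogen grids before rerunning Solvmap.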


Solvmap needs to be run twice, once for the heavy and once for the hydrogen ligand desolvation grid. Each run needs the "INSEV" input file along with the "rec.crg.lds.pdb" and "box" files it refers to. Then run the command:

   $DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
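
The two Solvmap runs can be wrapped in a short script. This sketch assumes that "heavy/" and "hydrogen/" each already contain their own "INSEV" plus the files it names, and that Solvmap is run from inside each directory; the function names are hypothetical.

```python
import os
import subprocess
from pathlib import Path


def solvmap_jobs(workdir, dockbase):
    """Return (run directory, command) pairs for the heavy and hydrogen grids."""
    exe = Path(dockbase) / "proteins" / "solvmap" / "bin" / "solvmap"
    return [(Path(workdir) / sub, [str(exe)]) for sub in ("heavy", "hydrogen")]


def run_solvmap(workdir=".", dockbase=None):
    """Run Solvmap in both subdirectories, writing solvmap.log in each."""
    dockbase = dockbase or os.environ["DOCKBASE"]
    for cwd, cmd in solvmap_jobs(workdir, dockbase):
        with open(cwd / "solvmap.log", "w") as log:
            # stdout and stderr both go to the log, like ">& solvmap.log".
            subprocess.run(cmd, cwd=cwd, stdout=log, stderr=subprocess.STDOUT, check=True)
```

With `check=True`, a failed heavy-atom run stops the script before the hydrogen run starts, so the log of the failing step is the last one written.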


More advanced methods for altering your spheres, as well as for increasing or decreasing the charges on specific atoms, can be found in these tutorials:

   http://wiki.docking.org/index.php/Using_thin_spheres_in_DOCK3.7
   http://wiki.docking.org/index.php/DOCK_3.7_tart


How Blastermaster works

written by Reed Stein 8/28/2019

This is a simplified explanation of what happens during blastermaster.

REDUCE is run on your rec.pdb to protonate the receptor:

   $DOCKBASE/proteins/Reduce/reduce -db reduce_wwPDB_het_dict.txt -HIS -FLIPs rec.pdb > rec.crg.pdb.fullh

Nonpolar hydrogens are removed (we are using a United Atom AMBER force field) and histidines are renamed according to protonation state. If your protein has disulfide bonds, these CYS residues should be renamed to CYX as well. rec.crg.pdb.fullh is renamed to:

   rec.crg.pdb
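
As an illustration of the histidine renaming step, the sketch below picks a residue name from the ring hydrogens present, following the usual AMBER convention (HID: proton on ND1, HIE: proton on NE2, HIP: both). This is not blastermaster's own code, and the function name is hypothetical.

```python
def amber_histidine_name(hydrogen_names):
    """Return HID, HIE or HIP based on which ring N-H hydrogens are present."""
    names = set(hydrogen_names)
    has_delta = "HD1" in names    # hydrogen on ND1
    has_epsilon = "HE2" in names  # hydrogen on NE2
    if has_delta and has_epsilon:
        return "HIP"
    if has_delta:
        return "HID"
    if has_epsilon:
        return "HIE"
    raise ValueError("histidine has neither HD1 nor HE2")
```

For example, a histidine that REDUCE protonated only on NE2 would be named HIE under this convention.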

A list of binding site residues (based on your xtal-lig.pdb position) is determined using the filt program:

   $DOCKBASE/proteins/filt/bin/filt < $DOCKBASE/proteins/defaults/filt.params > filter.log

A file that lists the binding site residues is generated:

   rec.site.dms
   

This file and a copy of rec.crg.pdb (rec.crg.pdb.dms) are used as input for generating the molecular surface of the binding site:

   $DOCKBASE/proteins/dms/bin/dms rec.crg.pdb.dms -a -d 1.0 -i rec.site.dms -g dms.log -p -n -o rec.ms

Note that the "radii" file is used to determine the radii of atoms in your protein for the DMS calculation. If some of your atoms do not have radii in that file, they need to be manually added.

The molecular surface file, rec.ms, is used as an input for SPHGEN:

   $DOCKBASE/proteins/sphgen/bin/sphgen


SPHGEN generates a file called "all_spheres.sph", which contains clusters of spheres on the surface of the receptor.

If the thin spheres flag is used, an alternative molecular surface file, "rec.ts.ms", is used to generate thin spheres:

   $DOCKBASE/proteins/thinspheres/thin_spheres.py -i rec.ts.ms -o low_die_thinspheres.sph -d 1.000000 -s 1.000000

As the command above shows, the default distance from the surface (-d) and the default radius (-s) for thin spheres are both 1.0 Angstrom.

Both "electrostatic thin spheres" and "ligand desolvation spheres" are generated from this "low_die_thinspheres.sph" file.

To generate matching spheres, the xtal-lig.pdb is converted to spheres:

    $DOCKBASE/proteins/pdbtosph/bin/pdbtosph xtal-lig.pdb xtal-lig.match.sph

Low dielectric spheres (when the thin sphere flag is not used) are generated using "xtal-lig.match.sph" and the output of SPHGEN, "all_spheres.sph":

    $DOCKBASE/proteins/makespheres1/makespheres1.cli.pl xtal-lig.match.sph all_spheres.sph rec.crg.pdb lowdielectric.sph {MINIMUM NUMBER OF SPHERES TO GENERATE}

More matching spheres (in addition to those from the ligand, "xtal-lig.match.sph") are taken from the SPHGEN output, "all_spheres.sph":

    $DOCKBASE/proteins/makespheres3/makespheres3.cli.pl 1.5 0.8 45 xtal-lig.match.sph all_spheres.sph rec.crg.pdb matching_spheres.sph

This gives you your matching spheres file, "matching_spheres.sph", which should include 45 matching spheres.
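
The sphere count can be checked without opening the file by hand. The sketch below assumes that "matching_spheres.sph" uses cluster header lines of the form "cluster N number of spheres in cluster M"; that header format is an assumption not stated on this page, and the function name is hypothetical.

```python
import re

# Assumed header format: "cluster     1   number of spheres in cluster    45"
CLUSTER_HEADER = re.compile(r"cluster\s+(\d+)\s+number of spheres in cluster\s+(\d+)")


def spheres_per_cluster(sph_lines):
    """Return {cluster number: declared sphere count} from the header lines."""
    counts = {}
    for line in sph_lines:
        match = CLUSTER_HEADER.search(line)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
    return counts
```

If the header format holds, `spheres_per_cluster(open("matching_spheres.sph"))` should report 45 spheres for the default run.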

The DOCK box is generated using the xtal-lig matching spheres and the receptor, as well as a distance cutoff:

    $DOCKBASE/proteins/makebox/makebox.smallokay.pl xtal-lig.match.sph rec.crg.pdb box 10.0

If the thin spheres flag is not used, then the input for the QNIFFT electrostatics calculation uses the "lowdielectric.sph" file:

    cat rec.crg.pdb lowdielectric.sph.pdb > receptor.crg.lowdielectric.pdb
    $DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm

The inputs to QNIFFT are:

    - qnifft.parm ### input parameter file
    - amb.crg.oxt ### charge file
    - vdw.siz     ### radius file
    - receptor.crg.lowdielectric.pdb ### receptor with spheres


The vdW program, CHEMGRID, is run using the input file, INCHEM:

   $DOCKBASE/proteins/chemgrid/bin/chemgrid

This calculation requires:

   - rec.crg.pdb ### protonated receptor (United Atom AMBER Force field, no nonpolar hydrogens)
   - prot.table.ambcrg.ambH ### mapping of each atom to its corresponding van der Waals parameters
   - vdw.parms.amb.mindock ### van der Waals parameters that correspond to atoms in prot.table.ambcrg.ambH
   - box ### box that overlays the binding site

Then SOLVMAP, which calculates ligand desolvation grids, is run twice for heavy (1.8 Angstrom radius) and hydrogen (1.0 Angstrom radius) grids:

   $DOCKBASE/proteins/solvmap/bin/solvmap

These calculations require:

  - rec.crg.lds.pdb
  - ligand.desolv.heavy OR ligand.desolv.hydrogen ### name of output file
  - box ### box that overlays the binding site

The "INSEV" file specifies the radius of the probe that is used to calculate the ligand desolvation grids.

A new directory called "dockfiles" is created, and grids and matching spheres are copied into it:

 copying matching_spheres.sph into dockfiles
 copying trim.electrostatics.phi into dockfiles
 copying ligand.desolv.hydrogen into dockfiles
 copying ligand.desolv.heavy into dockfiles
 copying vdw.bmp into dockfiles
 copying vdw.vdw into dockfiles
 copying vdw.parms.amb.mindock into dockfiles
 writing INDOCK file:  INDOCK
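
Once blastermaster finishes, a quick check that "dockfiles" holds everything listed above can catch a step that failed silently. This is a minimal sketch; the function name `missing_dockfiles` is hypothetical, and INDOCK is left out of the check because its location is not stated here.

```python
from pathlib import Path

# Files the log above reports as copied into dockfiles.
EXPECTED_DOCKFILES = (
    "matching_spheres.sph",
    "trim.electrostatics.phi",
    "ligand.desolv.hydrogen",
    "ligand.desolv.heavy",
    "vdw.bmp",
    "vdw.vdw",
    "vdw.parms.amb.mindock",
)


def missing_dockfiles(workdir="."):
    """Return the expected grid and sphere files absent from workdir/dockfiles."""
    dockfiles = Path(workdir) / "dockfiles"
    return [name for name in EXPECTED_DOCKFILES if not (dockfiles / name).is_file()]
```

An empty list from `missing_dockfiles(".")` means all seven files were copied.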