What does DOCK do?
Latest revision as of 23:34, 10 January 2019
The DOCK suite of programs is designed to find favorable orientations of a ligand in a “receptor.” It can be subdivided into
- those programs related directly to docking of ligands and
- accessory programs
We limit the discussion in this section to only those programs and methods related to docking a ligand in a receptor. A typical receptor might be an enzyme with a well-defined active site, though any macromolecule may be used (e.g. a structural protein, a nucleic acid strand, a “true” receptor). We’ll use an enzyme as an example in the rest of this discussion.
The starting point of all docking calculations is generally the crystal or NMR structure of an enzyme from an enzyme-ligand complex. The ligand structure may be taken from the crystal structure of the enzyme-ligand complex or from a database of compounds, such as the ZINC database (Irwin et al. J. Chem. Inf. Model. 2005).
The primary consideration in the design of our docking programs has been to develop methods which are both rapid and reasonably accurate. These programs can be separated functionally into roughly two parts, each somewhat independent of the other:
(i) Routines which determine the orientation of a ligand relative to the receptor and
(ii) Routines which evaluate (score) a ligand orientation.
Because the two parts are largely independent, either can be replaced. You can generate orientations outside of DOCK and score them with the DOCK evaluation functions. Alternatively, you can develop your own scoring routines to replace the functions supplied with DOCK.
Orienting a ligand in a receptor site is broken down into a series of steps, carried out by different programs. First, a potential site of interest on the receptor is identified. (Often, the active site is the site of interest and is known a priori.) Within this site, points are identified where ligand atoms may be located. A routine from the DOCK suite of programs identifies these points, called sphere centers, by generating a set of overlapping spheres which fill the site. Rather than using DOCK to generate these sphere centers, important positions within the active site may be identified by some other mechanism and used by DOCK as sphere centers. For example, the positions of atoms from the bound ligand may be used as sphere centers. Or, a grid may be generated within the site and each grid point may be considered as a sphere center. Our sphere centers, however, attempt to capture shape characteristics of the active site (or site of interest) with a minimum number of points and without the bias of previously known ligand binding modes.
To orient a ligand within the active site, some of the sphere centers are “matched” with ligand atoms. That is, a sphere center is “paired” with a ligand atom. Many sets of these atom-sphere pairs are generated, each set containing only a small number of sphere-atom pairs. In order to limit the number of possible sets of atom-sphere pairs, a longest distance heuristic is used: (long) inter-sphere distances must be roughly equal to the corresponding (long) inter-atomic ligand distances. A set of atom-sphere pairs is used to calculate an orientation of the ligand within the site of interest. The set of sphere-atom pairs that is used to generate an orientation is often referred to as a match. The translation vector and rotation matrix that minimize the rmsd between the (transformed) ligand atoms and the matching sphere centers of the sphere-atom set are calculated and used to orient the entire ligand within the active site.
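The least-squares fit described above can be illustrated with the Kabsch algorithm, a standard way to find the rigid-body transform that minimizes the rmsd between two paired point sets. This is a minimal sketch, not DOCK's own routine: it assumes NumPy is available, and the function names `fit_transform` and `apply_transform` are hypothetical. The determinant check keeps the result a proper rotation, so a mirror-image superposition is never returned.

```python
import numpy as np

def fit_transform(atoms, spheres):
    """Return (R, t) minimizing the rmsd of R @ atom + t to the paired sphere centers."""
    P = np.asarray(atoms, dtype=float)     # matched ligand atoms, shape (n, 3)
    Q = np.asarray(spheres, dtype=float)   # paired sphere centers, shape (n, 3)
    p0, q0 = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p0).T @ (Q - q0)              # 3x3 covariance of the centered sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T)) # -1 would mean a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q0 - R @ p0
    return R, t

def apply_transform(R, t, coords):
    """Apply the fitted transform to every ligand atom, matched or not."""
    return np.asarray(coords, dtype=float) @ R.T + t
```

In use, `fit_transform` would be called with only the few atoms and sphere centers in the match, and `apply_transform` with the coordinates of the entire ligand.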
The orientation of the ligand is evaluated with a shape scoring function and/or a function approximating the ligand-enzyme binding energy. Most evaluations are done on (scoring) grids in order to minimize the overall computational time. At each grid point, the enzyme contributions to the score are stored. That is, receptor contributions to the score, potentially repetitive and time consuming, are calculated only once; the appropriate terms are then simply fetched from memory.
The ligand-enzyme binding energy is taken to be approximately the sum of the van der Waals attractive, van der Waals dispersive, and Coulombic electrostatic energies. Approximations are made to the usual molecular mechanics attractive and dispersive terms for use on a grid. To generate the energy score, the ligand atom terms are combined with the receptor terms from the nearest grid point, or combined with receptor terms from a “virtual” grid point with interpolated receptor values. The score is the sum over all ligand atoms of these combined terms. In this case, the energy score is determined by both ligand atom types and ligand atom positions on the energy grids.
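The nearest-grid-point lookup can be sketched as follows. Everything specific here is an assumption made for illustration: the `EnergyGrid` class, its flat-list storage, the names `vdw_a`, `vdw_b` and `es` for the three stored receptor terms, and the sign convention used to combine them with per-atom ligand coefficients. It shows only the idea that receptor terms are precomputed once and fetched per ligand atom.

```python
class EnergyGrid:
    """Receptor terms precomputed on a regular grid (hypothetical layout)."""

    def __init__(self, origin, spacing, shape):
        self.origin, self.spacing, self.shape = origin, spacing, shape
        n = shape[0] * shape[1] * shape[2]
        self.vdw_a = [0.0] * n  # first van der Waals receptor term
        self.vdw_b = [0.0] * n  # second van der Waals receptor term
        self.es = [0.0] * n     # electrostatic receptor term

    def nearest_index(self, xyz):
        """Flat index of the grid point closest to xyz."""
        idx = []
        for x, o, n in zip(xyz, self.origin, self.shape):
            k = int(round((x - o) / self.spacing))
            if not 0 <= k < n:
                raise ValueError("atom lies outside the grid")
            idx.append(k)
        i, j, k = idx
        return (i * self.shape[1] + j) * self.shape[2] + k

def energy_score(grid, ligand):
    """Sum combined terms over ligand atoms given as (xyz, coef_a, coef_b, charge)."""
    total = 0.0
    for xyz, coef_a, coef_b, charge in ligand:
        g = grid.nearest_index(xyz)
        # Assumed combination: ligand coefficient times stored receptor term.
        total += coef_a * grid.vdw_a[g] - coef_b * grid.vdw_b[g] + charge * grid.es[g]
    return total
```

Replacing `nearest_index` with a trilinear interpolation over the eight surrounding grid points would give the “virtual” grid point variant mentioned above.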
As a final step in the energy scoring scheme, the orientation of the ligand may be varied slightly to minimize the energy score. That is, after the initial orientation and evaluation (scoring) of the ligand, a simplex minimization is used to locate the nearest local energy minimum. The sphere centers themselves are simply approximations to possible atom locations; the orientations generated by the sphere-atom pairing, although reasonable, may not be minimal in energy.
Sphere Centers
Spheres are generated to fill the target site. The sphere centers are putative ligand atom positions. Their use is an attempt to limit the enormous number of possible orientations within the active site. Like ligand atoms, these spheres touch the surface of the molecule and do not intersect the molecule. The spheres are allowed to intersect other spheres; i.e., they have volumes which overlap. Each sphere is represented by the coordinates of its center and its radius. Only the coordinates of the sphere centers are used to orient ligands within the active site (see above). Sphere radii are used in clustering.
The number of orientations of the ligand in free space is vast. The number of orientations possible from all sets of sphere-atom pairings is smaller but still large and cannot be generated and evaluated (scored) in a reasonable length of time. Consequently, various filters are used to eliminate from consideration, before evaluation, sets of sphere-atom pairs that would generate poorly scoring orientations. That is, only a small subset of the possible ligand orientations is actually generated and scored. The distance tolerance is one filter. Sphere “coloring” and identification of “critical” spheres are other filters.
Sphere-sphere distances are compared to atom-atom distances. Sets of sphere-atom pairs are generated in the following manner: sphere i is paired with atom I if and only if, for every sphere j and its paired atom J already in the set,
|dij - dIJ| < epsilon
where dij is the distance between sphere i and sphere j, dIJ is the distance between atom I and atom J, and epsilon is a somewhat small user-defined value.
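The distance-tolerance test can be written as a short check that is run before a new pair is added to a growing match. This is a sketch under assumptions: the function name `can_add_pair`, the dictionary-of-coordinates inputs, and the use of a strict inequality are illustrative choices, not DOCK's actual code.

```python
import math

def can_add_pair(match, sphere, atom, sphere_xyz, atom_xyz, epsilon):
    """True if (sphere, atom) agrees in distance with every pair already in match."""
    for j, J in match:
        d_ij = math.dist(sphere_xyz[sphere], sphere_xyz[j])  # sphere-sphere distance
        d_IJ = math.dist(atom_xyz[atom], atom_xyz[J])        # atom-atom distance
        if abs(d_ij - d_IJ) >= epsilon:
            return False
    return True
```

A larger `epsilon` accepts more matches and so generates more orientations; a smaller value prunes harder at the risk of missing a good pose.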
Chemical Matching
DOCK spheres are generated without regard to the chemical properties of the nearby receptor atoms. Sphere “chemical matching” or “coloring” associates a chemical property with each sphere, and a sphere of one “color” can only be matched with a ligand atom of complementary color. These chemical properties may be things such as “hydrogen-bond donor,” “hydrogen-bond acceptor,” “hydrophobe,” “electro-positive,” “electro-negative,” “neutral,” etc. Neither the colors themselves, nor the complementarity of the colors, are determined by the DOCK suite of programs; DOCK simply uses these labels. With the inclusion of coloring, only ligand atoms with the appropriate chemical properties are matched to the complementary colored spheres. It is then more likely that the orientation generated will produce a favorable score. Conversely, by excluding colored spheres from pairing with certain ligand atoms, the number of (probably) unfavorable orientations which are generated and evaluated can be reduced. Note that requiring complementarity in matching does not mean that all ligand atoms will lie in chemically complementary regions of the enzyme. Rather, only those ligand atoms that are paired with a colored sphere in the sphere-atom match are guaranteed to be in the chemically complementary region of the enzyme (provided chirality of the spheres is the same as that of the matching ligand atoms).
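Because the colors and their complementarity are supplied by the user, the filter reduces to a table lookup. The table below is a hypothetical example, and treating an uncolored sphere as compatible with any atom is an assumption of this sketch rather than documented DOCK behavior.

```python
# Hypothetical user-supplied table: sphere color -> ligand atom colors it accepts.
COMPLEMENTS = {
    "donor": {"acceptor"},
    "acceptor": {"donor"},
    "hydrophobe": {"hydrophobe"},
}

def colors_compatible(sphere_color, atom_color):
    """True if a sphere of sphere_color may be paired with an atom of atom_color."""
    if sphere_color is None:  # assumption: uncolored spheres pair with any atom
        return True
    return atom_color in COMPLEMENTS.get(sphere_color, set())
```

A check like this would run alongside the distance test, so a pair is rejected before any orientation is computed from it.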
Critical Points
The "critical point" filter requires that certain spheres be part of the set of sphere-atom pairs used to orient the ligand (DesJarlais et al. J. Comput-Aided Molec. Design. 1994). Designating a sphere as a critical point forces the ligand to have at least one atom in the area of the enzyme where that sphere is located. This filter may be useful, for example, when it is known that a ligand must occupy a particular area of an active site. This filter removes from consideration any orientation that does not guarantee at least one ligand atom in critical areas of the enzyme (provided chirality of the spheres is the same as that of the matching ligand atom).
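In its simplest form the critical-point filter is a set-inclusion test on a candidate match. The sketch below assumes a match is a list of (sphere id, atom id) pairs and that every designated critical sphere must appear in it; the function name and data layout are hypothetical.

```python
def satisfies_critical(match, critical_spheres):
    """True if every critical sphere is used by some pair in the match."""
    used = {sphere for sphere, _atom in match}
    return set(critical_spheres) <= used
```

For example, `satisfies_critical([(3, 10), (7, 12)], {7})` accepts the match because sphere 7 is paired, while a match lacking sphere 7 would be dropped before orientation.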
Bump Filter
After a ligand is oriented within the active site, the orientation is evaluated. To reduce the total computational time, it is possible to check first whether ligand atoms occupy space already occupied by the receptor. If too many such “bumps” are found, then the ligand is likely to intersect the receptor even after minimization; consequently, the ligand orientation is discarded before evaluation.
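A brute-force version of the bump check is shown below as an illustration. The single distance `cutoff`, the `max_bumps` threshold, and the default values are assumptions of this sketch; a production implementation would more likely look up a precomputed grid than loop over receptor atoms.

```python
import math

def count_bumps(ligand_xyz, receptor_xyz, cutoff):
    """Number of ligand atoms closer than cutoff to any receptor atom."""
    return sum(
        1 for a in ligand_xyz
        if any(math.dist(a, r) < cutoff for r in receptor_xyz)
    )

def passes_bump_filter(ligand_xyz, receptor_xyz, cutoff=2.0, max_bumps=1):
    """Keep the orientation only if it has at most max_bumps clashing atoms."""
    return count_bumps(ligand_xyz, receptor_xyz, cutoff) <= max_bumps
```

Orientations that fail this test are discarded without paying for a full score evaluation or minimization.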
Units
The units of the DOCK suite of programs are lengths in angstroms, masses in atomic mass units, charges in units of the electron charge, and energies in kcal/mol. Internally for Amber score, and on input of charges from a prmtop file, the charges are scaled by 18.2223 (see http://ambermd.org/Questions/units.html).
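The factor 18.2223 is the square root of the Coulomb constant expressed in kcal·Å/(mol·e²), roughly 332.05, so the product of two scaled charges divided by their separation in angstroms is already an energy in kcal/mol. The short sketch below checks that arithmetic; the function name is hypothetical and a dielectric constant of 1 is assumed.

```python
AMBER_CHARGE_SCALE = 18.2223  # scaling factor quoted in the text

def coulomb_kcal_per_mol(q1_e, q2_e, r_angstrom):
    """Coulomb energy of two charges given in electron-charge units (dielectric 1)."""
    q1 = q1_e * AMBER_CHARGE_SCALE  # convert to scaled internal charge
    q2 = q2_e * AMBER_CHARGE_SCALE
    return q1 * q2 / r_angstrom

# Two unit charges 1 angstrom apart: about 332.05 kcal/mol.
print(round(coulomb_kcal_per_mol(1.0, 1.0, 1.0), 2))
```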