DOCK Blaster:Preliminaries

From DISI

Docking requires three things: a target structure, a database, and a docking program. DOCK Blaster uses DOCK 3.5.54 as the docking program and various ZINC subsets as the database. It is up to you to choose a target structure for docking. You will also need to have some idea of the binding site you wish to target, as proteins often contain more than one eligible site. You may also have additional information such as actives and inactives, which can be helpful for assessing docking performance and viability.

What is the question?

The most common use of docking is to answer the question:

  • 1) What compounds should I purchase to test for activity against my protein?

Many people will also want to know:

  • 2) Are the docking results worth spending time and money testing?

Ideally, we would like to answer question 2 first. DOCK Blaster's approach is to collect available control information, in the form of bound ligands, actives and inactives that may be reported in the literature, databases or patents, or known from other sources. DOCK Blaster performs a preliminary docking study which attempts to recapitulate known experimental information. If it cannot do this, it does not necessarily mean that docking "does not work". There may be good mitigating circumstances. However, it should raise doubt in your mind if docking cannot re-discover what you already know.

Here we consider some of the most typical scenarios for a docking project, and attempt to point out what you should keep in mind before you start docking. Remember, this is research! So be sure to do controls whenever you can, and remain skeptical!

This is a good time to remind you that docking is just one of many techniques for ligand discovery. In particular, if actives against a target are already known, the simplest approach may be to look for analogs and derivatives using ligand-based methods.

What do I know?

To use docking you need to have a structure of the target. Let us consider the possible scenarios.

Only one crystal structure, good quality, with ligand bound

This is a nice situation to be in. Your target is in a ligand-bound conformation, and your ligand is a valuable control with which to assess performance. Although there are many reasons why docking may still be problematic, you at least have a good chance of being able to evaluate how well the docking program performs, and whether it can be expected to be predictive.

We have extensively benchmarked DOCK Blaster against just this scenario, using structures in the PDB where we could automatically find a single organic small molecule bound to a protein.

There are still lots of things you can worry about (see below), and preliminary docking may well show that control information cannot be reproduced. Still, this is a great situation to be in, and we suggest you try "black box docking" directly, to see whether the results look interesting.

Only one crystal structure, good quality, no ligand bound

This is often a close runner-up to having a ligand-bound crystal structure, as in many cases the induced fit due to ligand binding is minor.

It may be worth looking for crystal structures of highly similar targets, in case one has a ligand bound. That could help give an indication of the amount of induced fit in the target that might be expected on ligand binding.

A major drawback to this scenario is the lack of a crystallographic control ligand. It may be worthwhile trying to model in a ligand, if you know one.

No good crystal structure available

A good crystal structure is generally one that has 2.5 Å resolution or better, and no serious problems (e.g. as reported by PROCHECK). In the absence of a good crystal structure, you may still be able to start docking using another kind of target structure.
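As a quick first check of the resolution criterion, the reported resolution can be read from the REMARK 2 record of a PDB-format file. This is a minimal sketch, not part of DOCK Blaster: the function names are made up for illustration, and it only reads the header value. It does not replace a geometry check such as PROCHECK.

```python
import re

# REMARK 2 in PDB-format files looks like:
# "REMARK   2 RESOLUTION.    1.80 ANGSTROMS."
# NMR entries carry "NOT APPLICABLE." instead of a number.
_RESOLUTION = re.compile(r"^REMARK\s+2\s+RESOLUTION\.\s+(\d+(?:\.\d+)?)")

def pdb_resolution(pdb_text):
    """Return the reported resolution in angstroms, or None if absent."""
    for line in pdb_text.splitlines():
        match = _RESOLUTION.match(line)
        if match:
            return float(match.group(1))
    return None

def meets_resolution_cutoff(pdb_text, cutoff=2.5):
    """True if a resolution is reported and is at or better than the cutoff."""
    resolution = pdb_resolution(pdb_text)
    return resolution is not None and resolution <= cutoff
```

You would call it on the text of your own structure file, for example `meets_resolution_cutoff(open("target.pdb").read())`, where `target.pdb` is a placeholder name.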

Comparative model

Comparative models vary widely in their usefulness for docking. This is a topic of considerable current interest as well as recent and forthcoming publications. Generally, the higher the identity of the target to the scaffold, the more reliable the model will be, although the movement of a single amino acid side chain in the binding site may be enough to cause problems. For this reason, pay attention to amino acid substitutions in the binding site. If several scaffolds are available, consider the sequence identity in the binding site as well as the overall sequence identity as a figure of merit for selecting the best model for docking.
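To compare overall and binding-site identity for a candidate scaffold, something like the following sketch may help. It assumes you already have a pairwise alignment of target and scaffold as two equal-length strings with "-" for gaps, and that you have picked the alignment columns that line the binding site yourself. The function name and the choice to skip gapped columns are illustrative assumptions.

```python
def percent_identity(target_aln, scaffold_aln, columns=None):
    """Percent identity over alignment columns where neither sequence has a gap.

    columns: optional iterable of 0-based alignment columns (e.g. the
    binding-site positions); defaults to the whole alignment.
    """
    if len(target_aln) != len(scaffold_aln):
        raise ValueError("aligned sequences must have equal length")
    if columns is None:
        columns = range(len(target_aln))
    pairs = [(target_aln[i], scaffold_aln[i]) for i in columns]
    compared = [(a, b) for a, b in pairs if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)

# Toy alignment; columns 2-4 stand in for the binding-site residues.
overall = percent_identity("ACDEFGHIK", "ACDQFGH-K")
site = percent_identity("ACDEFGHIK", "ACDQFGH-K", columns=[2, 3, 4])
```

In this toy case the overall identity is higher than the binding-site identity, which is exactly the situation where the overall number alone would be misleading.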

We recommend, provisionally, the use of SCWRL as a proof-reading tool for comparative models.

We recommend the use of ModBase and MODELLER as a source of homology models.

Pay particular attention to binding sites involving metals.

NMR structure

NMR structures can and have been useful for docking, particularly if the binding site is fairly rigid. Ligand controls (discussed extensively on this site) are crucial for assessing the performance and viability of docking.

Poor quality crystal structure

Although we do not recommend the use of structures worse than 2.5 Å, if that is all that is available, it may be worth a try. Again, test using ligand controls, if available.

Cannot acquire a structure of the target

You cannot start docking without a 3D atomic model of your target. Ask colleagues for advice. Search the PDB or use ModBase. If you cannot acquire a satisfactory model, you will need to use a technique other than docking for ligand discovery.

More than one good crystal structure available

You are both lucky and cursed, because whereas you have more information, it may be unclear which model to use. Here are some things to think about:

  • Consider superimposing the structures to look for variability in the binding site. Picking the "most representative" structure may be the way to go.

  • If ligand controls are available, consider docking to all the targets, and using the preliminary docking results to select the model that most effectively recapitulates the controls.
  • Consider both the overall resolution and the B factors of atoms in the binding site when considering which model is best.
  • Focusing on the binding site, look for
    • side chain movement
    • conserved water structure
  • You may be able to use atoms from multiple bound ligands as "hot spots".
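When comparing candidate structures by the B factors of binding-site atoms, a rough per-structure number can be computed as in the sketch below. It assumes standard fixed-column PDB ATOM/HETATM records and defines the binding site, for illustration only, as protein atoms within a chosen radius of any atom of a bound ligand identified by its residue name. The function names, the 6 Å default and the residue name `LIG` are assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math

def parse_atoms(pdb_text):
    """Read record type, residue name, coordinates and B factor per atom."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "record": line[0:6].strip(),
                "resname": line[17:20].strip(),
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
                "bfactor": float(line[60:66]),
            })
    return atoms

def site_mean_bfactor(pdb_text, ligand_resname, radius=6.0):
    """Mean B factor of protein atoms within `radius` angstroms of the ligand.

    Returns None if no ligand atoms or no nearby protein atoms are found.
    """
    atoms = parse_atoms(pdb_text)
    ligand = [a["xyz"] for a in atoms if a["resname"] == ligand_resname]
    protein = [a for a in atoms if a["record"] == "ATOM"]
    site = [a["bfactor"] for a in protein
            if any(math.dist(a["xyz"], xyz) <= radius for xyz in ligand)]
    return sum(site) / len(site) if site else None
```

Running `site_mean_bfactor` on each superimposed structure with the same ligand residue name and radius gives numbers that can be compared side by side, alongside the overall resolution.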

Additional information

Actives and Inactives

Several sources of information:

  • positive controls (actives)
  • negative controls (inactives)

Special knowledge

  • pH at which the structure was solved / is biologically relevant

What should I worry about?

Is docking going to work for me?