Decoy Theory

From DISI
Latest revision as of 23:39, 3 January 2019

What are decoys?

Or, more importantly, why do we care about decoys and keep hearing about decoys all the time?

"Decoys" is essentially shorthand for a way of evaluating how well your docking program has performed on a target (or a set of targets). The term refers to a set of molecules that (probably) won't bind to your target. Here are some terms:

  • Ligands: A set of known ligands that bind to your protein target, often taken from papers or from a database like ChEMBL.
  • Known Decoys / Known Non-Binders: A set of molecules that have been tested against your protein target and found not to bind. ChEMBL and the literature are also sources of these.
  • Property-Matched Decoys: A set of molecules, typically from ZINC, that resemble your ligands in chemical and physical properties but are not similar to the ligands by Tanimoto coefficient of a 2D fingerprint. Making these is explained in Automated_Database_Preparation#Automatic_Decoy_Generation, and more information can be found in the Huang et al. [1] or Verdonk et al. [2] papers.
  • Random Decoys: A set of random molecules, usually chosen from ZINC, most of which won't be binders simply by chance.
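The "not similar by Tanimoto of a 2D fingerprint" criterion for property-matched decoys can be sketched as follows. This is a minimal illustration, not the actual decoy-generation pipeline: fingerprints are represented as plain sets of on-bit indices, and the function names, the example fingerprints, and the 0.35 cutoff are all assumptions made for this example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Two empty fingerprints: treat as sharing nothing.
        return 0.0
    return len(a & b) / len(union)


def is_dissimilar_to_all(candidate_fp, ligand_fps, max_similarity=0.35):
    """True if the candidate is below max_similarity against every ligand.

    The 0.35 cutoff is an illustrative assumption, not a value from this page.
    """
    return all(tanimoto(candidate_fp, fp) < max_similarity for fp in ligand_fps)


# Hypothetical fingerprints (sets of on-bit indices), for illustration only.
ligand_fps = [{1, 4, 9, 16, 25}, {1, 4, 9, 36, 49}]
candidates = {
    "near_analog": {1, 4, 9, 16, 64},
    "unrelated": {2, 3, 5, 7, 11},
}

# Keep only candidates that are topologically dissimilar to every ligand.
kept = [name for name, fp in candidates.items() if is_dissimilar_to_all(fp, ligand_fps)]
print(kept)
```

In a real workflow the fingerprints would come from a cheminformatics toolkit, and the dissimilarity filter would be combined with matching on physical properties; neither step is shown here.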

Once you have these sets, you can examine the enrichment of one set over another, usually by looking at a ROC curve, a log ROC curve, or the LogAUC. The most common comparison is ligands over property-matched decoys. If your docking setup does well at this for your target, the general attitude is that you will do well in a prospective virtual screen. If you have many known decoys or known non-binders, you can run the same comparison between the ligands and those, which also tends to indicate how well you are doing. Sometimes it is also illustrative to test the enrichment of ligands over random decoys (usually something like the leadlike ZINC subset, trimmed at 60% Tanimoto overlap [3]). A final check, if you have known decoys, is to examine their enrichment over random decoys, where you should expect random performance.
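As a rough illustration of measuring the enrichment of one set over another, the sketch below builds ROC points from two lists of docking scores and integrates them with the trapezoid rule. It assumes that lower (more negative) scores rank better; the function names and the example scores are hypothetical, and this is plain ROC AUC, not the LogAUC calculation.

```python
def roc_points(ligand_scores, decoy_scores):
    """Return (false positive rate, true positive rate) points for a ranked list.

    Assumes lower (more negative) scores rank better. Because the sort is
    stable and ligands are listed first, tied scores are broken in favor of
    ligands.
    """
    ranked = sorted(
        [(s, True) for s in ligand_scores] + [(s, False) for s in decoy_scores],
        key=lambda pair: pair[0],
    )
    n_lig, n_dec = len(ligand_scores), len(decoy_scores)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, is_ligand in ranked:
        if is_ligand:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_dec, tp / n_lig))
    return points


def auc(points):
    """Trapezoidal area under a list of (x, y) ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical docking scores, for illustration only.
ligands = [-52.1, -48.3, -30.0]
decoys = [-50.0, -35.2, -28.4, -20.1]

example_auc = auc(roc_points(ligands, decoys))
print(example_auc)
```

An AUC near 0.5 corresponds to random ranking, which is what the known-decoys-over-random-decoys check is expected to give. Plotting the same points with a logarithmic x axis gives a log ROC curve, which emphasizes the top of the ranked list.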

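References

  [1] Huang N, Shoichet BK, Irwin JJ. Benchmarking sets for molecular docking. J Med Chem. 2006 Nov 16; 49(23):6789-801. http://pubs.acs.org/doi/abs/10.1021/jm0608356
  [2] Verdonk ML, Berdini V, Hartshorn MJ, Mooij WTM, Murray CW, Taylor RD, Watson P. J. Chem. Inf. Comput. Sci., 2004, 44 (3), pp 793–806. https://pubs.acs.org/doi/full/10.1021/ci034289q
  [3] http://zinc.docking.org/subset1/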

More relevant pages

Decoys, DUD, LogAUC

