WINC annotations

WINC Annotations are currently generated by SEA.

WINC is currently at Version 0 (August 2008).


These annotations may be found, browsed, and downloaded at http://zinc.docking.org/winc0/

Description

We ran 8.5 million molecules in ZINC (8.1 million of them currently purchasable) against the MDDR and WOMBAT databases using the SEA method. This resulted in nearly 14 million suggestions of possible activity with P-values better (smaller) than 10^-10, all of which are accessible from this page.

The items in the table are sorted by number of suggested molecules per annotation. The columns are as follows:

  • Source annotation ID: These are the codes used in the MDDR and WOMBAT. Click on this link to see the molecules in the ZINC results browser. The implementation here is weak, but you should still be able to see some of the molecules.
  • SEA-suggested activity: These are the text descriptions corresponding to the codes in the first column. For WOMBAT annotations, a trailing - means antagonist, a trailing + means agonist, and no trailing character means not specified or not relevant.
  • Number of annotations: This is the number of ZINC molecules with a P-value better than 10^-10 for this annotation.
  • WINC #: This is a number we use internally to keep track of each annotation. It is also the name of the SMILES file containing all the suggestions, which you can download by clicking on this link.
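
As a rough illustration of how these columns fit together, the sketch below models one table row and interprets the trailing +/- convention described above. The class name, field names, and example rows are hypothetical and are not taken from the actual WINC table; the descending sort order is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class WincRow:
    """One row of the annotation table (field names are hypothetical)."""
    source_id: str      # code used in the MDDR or WOMBAT
    activity: str       # SEA-suggested activity text
    n_molecules: int    # ZINC molecules with P-value better than 10^-10
    winc_number: int    # internal WINC #, also the SMILES file name


def wombat_mode(activity: str) -> str:
    """Interpret the trailing character of a WOMBAT activity label."""
    label = activity.rstrip()
    if label.endswith("-"):
        return "antagonist"
    if label.endswith("+"):
        return "agonist"
    return "unspecified"


# Made-up example rows, for illustration only.
rows = [
    WincRow("X1", "Example receptor A +", 120, 1),
    WincRow("X2", "Example receptor B -", 4500, 2),
    WincRow("X3", "Example enzyme C", 980, 3),
]

# Assumption: the table is ordered from most to fewest suggested molecules.
rows_sorted = sorted(rows, key=lambda r: r.n_molecules, reverse=True)

for row in rows_sorted:
    print(row.winc_number, row.activity, wombat_mode(row.activity))
```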

How to use

Suppose you seek new 5-HT1D antagonists. Use the search feature of your browser to find the corresponding row in the web page, and download the SMILES file. View the SMILES on your local computer. Pick the compounds you like, then enter each ZINC ID (found in the SMILES file) into ZINC to find out how to purchase the compound. Note that about 5% of the suggestions in WINC are not purchasable. For instance, some of them are actual drugs, which provide useful controls on the calculation.

You may also wish to browse compounds on-line before downloading.
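
If you prefer to pull the ZINC IDs out of a downloaded file with a script, a minimal sketch follows. It assumes the common SMILES file layout of one molecule per line, with the SMILES string first and the ZINC ID second, separated by whitespace; check this against the file you actually download. The function names and the example lines (including the IDs) are made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def read_smiles(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (smiles, zinc_id) pairs from whitespace-separated lines.

    Assumes column 1 is the SMILES string and column 2 is the ZINC ID.
    Blank or malformed lines are skipped.
    """
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        yield parts[0], parts[1]


def zinc_ids(lines: Iterable[str]) -> List[str]:
    """Collect just the ZINC IDs, in file order."""
    return [zinc_id for _, zinc_id in read_smiles(lines)]


# Made-up example content standing in for a downloaded SMILES file.
example = [
    "c1ccccc1 ZINC00000001",
    "",
    "CCO ZINC00000002",
]

ids = zinc_ids(example)
print(ids)
```

To read a real file, pass an open file handle instead of the example list, for instance `zinc_ids(open("downloaded.smi"))`, where the file name is whatever you saved the download as.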

Questions

Q1. Some of these compounds cannot possibly be ligands as suggested. What is going on?

A1. SEA is a warhead-naive method, and also does not consult physical properties. It only considers topological similarity to annotated ligands. Thus SEA currently makes (and WINC contains) many suggestions that are obvious nonsense. We are working on this, but for now we suggest you just ignore suggestions that are obviously wrong and move on.

Q2. SEA / WINC seems to have issues with a) peptides, b) long linear molecules, c) diterpenes / steroids. What is going on?

A2. Since SEA uses topological fingerprints, it inherits some of the well-known weaknesses of that approach. These include a failure to discriminate between molecules in areas of chemical space with highly similar and redundant subgraphs. This is an area of active research. For now, we suggest you ignore such suggestions when you see them and move on to something else.

Q3. I have a question not answered here.

A3. We encourage you to write to us at support at docking.org about technical problems with the website.

--John Irwin