Tool2

From DISI
Revision as of 23:31, 10 March 2014 by Frodo (talk | contribs)

Question

What is the most similar compound from any one catalog to my molecule?

This could be any single catalog in ZINC. Currently you must specify the short name of the catalog you wish to use, and there is no error message for typos.

  • Human Metabolome (hmdb)
  • Biocyc (biocyc)
  • KEGG (kegg, keggd, keggviapc)
  • Drugbank approved (dbap)
  • Sigma Aldrich (sialbb, sial)
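
Since the tool gives no error message for a mistyped short name, it may help to check the code yourself before submitting. The following is a minimal sketch, not part of the tool: the dictionary holds only the short names listed on this page (other ZINC catalogs will have codes that are not included here), and `check_catalog_code` is a hypothetical helper name.

```python
import difflib

# Short names copied from the list on this page only.
# Assumption: this is NOT the full set of ZINC catalog codes.
KNOWN_CATALOGS = {
    "hmdb": "Human Metabolome",
    "biocyc": "Biocyc",
    "kegg": "KEGG",
    "keggd": "KEGG",
    "keggviapc": "KEGG",
    "dbap": "Drugbank approved",
    "sialbb": "Sigma Aldrich",
    "sial": "Sigma Aldrich",
}


def check_catalog_code(code):
    """Return the normalized short name, or raise ValueError with a suggestion.

    Hypothetical helper: the tool itself performs no such check.
    """
    normalized = code.strip().lower()
    if normalized in KNOWN_CATALOGS:
        return normalized
    # Look for the single closest known short name to suggest to the user.
    close = difflib.get_close_matches(normalized, list(KNOWN_CATALOGS), n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    raise ValueError(f"Unknown catalog code '{code}'.{hint}")
```

For example, `check_catalog_code(" HMDB ")` returns the normalized code, while a typo such as `hmbd` raises a `ValueError` instead of silently going through.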

Approach

(contact John for website)

Example Use and Interpretation

You must give a SMILES string and the short code for the catalog you wish to search. By default the Tanimoto cutoff is 0.4, and 0.3 is the minimum we support, because performance falls off a cliff at low Tanimoto values. The fingerprints are "new standard ZINC", which is ECFP4, 1024 bits, generated with RDKit.
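
To make the cutoff concrete, here is a minimal sketch of a Tanimoto nearest-neighbor search. It is an illustration only, not the tool's implementation, which this page does not describe. Fingerprints are represented as plain sets of on-bit indices so the example needs no chemistry library; in practice they would come from RDKit (ECFP4 corresponds to a Morgan fingerprint of radius 2). The names `tanimoto`, `most_similar` and the toy catalog are assumptions made for the example.

```python
DEFAULT_CUTOFF = 0.4        # default cutoff stated on this page
MIN_SUPPORTED_CUTOFF = 0.3  # lowest cutoff the page says is supported


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints share nothing; treat their similarity as 0.
    return len(a & b) / union if union else 0.0


def most_similar(query_fp, catalog_fps, cutoff=DEFAULT_CUTOFF):
    """Return (name, score) of the best catalog hit at or above cutoff, else None.

    A simple linear scan over a dict of name -> fingerprint, for illustration.
    """
    if cutoff < MIN_SUPPORTED_CUTOFF:
        raise ValueError(f"cutoff must be at least {MIN_SUPPORTED_CUTOFF}")
    best = None
    for name, fp in catalog_fps.items():
        score = tanimoto(query_fp, fp)
        if score >= cutoff and (best is None or score > best[1]):
            best = (name, score)
    return best


# Toy data: bit indices are made up and do not correspond to real molecules.
query = {1, 2, 3, 4}
toy_catalog = {
    "a": {1, 2, 3},                   # shares 3 of 4 bits in the union
    "b": {1, 2, 7, 8, 9, 10, 11},     # shares 2 of 9 bits in the union
}
print(most_similar(query, toy_catalog))
```

If nothing reaches the cutoff the function returns `None`, which mirrors why you may need to lower the cutoff from 0.4 to 0.3 to get an answer at all.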

For instance, what is the closest metabolite in HMDB to haloperidol? Approach:

  • get SMILES for haloperidol (e.g. from ZINC)
  • get code for HMDB (hmdb)
  • go to (secret website)
  • enter these things
  • adjust cutoff from 0.4 to 0.3
  • click GO

The answer is loperamide: http://zinc.docking.org/substance/537928

Hmm, loperamide is a metabolite, you say? Yes, according to http://www.hmdb.ca/metabolites/HMDB04999: search for the word "endogenous" and hover your mouse over it. Yes, I think they are wrong too, since the "justification" they give is about oral dosing of loperamide. Still, weird, huh?


Limitations

  • currently slower than it should be: a search should take about 3 s but currently often takes 20 s. We know.
  • compounds in original catalog may have been filtered out by ZINC loading rules
  • YYZ catalogs are slightly dated compared to UCSF. They will be updated soon.

Why is it like this?

The reason this does not work correctly in ZINC12 is that we cut an important corner during the similarity search to make it practical. In Toronto, we've uncut this corner so we can give you a definitive answer, modulo the compounds that get filtered out during ZINC loading.