ZINC:Problems: Difference between revisions

(9 intermediate revisions by 2 users not shown)
Problems with individual molecular representations (e.g. broken molecules), questions about the interface, and non-catastrophic errors should be reported in our [http://docking.org/forum/14 Support Forum]. Emergencies (e.g. the site is down, no search will work) should be reported to [mailto:support@docking.org Zinc Support].

This is the ZINC problems page, which describes all the problems specific to [[ZINC]]. Other problem pages are available:
* [[DOCK Blaster:Problems]]
* [[DUD:Problems]]
* [[DOCK:Problems]]
* [[Problems]] - all other problems go here.


== Problem 1 ==
In this paper
http://www.sciencedirect.com/science/article/pii/S0968089609001114?np=y
It is sub nanomolar
http://zinc15.docking.org/genes/ADRA2B/observations/?zinc_id=ZINC000100006770
But the compound in ZINC (and likely ChEMBL) is not the right one
Have not yet written to ChEMBL


== Problem 2 - Sept 2016 ==
Incorrect imidazole with =C= traced back to PDB
Wrote to RCSB
They sent it along to EBI
Have not heard back
== Problem 3  - Aug 2016 ==
Wrote to biosynth vendor about an error
Not sure if we heard back
Deleted wrong molecule from ZINC
== Problem 4 - Sept 29 ==
rosiglitazone is not an AGTR1 agonist
http://zinc15.docking.org/substances/ZINC000000968330/
Have not told ChEMBL yet.
Paper is suspicious
http://link.springer.com/article/10.1007%2Fs00044-008-9152-x
== Problem 5 - Oct 4 ==
http://zinc15.docking.org/substances/ZINC000112987563/
is a radical. comes from MolPort. Wrote to them on Oct 4.
== Problem 6 - Sept 29 ==
= Non-exhaustive list of problems we are aware of =
* Molecule duplication
* Incorrect representation
* Missing representation
* Missing catalog
* Missing molecule
* Out of date catalogs
* Broken molecules
 
== broken molecules ==
 
C04476765
C05925570
C04914001
C00315585
C02483697
C05477751
C67328881
C21032807
C10809385
C00238298
C20580044
C10809385
 
 
* Probably incorrect enumeration/sampling of stereochemistry both R/S and E/Z
* incorrect treatment of protonation and tautomerization in some cases
* search way too slow
* SEA index out of date
* problem with links to supplier web sites
* activity annotations out of date
* links to supplier web sites sometimes broken
* tutorials
* protocols
* describe the pipeline on-line
* get the new paper out
* yuck out of date.
* missing molecules


== Known Problems ==
= More details about problems we are aware of =
I'm so glad you asked! There are a number of problems we know of, all of which we aim to fix one day. We hope you will agree that the benefits of ZINC as it stands outweigh the problems. Here are a few of the problems we are aware of:
 
* Unreasonable tautomers: We generate some tautomers that we shouldn't. Among the ones we know about is CH3-C=NH -> CH2=C-N.
 
* Aggressive protonation:  We generate protonated forms that are probably unreasonable for most targets, such as protonated pyridines. This is an active area of research. Please be patient.
 
* Truncated Searches: Search results currently limited to 500 (anonymous) or 5000 (logged in). We are working towards a solution (expected Fall 2012).
 
* Duplication: We are aware of thousands of duplicate molecules in ZINC.  We continue to work on a robust solution to this problem.
 


* duplicates:
http://zinc.docking.org/substance/13952160
http://zinc.docking.org/substance/4798508
* Unreasonable tautomers - We generate some tautomers that we shouldn't. Among the ones we know about is CH3-C=NH -> CH2=C-N. Over half (40K+) were removed March 7. More processing to follow in April 2005. (Problem 5/3/7)
* Aggressive protonation - We generate protonated forms that are probably unreasonable for most targets, such as protonated pyridines. This is an active area of research. Please be patient. (Problem 5/995)
* Broken flexibase molecules - If you use the flexibase format files, we are aware of a number of broken molecules, including C1S(=O)(=O)CCC1 and molecules with aliphatic rigid fragments. We know about this, and are working to correct it. (Problem 5/996)
* Corrupt files to download - We offer over 30 million distinct files to download from the ZINC web site. Our quality control is currently such that a few of these are corrupt. If you find one, would you kindly bring it to our attention? We will endeavor to fix it asap. (Problem 5/997)
* Subsets & Uploads - The subsetting and uploading mechanism is somewhat brittle. We hope to spend time on this soon. (Problem 5/999)
* Truncated Searches - We currently limit searches to 30 seconds of CPU time, to avoid overwhelming our servers, and to give you at least a partial answer in a timely fashion. We plan to add more servers soon and thereby offer quicker turnaround and more complete answers. Thanks for your patience. (Problem 5/998) [correction: We are at 45 seconds on a trial basis as of Feb 14.]
* Duplicate backslashes in SMILES files - Fixed 3 March 2005. (Problem 5/1/3)
* Name of molecule in mol2 file often incorrect - Being fixed currently. Reported by Gandhimathi and Federica Morandi. We consider this an annoying but not a core bug.
* Ambiguous and syntactically incorrect SMILES for E/Z specification - Being fixed currently. Full solution expected in April. (Problem 5/3/10)
* Wrong annotations - FIXED March 1, 2005. If you find incorrect annotations in files downloaded after March 1, 2005, please write databases at docking.org. (Problem 5/3/1)
* Wrong molecules in subset - Subsets 1&2 FIXED March 1, 2005. Other subsets being released March 3-5. If you find molecules that do not belong in a subset for files downloaded AFTER March 5, 2005, please write comments at docking.org. (Problem 5/3/2)
* Wrong charges in mol2 files - The current version of ZINC contains MMFF94 charges rather than AMSOL charges. We regret this error. The workaround is to run mol2 files through a program like molcharge, part of the QuacPac suite from OpenEye. There are many other fine programs that will assign partial atomic charges. The new version of ZINC now in preparation will have AMSOL partial atomic charges.


== Missing Molecules ==
There are literally thousands of these. We regret this problem, and will fix it one day. For now, please try to ignore it.




[[Category:Problems]]

Latest revision as of 21:06, 4 October 2016

Problems with individual molecular representations (e.g. broken molecules), questions about the interface, and non-catastrophic errors should be reported in our Support Forum. Emergencies (e.g. the site is down, no search will work) should be reported to Zinc Support.

This is the ZINC problems page, which describes all the problems specific to ZINC. Other problem pages are available: DOCK Blaster:Problems, DUD:Problems, DOCK:Problems, Problems.

Problem 1

In this paper

http://www.sciencedirect.com/science/article/pii/S0968089609001114?np=y

It is sub nanomolar

http://zinc15.docking.org/genes/ADRA2B/observations/?zinc_id=ZINC000100006770

But the compound in ZINC (and likely ChEMBL) is not the right one

Have not yet written to ChEMBL

Problem 2 - Sept 2016

Incorrect imidazole with =C= traced back to PDB

Wrote to RCSB

They sent it along to EBI

Have not heard back

Problem 3 - Aug 2016

Wrote to biosynth vendor about an error

Not sure if we heard back

Deleted wrong molecule from ZINC

Problem 4 - Sept 29

rosiglitazone is not an AGTR1 agonist

http://zinc15.docking.org/substances/ZINC000000968330/

Have not told ChEMBL yet.

Paper is suspicious

http://link.springer.com/article/10.1007%2Fs00044-008-9152-x

Problem 5 - Oct 4

http://zinc15.docking.org/substances/ZINC000112987563/

is a radical. comes from MolPort. Wrote to them on Oct 4.

Problem 6 - Sept 29

Non-exhaustive list of problems we are aware of

  • Molecule duplication
  • Incorrect representation
  • Missing representation
  • Missing catalog
  • Missing molecule
  • Out of date catalogs

broken molecules

C04476765 C05925570 C04914001 C00315585 C02483697 C05477751 C67328881 C21032807 C10809385 C00238298 C20580044 C10809385
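Since molecule duplication is one of the problems tracked on this page, it may be worth checking lists like the one above for repeated entries before acting on them. The following is a minimal sketch in Python, using only the standard library; the helper names `find_repeats` and `unique_in_order` are hypothetical and not part of any ZINC tool. The IDs are copied verbatim from the list above.

```python
from collections import Counter

# IDs copied verbatim from the "broken molecules" list on this page.
BROKEN_IDS = """
C04476765 C05925570 C04914001 C00315585 C02483697 C05477751
C67328881 C21032807 C10809385 C00238298 C20580044 C10809385
""".split()


def find_repeats(ids):
    """Return the IDs that occur more than once, in order of first appearance."""
    counts = Counter(ids)
    return [mol_id for mol_id in dict.fromkeys(ids) if counts[mol_id] > 1]


def unique_in_order(ids):
    """Drop repeated IDs while preserving the original order."""
    return list(dict.fromkeys(ids))


repeats = find_repeats(BROKEN_IDS)
unique_ids = unique_in_order(BROKEN_IDS)

print("listed:", len(BROKEN_IDS))
print("unique:", len(unique_ids))
print("repeated:", repeats)
```

Run against the list as written, this reports that C10809385 is listed twice, so the list names 11 distinct molecules rather than 12.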


  • Probably incorrect enumeration/sampling of stereochemistry both R/S and E/Z
  • incorrect treatment of protonation and tautomerization in some cases
  • search way too slow
  • problem with links to supplier web sites
  • tutorials
  • protocols

More details about problems we are aware of

  • Unreasonable tautomers: We generate some tautomers that we shouldn't. Among the ones we know about is CH3-C=NH -> CH2=C-N.
  • Aggressive protonation: We generate protonated forms that are probably unreasonable for most targets, such as protonated pyridines. This is an active area of research. Please be patient.
  • Truncated Searches: Search results currently limited to 500 (anonymous) or 5000 (logged in). We are working towards a solution (expected Fall 2012).
  • Duplication: We are aware of thousands of duplicate molecules in ZINC. We continue to work on a robust solution to this problem.


  • duplicates:
http://zinc.docking.org/substance/13952160
http://zinc.docking.org/substance/4798508

There are literally thousands of these. We regret this problem, and will fix it one day. For now, please try to ignore it.