ZINC:Problems: Difference between revisions

From DISI
Line 5: Line 5:
Other problem pages are available:  [[DOCK Blaster:Problems]], [[DUD:Problems]],  [[DOCK:Problems]],  [[Problems]].


= Non-exhaustive list of problems =
= Non-exhaustive list of problems we are aware of =
* Molecule duplication
* Incorrect representation
* Missing representation
* Out of date catalogs
* Missing catalog
* Missing molecule
* Out of date catalog
* Broken molecules
* Probably incorrect enumeration/sampling of stereochemistry both R/S and E/Z
* incorrect treatment of protonation and tautomerization in some cases
* search way too slow
* SEA index out of date
* problem with links to supplier web sites
* activity annotations out of date
* links to supplier web sites sometimes broken
* tutorials
* protocols
* describe the pipeline on-line
* get the new paper out
* yuck out of date.
* missing molecules


== Known Problems ==
= More details about problems we are aware of =
I'm so glad you asked! There are a number of problems we know of, all of which we aim to fix one day. We hope you will agree that the benefits of ZINC as it stands outweigh the problems. Here are a few of the problems we are aware of:


* Unreasonable tautomers - We generate some tautomers that we shouldn't. One example we know about is CH3-C=NH -> CH2=C-N. Over half (40K+) were removed March 7. More processing to follow in April 2005. (Problem 5/3/7)
* Unreasonable tautomers: We generate some tautomers that we shouldn't. One example we know about is CH3-C=NH -> CH2=C-N.
* Aggressive protonation - We generate protonated forms that are probably unreasonable for most targets, such as protonated pyridines. This is an active area of research. Please be patient. (Problem 5/995)
* Broken flexibase molecules - If you use the flexibase format files, we are aware of a number of broken molecules, including C1S(=O)(=O)CCC1 and molecules with aliphatic rigid fragments. We know about this, and are working to correct it. (Problem 5/996)
* Corrupt files to download - We offer over 30 million distinct files to download from the ZINC web site. Our quality control is currently such that a few of these are corrupt. If you find one, would you kindly bring it to our attention? We will endeavor to fix it asap. (Problem 5/997)
* Subsets & Uploads - The subsetting and uploading mechanism is somewhat brittle. We hope to spend time on this soon. (Problem 5/999)
* Truncated Searches - We currently limit searches to 30 seconds of CPU time, to avoid overwhelming our servers, and to give you at least a partial answer in a timely fashion. We plan to add more servers soon and thereby offer quicker turnaround and more complete answers. Thanks for your patience. (Problem 5/998) [correction: We are at 45 seconds on a trial basis as of Feb 14.]
* Duplicate backslashes in SMILES files - Fixed 3 March 2005. (Problem 5/1/3)
* Name of molecule in mol2 file often incorrect. Being fixed currently. Reported by Gandhimathi and Federica Morandi. We consider this an annoying but not a core bug.
* Ambiguous and syntactically incorrect SMILES for E/Z specification - Being fixed currently. Full solution expected in April. (Problem 5/3/10)
* Wrong annotations - FIXED March 1, 2005. If you find incorrect annotations in files downloaded after March 1, 2005, please write databases at docking.org. (Problem 5/3/1)
* Wrong molecules in subset - Subsets 1&2 FIXED March 1, 2005. Other subsets being released March 3-5. If you find molecules that do not belong in a subset for files downloaded AFTER March 5, 2005, please write comments at docking.org. (Problem 5/3/2)
* Wrong charges in mol2 files - The current version of ZINC contains MMFF94 charges rather than AMSOL charges. We regret this error. The workaround is to run mol2 files through a program like molcharge, part of the QuacPac suite from OpenEye. There are many other fine programs that will assign partial atomic charges. The new version of ZINC now in preparation will have AMSOL partial atomic charges.  
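
For the "Corrupt files to download" item above, a downloaded file can be checked locally before it is reported. The following is only a minimal sketch, assuming the download is gzip-compressed (an assumption, not something this page states); the function name `is_readable_gzip` is hypothetical and is not part of any ZINC tooling. It checks only that the compressed stream decompresses cleanly, not that the molecules inside are chemically valid.

```python
import gzip
import zlib


def is_readable_gzip(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at `path` decompresses cleanly.

    Assumption: the file is gzip-compressed. A truncated or damaged
    download raises an error during decompression, reported here as False.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end in chunks so large files are not held in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

A file for which this returns False is a reasonable candidate to re-download once and, if it still fails, to report.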


== Missing Molecules ==
* Aggressive protonation:  We generate protonated forms that are probably unreasonable for most targets, such as protonated pyridines. This is an active area of research. Please be patient.


* Truncated Searches: Search results are currently limited to 500 (anonymous) or 5000 (logged in).  We are working towards a solution (expected Fall 2012).
* Duplication: We are aware of thousands of duplicate molecules in ZINC.  We continue to work on a robust solution to this problem.


[[Category:ZINC]]
[[Category:Problems]]

Revision as of 01:51, 23 February 2012

Problems with individual molecular representations (e.g. broken molecules), questions about the interface, and non-catastrophic errors should be reported in our Support Forum. Emergencies (e.g. the site is down, no search will work) should be reported to Zinc Support.

This is the ZINC problems page, which describes all the problems specific to ZINC. Other problem pages are available: DOCK Blaster:Problems, DUD:Problems, DOCK:Problems, Problems.

Non-exhaustive list of problems we are aware of

  • Molecule duplication
  • Incorrect representation
  • Missing representation
  • Missing catalog
  • Missing molecule
  • Out of date catalog
  • Broken molecules
  • Probably incorrect enumeration/sampling of stereochemistry both R/S and E/Z
  • incorrect treatment of protonation and tautomerization in some cases
  • search way too slow
  • problem with links to supplier web sites
  • tutorials
  • protocols

More details about problems we are aware of

  • Unreasonable tautomers: We generate some tautomers that we shouldn't. One example we know about is CH3-C=NH -> CH2=C-N.
  • Aggressive protonation: We generate protonated forms that are probably unreasonable for most targets, such as protonated pyridines. This is an active area of research. Please be patient.
  • Truncated Searches: Search results are currently limited to 500 (anonymous) or 5000 (logged in). We are working towards a solution (expected Fall 2012).
  • Duplication: We are aware of thousands of duplicate molecules in ZINC. We continue to work on a robust solution to this problem.
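
Until the duplication problem is solved, users working with a downloaded subset may want a rough local check. The sketch below is not ZINC's method; it assumes a SMILES file with one whitespace-separated `SMILES identifier` pair per line (an assumption about the file layout), and the function name `find_duplicate_smiles` is hypothetical. It only catches duplicates whose SMILES strings are written identically. The same molecule written with a different atom order would need canonicalization with a cheminformatics toolkit, which is not shown here.

```python
from collections import defaultdict


def find_duplicate_smiles(lines):
    """Group identifiers by exact SMILES string.

    Returns a dict mapping each SMILES that occurs more than once to the
    list of identifiers that share it. Exact string comparison only:
    no canonicalization is attempted.
    """
    ids_by_smiles = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        smiles, mol_id = fields[0], fields[1]
        ids_by_smiles[smiles].append(mol_id)
    return {s: ids for s, ids in ids_by_smiles.items() if len(ids) > 1}
```

For example, calling the function on the lines of an opened SMILES file returns only the groups that need attention, so an empty result means no exact-string duplicates were found.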