ZINC15:current limitations: Difference between revisions

Latest revision as of 14:52, 10 October 2015

Some things in ZINC15 are still not right. We are working as fast as we can. If any of these issues affects you, please write to us and we will see what we can do to prioritize it.

Incomplete loading

  • We know that not all rings are loaded yet. We are about 90% complete and estimate completion in October 2015.
  • We do not have 3D models for every molecule in ZINC15. We estimate we will have 50% coverage of lead-like and 25% coverage of drug-like by Jan 1, 2016. The workaround is to use ZINC 12 (zinc.docking.org) until ZINC15 3D loading is ready.
  • Catalog updates are correct as of mid-August 2015. We have a backlog that will be caught up by the end of October 2015.
  • We know there is a problem with Sigma-Aldrich purchasing codes. Target date: Oct 15th.
  • There are various problems with the ChEMBL and SEA loading, which will be fixed in October 2015.

Timeout problems

  • We are aware of timeout problems on long-running queries. We expect to have a solution in December 2015. In the meantime, there are workarounds. Ask us.
  • We know about performance problems under heavy load. We have more hardware standing by and expect to deploy it in November 2015, doubling website performance.


Incomplete curation

  • We know that we have not curated all patterns correctly yet. By incomplete curation, we mean that the patterns have not yet been correctly assigned to the "reactivity" axis. As a result, some compounds are currently incorrectly classified on the reactivity axis. We are aware of this, we are working on it, and we hope to have better curation available in October 2015.
  • We know the codes for Sigma Aldrich and Tocris need to be corrected.
  • We know that some of the links to vendors need updating. If you tell us which ones (via Disqus, bottom of page), we will update them more quickly.

Surprisingly, no SEA predictions even though it is a drug.

  • metformin 12859773 - has no SEA predictions and no targets below 10 µM per ChEMBL 20.

Stereochemistry screwed up for this one

  • http://zinc15.docking.org/catalogs/chembl20/items/?supplier_code=CHEMBL124754
  • errors in HMDB and DrugBank. Report them 15 at a time.
  • ambiguous stereochem.

broken?

http://zinc15.docking.org/substances/subsets/purchasable/?mwt-lt=350&structure-contains=[OD1][C@H]1[C@@H](O)[C@H](O[C@H](O)[C@@H]1O)C(=O)O
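
One thing worth ruling out for a query like the one above is URL encoding. The pattern is pasted raw into the query string, and characters such as `[`, `]`, `@`, `(`, `)` and `=` are reserved in URLs. A `#` (as in a nitrile, C#N) would start a URL fragment and cut the query short, and an unencoded `+` (as in `[n+]`) is usually decoded as a space. The following is a minimal sketch that builds the same URL with the pattern percent-encoded. The parameter names `mwt-lt` and `structure-contains` are taken from the URL above; the helper `build_query_url` is hypothetical, and whether encoding is the actual cause of this failure is an assumption, not something the notes confirm.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the URL in the notes above.
BASE = "http://zinc15.docking.org/substances/subsets/purchasable/"


def build_query_url(base, params):
    """Return base + '?' + a percent-encoded query string (hypothetical helper)."""
    return base + "?" + urlencode(params)


# The pattern from the "broken?" URL, unchanged.
smarts = "[OD1][C@H]1[C@@H](O)[C@H](O[C@H](O)[C@@H]1O)C(=O)O"

url = build_query_url(BASE, {"mwt-lt": 350, "structure-contains": smarts})
print(url)

# Round-trip check: decoding the query must give back the exact pattern.
decoded = parse_qs(urlsplit(url).query)
assert decoded["structure-contains"] == [smarts]
```

If the encoded URL fails in the same way as the raw one, the problem is more likely on the server side than in the way the link was written.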

curcumene - not in ZINC. Why?

1304 - no SEA prediction. Why?

ITP - axial vs. equatorial placement, CORINA vs. JChem molconvert.


protonation of histidine

protonation of benzimidazole

protonation of alkyl and aryl thiol

treatment of NO2 and N adjacent to NO2 or nitrile C#N (deprotonated)

SMARTS handling

  • We know that the JSME editor is not currently handling SMARTS correctly, and that this is our problem not theirs.

Incomplete GUI implementation (see also ZINC15:Status)

  • no GUI-based method to create "having" queries
  • some detail endpoints (majorclass, subclass, organism) still missing.
  • some pages (gene detail, ortholog detail, catalog detail) continue to be simplistic
  • Batch mode is not yet working. We estimate December 2015 for a beta version.

Detailed reports of problems to be investigated

  • Trent says: here is a mol that does not work for me in db2 generation:
c1ccc2c(c1)[nH]c3scc[n+]23      ZINC01648614
  • Nir says: b.t.w. this is an example of one that historically failed (if you're looking for a test case):
ZINC000034474796
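
The two test cases above are written in different identifier styles: `ZINC01648614` has 8 digits and `ZINC000034474796` has 12. When collecting such reports into a regression list, it may help to normalize them to one form first. The sketch below zero-pads to 12 digits. It assumes that the numeric part of an identifier is the same in both styles; the helper name `normalize_zinc_id` is hypothetical and not part of any ZINC tool.

```python
import re

# Optional "ZINC" prefix, optional leading zeros, then the numeric part.
_ZINC_ID = re.compile(r"^(?:ZINC)?0*(\d+)$", re.IGNORECASE)


def normalize_zinc_id(raw, width=12):
    """Return 'ZINC' plus the numeric part zero-padded to `width` digits.

    Assumption: the short and long styles share the same numeric part.
    """
    match = _ZINC_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not a ZINC identifier: {raw!r}")
    return "ZINC" + match.group(1).zfill(width)


# The two identifiers reported above.
test_cases = ["ZINC01648614", "ZINC000034474796"]
normalized = [normalize_zinc_id(z) for z in test_cases]
print(normalized)
```

Keeping the padding width as a parameter makes it easy to produce the shorter style again if an older tool expects it.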