ChEMBL errata: Difference between revisions

From DISI
 
(16 intermediate revisions by 2 users not shown)
Line 1: Line 1:
This is a page to keep track of errata we've found with the ChEMBL database at [https://www.ebi.ac.uk/chembldb/]


These problems refer to ChEMBL11.  Problems 4-9 have been attended to in the next release of ChEMBL, expected at the end of November.


* 1.  https://www.ebi.ac.uk/chembldb/index.php/bioactivity/results/1/cmpd_chemblid/asc/tab/display


6 nM in ChEMBL, 6 uM in original paper.
* 4. https://www.ebi.ac.uk/chembldb/index.php/compound/inspect/CHEMBL389012
Annotated as an antagonist in ChEMBL, but the abstract claims it is an agonist.


      ChEMBL Help - This will not be changed and I will be sending you an email in due course to let you know why, but as it stands, the data is correct.


* 2. https://www.ebi.ac.uk/chembldb/index.php/compound/inspect/CHEMBL477452


Compound SMILES/drawing are incorrect, as are many compounds from that paper
* 5. CHEMBL293146 is 0 nM for O42392 according to assay  CHEMBL818834
In fact, it is 10^-12 M in the paper (or seems to me to be so). 10^-12 is small, but it is not the same as zero.


There are hundreds of these examples. I will email them to you.


* 3. https://www.ebi.ac.uk/chembldb/doc/inspect/CHEMBL1133572  is erroneously blank.
      ChEMBL Help - the value of 1x10^-12 that you have seen in the abstract is linked to the IC50 for this compound and the 0nM is linked to its Ki. As we do not have electronic access to the full paper, we cannot check this until we can track down a paper copy. The value of 1E-12 was changed a few weeks ago and will be visible in the next release.


I got to it from this page, trying to get publication data in support of an IC50.


https://www.ebi.ac.uk/chembldb/bioactivity/results/1/cmpd_chemblid/asc/tab/display
* 6. Compound CHEMBL1288160 is annotated at -4000 nM for O96013


      ChEMBL Help - this should be % and will be changed
and  Compound CHEMBL1095696 is annotated at -14 nM for P25099
     
      ChEMBL Help - this is taken directly from the paper, which states this as the Ki, with the units nM.
There are a couple of dozen "negative IC50" values, listed below:
accession  molecule  std units  std value  relation  doc_id  src_id
P14416 CHEMBL163087 nM -0.46 = 15817 1
P13612 CHEMBL330006 nM -0.5 = 5334 1
P13612 CHEMBL330006 nM -0.5 = 5334 1
P05556 CHEMBL330006 nM -0.5 = 5334 1
P05556 CHEMBL330006 nM -0.5 = 5334 1
P14416 CHEMBL162762 nM -0.51 = 15817 1
P20288 CHEMBL162762 nM -0.55 = 15817 1
P14416 CHEMBL162762 nM -0.55 = 15817 1
P14416 CHEMBL163087 nM -0.63 = 15817 1
P20288 CHEMBL163087 nM -0.65 = 15817 1
P13612 CHEMBL432215 nM -0.65 = 5334 1
P13612 CHEMBL432215 nM -0.65 = 5334 1
P05556 CHEMBL432215 nM -0.65 = 5334 1
P05556 CHEMBL432215 nM -0.65 = 5334 1
P35462 CHEMBL162762 nM -0.69 = 15817 1
P21917 CHEMBL162762 nM -0.69 = 15817 1
P35462 CHEMBL163087 nM -0.74 = 15817 1
P21917 CHEMBL163087 nM -0.76 = 15817 1
P25099 CHEMBL1088398 nM -1 = 51024 1
P12931 CHEMBL451544 nM -1.9e+05 = 16451 1
P12931 CHEMBL80333 nM -1.92e+05 = 16451 1
P25099 CHEMBL1087078 nM -3 = 51024 1
P35610 CHEMBL351272 nM -7.3e+04 < 2018 1
P30543 CHEMBL1095696 nM -8 = 51024 1
P30543 CHEMBL1087078 nM -10 = 51024 1
P25099 CHEMBL1087079 nM -10 = 51024 1
There are dozens more, maybe hundreds.
        ChEMBL Help: There only look to be dozens as there are duplicates in this list - there were only 12 in ChEMBL_10 and ChEMBL_11. There are now no negative IC50 values in the database.
* 7. Four target IDs in ChEMBL have a NULL description. That would seem to be wrong, but there could be some purpose to it, e.g. "we have to come back to this later..."
The TIDs are: 22226, 22222, 22224, 22228
      ChEMBL Help: The TIDs of 22222 and 22224 are for ADMET and Nucleic Acid data. These do not have specific targets, and so, cannot be given a specific Target ID (TID). 22226 and 22228 are 'Unchecked' and 'Unknown', respectively, and 'Unchecked' means that they are in a holding place, waiting to be curated further. 'Unknown' means that they have been curated but a suitable TID cannot be found due to the limited data or information that we have. I have asked the biological curator to update the descriptions on these pages to make it more obvious for users.
* 8. Functional activities with negative IC50s: (6 found)
TID    CHEMBL MOL ID  Units  Value   Relation  DOC ID  SRC_ID
80166                 nM     -1000   =         15748   1
81280  CHEMBL14691    nM     -4400   =         12777   1
50594  CHEMBL25847    nM     -9000   =         11189   1
22226  CHEMBL1649862  nM     -22900  =         55281   1
50594  CHEMBL188      nM     -35000  =         11189   1
50594  CHEMBL165733   nM     -40000  =         8276    1
      ChEMBL Help - the negative IC50s have now been changed, apart from 2 of them. These need to be double checked before they can be removed.
* 9. There are easily 200 entries with zero IC50s. Here is a smattering; the full listing is available on request.
TID    CHEMBL MOL ID  Units  Value  Relation  DOC ID  SRC_ID
80532  CHEMBL314934   nM     0      =         16464   1
80444  CHEMBL68765    nM     0      =         14961   1
80444  CHEMBL66975    nM     0      =         14961   1
80390  CHEMBL314934   nM     0      =         16464   1
80295  CHEMBL82829    nM     0      =         947     1
80285  CHEMBL117611   nM     0      =         12066   1
      ChEMBL Help - whilst there are some data points that need to be changed in this list (and the one you sent me), the majority are curated 'as in the paper'. Whilst we understand that IC50s can't be 0, it is not our place to omit this information from curation if it is stated as such in the original publication.




[[Category:Errata]]
[[Category:ChEMBL]]
[[Category:Obsolete]]

Latest revision as of 14:46, 21 March 2014

This is a page to keep track of errata we've found with the ChEMBL database at [1]

These problems refer to ChEMBL11. Problems 4-9 have been attended to in the next release of ChEMBL, expected at the end of November.


  • 4. https://www.ebi.ac.uk/chembldb/index.php/compound/inspect/CHEMBL389012

Annotated as an antagonist in ChEMBL, but the abstract claims it is an agonist.

     ChEMBL Help - This will not be changed and I will be sending you an email in due course to let you know why, but as it stands, the data is correct.


  • 5. CHEMBL293146 is 0 nM for O42392 according to assay CHEMBL818834

In fact, it is 10^-12 M in the paper (or seems to me to be so). 10^-12 is small, but it is not the same as zero.

There are hundreds of these examples. I will email them to you.

     ChEMBL Help - the value of 1x10^-12 that you have seen in the abstract is linked to the IC50 for this compound and the 0nM is linked to its Ki. As we do not have electronic access to the full paper, we cannot check this until we can track down a paper copy. The value of 1E-12 was changed a few weeks ago and will be visible in the next release.
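
To make the size of the discrepancy concrete, here is a minimal sketch of the unit conversion between the value reported in the paper (in M) and the unit stored in ChEMBL (nM). The function name is made up for illustration and is not part of any ChEMBL tooling.

```python
def molar_to_nanomolar(value_molar: float) -> float:
    """Convert a concentration from mol/L (M) to nmol/L (nM)."""
    return value_molar * 1e9


# Value from the paper's abstract, as quoted in item 5 above.
paper_value_nm = molar_to_nanomolar(1e-12)

# Roughly 0.001 nM (1 pM): tiny, but strictly greater than zero.
print(paper_value_nm)
print(paper_value_nm > 0)
```

So a faithful nM entry for the abstract's value would be about 0.001 nM, not 0 nM.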


  • 6. Compound CHEMBL1288160 is annotated at -4000 nM for O96013
     ChEMBL Help - this should be % and will be changed

and Compound CHEMBL1095696 is annotated at -14 nM for P25099

     ChEMBL Help - this is taken directly from the paper, which states this as the Ki, with the units nM.

There are a couple of dozen "negative IC50" values, listed below:

accession	molecule	std units	std value	relation	doc_id	src_id

P14416	CHEMBL163087	nM	-0.46	=	15817	1
P13612	CHEMBL330006	nM	-0.5	=	5334	1
P13612	CHEMBL330006	nM	-0.5	=	5334	1
P05556	CHEMBL330006	nM	-0.5	=	5334	1
P05556	CHEMBL330006	nM	-0.5	=	5334	1
P14416	CHEMBL162762	nM	-0.51	=	15817	1
P20288	CHEMBL162762	nM	-0.55	=	15817	1
P14416	CHEMBL162762	nM	-0.55	=	15817	1
P14416	CHEMBL163087	nM	-0.63	=	15817	1
P20288	CHEMBL163087	nM	-0.65	=	15817	1
P13612	CHEMBL432215	nM	-0.65	=	5334	1
P13612	CHEMBL432215	nM	-0.65	=	5334	1
P05556	CHEMBL432215	nM	-0.65	=	5334	1
P05556	CHEMBL432215	nM	-0.65	=	5334	1
P35462	CHEMBL162762	nM	-0.69	=	15817	1
P21917	CHEMBL162762	nM	-0.69	=	15817	1
P35462	CHEMBL163087	nM	-0.74	=	15817	1
P21917	CHEMBL163087	nM	-0.76	=	15817	1
P25099	CHEMBL1088398	nM	-1	=	51024	1
P12931	CHEMBL451544	nM	-1.9e+05	=	16451	1
P12931	CHEMBL80333	nM	-1.92e+05	=	16451	1
P25099	CHEMBL1087078	nM	-3	=	51024	1
P35610	CHEMBL351272	nM	-7.3e+04	<	2018	1
P30543	CHEMBL1095696	nM	-8	=	51024	1
P30543	CHEMBL1087078	nM	-10	=	51024	1
P25099	CHEMBL1087079	nM	-10	=	51024	1

There are dozens more, maybe hundreds.

       ChEMBL Help: There only look to be dozens as there are duplicates in this list - there were only 12 in ChEMBL_10 and ChEMBL_11. There are now no negative IC50 values in the database.
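
The duplication that ChEMBL Help mentions can be checked mechanically. The sketch below uses only a handful of rows copied from the listing above, counts exact repeats, and then counts distinct (molecule, value, doc_id) triples. Treating that triple as "one measurement" is an assumption made here for illustration; it is not necessarily how ChEMBL Help arrived at their figure.

```python
from collections import Counter

# A few rows copied from the listing above:
# accession, molecule, units, value, relation, doc_id, src_id
ROWS = """
P13612 CHEMBL330006 nM -0.5 = 5334 1
P13612 CHEMBL330006 nM -0.5 = 5334 1
P05556 CHEMBL330006 nM -0.5 = 5334 1
P05556 CHEMBL330006 nM -0.5 = 5334 1
P14416 CHEMBL162762 nM -0.51 = 15817 1
"""


def parse_rows(text):
    """Turn each non-empty line into a tuple of whitespace-separated fields."""
    return [tuple(line.split()) for line in text.splitlines() if line.strip()]


rows = parse_rows(ROWS)
counts = Counter(rows)

# Exact duplicates: identical in every column.
unique_rows = list(counts)
repeated = {row: n for row, n in counts.items() if n > 1}

# Assumption: the same molecule, value and document reported against
# several accessions is counted as a single measurement.
measurements = {(r[1], r[3], r[5]) for r in rows}

print(len(rows), "rows,", len(unique_rows), "unique rows,",
      len(measurements), "distinct measurements")
for row, n in repeated.items():
    print(n, "x", " ".join(row))
```

Running the same counting over the full listing would show how much of its apparent length comes from repeated rows.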


  • 7. Four target IDs in ChEMBL have a NULL description. That would seem to be wrong, but there could be some purpose to it, e.g. "we have to come back to this later..."

The TIDs are: 22226, 22222, 22224, 22228

     ChEMBL Help: The TIDs of 22222 and 22224 are for ADMET and Nucleic Acid data. These do not have specific targets, and so, cannot be given a specific Target ID (TID). 22226 and 22228 are 'Unchecked' and 'Unknown', respectively, and 'Unchecked' means that they are in a holding place, waiting to be curated further. 'Unknown' means that they have been curated but a suitable TID cannot be found due to the limited data or information that we have. I have asked the biological curator to update the descriptions on these pages to make it more obvious for users.


  • 8. Functional activities with negative IC50s: (6 found)
TID    CHEMBL MOL ID  Units  Value   Relation  DOC ID  SRC_ID
80166                 nM     -1000   =         15748   1
81280  CHEMBL14691    nM     -4400   =         12777   1
50594  CHEMBL25847    nM     -9000   =         11189   1
22226  CHEMBL1649862  nM     -22900  =         55281   1
50594  CHEMBL188      nM     -35000  =         11189   1
50594  CHEMBL165733   nM     -40000  =         8276    1
      ChEMBL Help - the negative IC50s have now been changed, apart from 2 of them. These need to be double checked before they can be removed.


  • 9. There are easily 200 entries with zero IC50s. Here is a smattering; the full listing is available on request.
TID    CHEMBL MOL ID  Units  Value  Relation  DOC ID  SRC_ID
80532  CHEMBL314934   nM     0      =         16464   1
80444  CHEMBL68765    nM     0      =         14961   1
80444  CHEMBL66975    nM     0      =         14961   1
80390  CHEMBL314934   nM     0      =         16464   1
80295  CHEMBL82829    nM     0      =         947     1
80285  CHEMBL117611   nM     0      =         12066   1
     ChEMBL Help - whilst there are some data points that need to be changed in this list (and the one you sent me), the majority are curated 'as in the paper'. Whilst we understand that IC50s can't be 0, it is not our place to omit this information from curation if it is stated as such in the original publication.
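
For anyone wanting to reproduce listings like those in items 8 and 9, the following is a rough sketch of the kind of query involved, run against an in-memory SQLite table. The table and column names are invented for this example and are not the real ChEMBL schema; the first four rows are copied from the tables above and the last row is a made-up control. Since ChEMBL Help notes that many zero values are curated "as in the paper", the query only flags rows for review and does not decide which are errors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table layout mirroring the columns of the listings above.
conn.execute(
    "CREATE TABLE activities ("
    " tid INTEGER, molecule TEXT, std_units TEXT,"
    " std_value REAL, relation TEXT, doc_id INTEGER, src_id INTEGER)"
)

sample = [
    (80532, "CHEMBL314934", "nM", 0, "=", 16464, 1),
    (80444, "CHEMBL68765", "nM", 0, "=", 14961, 1),
    (81280, "CHEMBL14691", "nM", -4400, "=", 12777, 1),
    (50594, "CHEMBL25847", "nM", -9000, "=", 11189, 1),
    (99999, "CHEMBL_EXAMPLE", "nM", 12.5, "=", 1, 1),  # made-up control row
]
conn.executemany("INSERT INTO activities VALUES (?, ?, ?, ?, ?, ?, ?)", sample)


def flag_non_positive(connection):
    """Return nM activities whose value is zero or negative, labelled by problem."""
    query = """
        SELECT tid, molecule, std_value,
               CASE WHEN std_value < 0 THEN 'negative' ELSE 'zero' END AS problem
        FROM activities
        WHERE std_units = 'nM' AND std_value <= 0
        ORDER BY std_value, molecule
    """
    return connection.execute(query).fetchall()


flagged = flag_non_positive(conn)
for row in flagged:
    print(row)
```

Each flagged row would still need to be checked against the original publication, as in the replies above.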