ChEMBL errata

From DISI

This page keeps track of errata we have found in the ChEMBL database at [1].

These problems refer to ChEMBL11. Problems 4-9 have been attended to in the next release of ChEMBL, expected at the end of November.


Annotated as an Antagonist in ChEMBL, but abstract claims it is an agonist

     ChEMBL Help - This will not be changed and I will be sending you an email in due course to let you know why, but as it stands, the data is correct.


  • 5. CHEMBL293146 is 0 nM for O42392 according to assay CHEMBL818834

In fact, it is 10^-12 M in the paper (or so it seems to me). 10^-12 M is small, but it is not the same as zero.

There are hundreds of these examples. I will email them to you.

     ChEMBL Help - the value of 1x10^-12 that you have seen in the abstract is linked to the IC50 for this compound and the 0nM is linked to its Ki. As we do not have electronic access to the full paper, we cannot check this until we can track down a paper copy. The value of 1E-12 was changed a few weeks ago and will be visible in the next release.
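
As a quick check on the units in item 5, the paper's 10^-12 M can be converted to nM. This is only a sketch; the helper name `molar_to_nanomolar` is ours and is not part of ChEMBL.

```python
# Hypothetical helper (not part of ChEMBL): convert a molar concentration to nM.
NM_PER_MOLAR = 1e9  # 1 M = 10^9 nM


def molar_to_nanomolar(value_molar: float) -> float:
    """Return the concentration expressed in nanomolar."""
    return value_molar * NM_PER_MOLAR


# Value reported in the paper, in M.
paper_value_nm = molar_to_nanomolar(1e-12)

# About 0.001 nM: small, but strictly greater than zero.
print(paper_value_nm > 0)
```

So a value of 10^-12 M corresponds to roughly 0.001 nM, which is distinguishable from the 0 nM stored for this compound.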


  • 6. Compound CHEMBL1288160 is annotated at -4000 nM for O96013
     ChEMBL Help - this should be % and will be changed

and Compound CHEMBL1095696 is annotated at -14 nM for P25099

     ChEMBL Help - this is taken directly from the paper, which states this as the Ki, with the units nM.

There are a couple of dozen "negative IC50" values, listed below:

accession	molecule	std units	std value	relation	doc_id	src_id

P14416	CHEMBL163087	nM	-0.46	=	15817	1
P13612	CHEMBL330006	nM	-0.5	=	5334	1
P13612	CHEMBL330006	nM	-0.5	=	5334	1
P05556	CHEMBL330006	nM	-0.5	=	5334	1
P05556	CHEMBL330006	nM	-0.5	=	5334	1
P14416	CHEMBL162762	nM	-0.51	=	15817	1
P20288	CHEMBL162762	nM	-0.55	=	15817	1
P14416	CHEMBL162762	nM	-0.55	=	15817	1
P14416	CHEMBL163087	nM	-0.63	=	15817	1
P20288	CHEMBL163087	nM	-0.65	=	15817	1
P13612	CHEMBL432215	nM	-0.65	=	5334	1
P13612	CHEMBL432215	nM	-0.65	=	5334	1
P05556	CHEMBL432215	nM	-0.65	=	5334	1
P05556	CHEMBL432215	nM	-0.65	=	5334	1
P35462	CHEMBL162762	nM	-0.69	=	15817	1
P21917	CHEMBL162762	nM	-0.69	=	15817	1
P35462	CHEMBL163087	nM	-0.74	=	15817	1
P21917	CHEMBL163087	nM	-0.76	=	15817	1
P25099	CHEMBL1088398	nM	-1	=	51024	1
P12931	CHEMBL451544	nM	-1.9e+05	=	16451	1
P12931	CHEMBL80333	nM	-1.92e+05	=	16451	1
P25099	CHEMBL1087078	nM	-3	=	51024	1
P35610	CHEMBL351272	nM	-7.3e+04	<	2018	1
P30543	CHEMBL1095696	nM	-8	=	51024	1
P30543	CHEMBL1087078	nM	-10	=	51024	1
P25099	CHEMBL1087079	nM	-10	=	51024	1

There are dozens more, maybe hundreds.

       ChEMBL Help: There only look to be dozens as there are duplicates in this list - there were only 12 in ChEMBL_10 and ChEMBL_11. There are now no negative IC50 values in the database.
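
The point about duplicates can be checked by collapsing identical rows before counting. The following is a minimal sketch using a few rows copied from the listing above; the function name `unique_rows` and the column label "relation" are our own and are not ChEMBL names.

```python
# A few rows copied from the listing above (tab-separated columns):
# accession, molecule, units, value, relation, doc_id, src_id
SAMPLE = """\
P13612\tCHEMBL330006\tnM\t-0.5\t=\t5334\t1
P13612\tCHEMBL330006\tnM\t-0.5\t=\t5334\t1
P05556\tCHEMBL330006\tnM\t-0.5\t=\t5334\t1
P05556\tCHEMBL330006\tnM\t-0.5\t=\t5334\t1
P14416\tCHEMBL162762\tnM\t-0.51\t=\t15817\t1
"""


def unique_rows(text):
    """Split a tab-separated listing into rows and drop exact duplicates.

    Order of first appearance is preserved.
    """
    rows = [tuple(line.split("\t")) for line in text.splitlines() if line.strip()]
    return list(dict.fromkeys(rows))


rows = unique_rows(SAMPLE)
print(len(SAMPLE.splitlines()), "rows,", len(rows), "unique")  # 5 rows, 3 unique
```

Only exact duplicates are collapsed here; the same compound measured against a different accession still counts as a separate row.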


  • 7. There are four target IDs in ChEMBL that have a NULL description. That would seem to be wrong, but there could be some purpose to doing this. e.g. "we have to come back to this later..."

The TIDs are: 22226, 22222, 22224, 22228

     ChEMBL Help: The TIDs of 22222 and 22224 are for ADMET and Nucleic Acid data. These do not have specific targets, and so, cannot be given a specific Target ID (TID). 22226 and 22228 are 'Unchecked' and 'Unknown', respectively, and 'Unchecked' means that they are in a holding place, waiting to be curated further. 'Unknown' means that they have been curated but a suitable TID cannot be found due to the limited data or information that we have. I have asked the biological curator to update the descriptions on these pages to make it more obvious for users.
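
Given this reply, a user who wants only activities with a specific target could skip the four placeholder TIDs. The sketch below assumes nothing about the ChEMBL schema beyond the TID values quoted above; the names `PLACEHOLDER_TIDS` and `has_specific_target` are ours.

```python
# Placeholder TIDs, per the ChEMBL Help reply above. The reply names 22222 and
# 22224 together as covering ADMET and nucleic acid data, so they share a label.
PLACEHOLDER_TIDS = {
    22222: "ADMET / nucleic acid (no specific target)",
    22224: "ADMET / nucleic acid (no specific target)",
    22226: "Unchecked",
    22228: "Unknown",
}


def has_specific_target(tid):
    """Return True if the TID is not one of the four placeholder TIDs."""
    return tid not in PLACEHOLDER_TIDS


# Example TIDs: two placeholders and two ordinary target IDs.
tids = [22226, 50594, 22222, 81280]
print([t for t in tids if has_specific_target(t)])  # [50594, 81280]
```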


  • 8. Functional activities with negative IC50s: (6 found)
TID    CHEMBL MOL ID  Units  Value   Relation  DOC ID  SRC_ID
80166                 nM     -1000   =         15748   1
81280  CHEMBL14691    nM     -4400   =         12777   1
50594  CHEMBL25847    nM     -9000   =         11189   1
22226  CHEMBL1649862  nM     -22900  =         55281   1
50594  CHEMBL188      nM     -35000  =         11189   1
50594  CHEMBL165733   nM     -40000  =         8276    1
      ChEMBL Help - the negative IC50s have now been changed, apart from 2 of them. These need to be double checked before they can be removed.


  • 9. There are easily 200 entries with zero IC50s. Here is a smattering; the full listing is available on request.
TID    CHEMBL MOL ID  Units  Value  Relation  DOC ID  SRC_ID
80532  CHEMBL314934   nM     0      =         16464   1
80444  CHEMBL68765    nM     0      =         14961   1
80444  CHEMBL66975    nM     0      =         14961   1
80390  CHEMBL314934   nM     0      =         16464   1
80295  CHEMBL82829    nM     0      =         947     1
80285  CHEMBL117611   nM     0      =         12066   1
     ChEMBL Help - whilst there are some data points that need to be changed in this list (and the one you sent me), the majority are curated 'as in the paper'. Whilst we understand that IC50s can't be 0, it is not our place to omit this information from curation if it is stated as such in the original publication.
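
Negative and zero IC50 values such as those in items 8 and 9 can be flagged with a simple sign check. This is a minimal sketch on a few rows taken from the listings above, not a description of how ChEMBL curates data; the function name `classify` is ours. A flagged value still has to be checked against the original paper, since it may have been curated as published.

```python
from collections import Counter

# (tid, molecule, units, value, relation), taken from the listings in items 8 and 9.
ROWS = [
    (81280, "CHEMBL14691", "nM", -4400.0, "="),
    (50594, "CHEMBL25847", "nM", -9000.0, "="),
    (80444, "CHEMBL68765", "nM", 0.0, "="),
    (80295, "CHEMBL82829", "nM", 0.0, "="),
]


def classify(value):
    """Label a concentration value by sign; an IC50 should be strictly positive."""
    if value < 0:
        return "negative"
    if value == 0:
        return "zero"
    return "positive"


counts = Counter(classify(row[3]) for row in ROWS)
print(dict(counts))  # {'negative': 2, 'zero': 2}
```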